• Pascal's Chatbot Q&As

Meta employees were apparently discussing and implementing the systematic removal of copyright management information (CMI) from works in the LibGen dataset that was being used for AI training

From a legal risk perspective, having employees openly discuss and document systematic copyright management information removal is particularly problematic as it could help establish willful violation

Question for Claude: Please read the article "Exhibits show Meta employees apparently discussing removal of copyright management information from materials from some works in dataset(s)" and tell me what happened and exactly how 'unwise' this is...

Claude’s answer:
Based on the article, Meta employees were apparently discussing and implementing the systematic removal of copyright management information (CMI) from works in the LibGen dataset that was being used for AI training. Here are the key concerning elements:

  1. The explicit discussions between Meta employees (particularly Nikolay Bashlykov and Lauren van der Maaten) about removing copyright information:

  • They discussed removing "copyright paragraphs from beginning and end of the paper"

  • They planned to remove any lines containing "ISBN," "Copyright," "(C)," "All rights reserved"

  • Van der Maaten commented this would give them a "nice little boost"

  2. This appears to be particularly unwise because:

  • It potentially violates DMCA Section 1202, which prohibits the intentional removal or alteration of copyright management information. The article includes the full legal definition, which covers:

    • Copyright notices

    • Author information

    • Title and work identification

    • Terms and conditions

    • ISBN and other identifying numbers

  • The employees appeared to be documenting their intent in writing, with statements that make it clear they knew they were deliberately removing copyright information

  • The removal wasn't incidental - it was systematic and purposeful, with employees discussing the best ways to strip this information out

The level of "unwise" here is quite high because:

  • It creates clear documentary evidence of potentially illegal activity

  • It appears premeditated and systematic rather than accidental

  • The employees explicitly acknowledged they were removing information that "shouldn't be there"

  • They documented this in corporate communications that could be (and were) discovered in legal proceedings

  • This has now led to a DMCA Section 1202 claim being added to the ongoing litigation against Meta

From a legal risk perspective, having employees openly discuss and document systematic copyright management information removal is particularly problematic as it could help establish willful violation rather than inadvertent removal.

These revelations seem likely to strengthen the plaintiffs' case significantly, as they provide direct evidence of Meta employees discussing the deliberate removal of legally protected copyright management information, with apparent awareness of what they were removing.