AI Piracy on Trial: How the Court’s Class Certification Ruling Against Anthropic Sets the Stage for a Copyright Reckoning
by ChatGPT-4o
Overview for Non-Legal, Non-Technical Readers
On July 17, 2025, the U.S. District Court for the Northern District of California issued a significant ruling in the copyright infringement lawsuit Bartz et al. v. Anthropic PBC. This ruling allows a group of authors and copyright owners to move forward together in a class action lawsuit against Anthropic, the AI company behind Claude, for allegedly downloading and using millions of pirated books to train its AI models without permission or payment.
The case centers on whether Anthropic broke copyright law by copying books from illegal sources like LibGen and PiLiMi—massive pirate libraries that host scanned or stolen eBooks.
The court has now certified a class of copyright holders—meaning they can jointly sue Anthropic—so long as their books were:
Downloaded by Anthropic from LibGen or PiLiMi,
Registered with the U.S. Copyright Office in time, and
Owned by someone holding the right to make copies (a key exclusive right under copyright law).
This does not yet decide if Anthropic is guilty, but it clears the way for a large-scale lawsuit that could lead to major damages or a landmark settlement.
What Was Decided
Judge William Alsup ruled that:
A class action is appropriate: The case presents a common legal issue for many affected book authors and publishers—whether Anthropic illegally copied their books from known pirate sites.
The class will include copyright owners of books from LibGen and PiLiMi only: Books from a third pirate site (Books3) and books Anthropic later scanned legally were excluded due to insufficient metadata or unclear legal status.
Ownership of copyright matters: Both legal and beneficial owners (e.g., an author who assigned rights to a publisher but still earns royalties) are included, provided they owned the right to make copies of the book.
Anthropic kept these pirated books and used them: The court found it credible that Anthropic knew it was using pirated content, even organizing its file library to avoid duplicates and referencing book metadata like ISBNs (book barcodes) to manage its AI training datasets.
A clear process must now follow: Plaintiffs must deliver a detailed list of all affected works by September 1, 2025, matching books with copyright registration numbers and ownership details.
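The ISBN-keyed de-duplication the court describes can be illustrated with a short sketch. This is a hypothetical illustration of the general technique, not Anthropic's actual tooling: normalize each ISBN to its 13-digit form and keep one record per normalized key.

```python
# Hypothetical sketch of ISBN-keyed de-duplication of a book library.
# Illustrative only -- not Anthropic's actual code; field names are invented.

def normalize_isbn(isbn: str) -> str:
    """Strip hyphens/spaces and convert ISBN-10 to ISBN-13 (978 prefix)."""
    digits = isbn.replace("-", "").replace(" ", "").upper()
    if len(digits) == 10:
        core = "978" + digits[:9]  # drop the ISBN-10 check character
        # Recompute the ISBN-13 check digit (alternating weights 1 and 3).
        total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
        return core + str((10 - total % 10) % 10)
    return digits

def dedupe_by_isbn(records: list[dict]) -> list[dict]:
    """Keep the first record seen for each normalized ISBN."""
    seen: set[str] = set()
    unique = []
    for rec in records:
        key = normalize_isbn(rec["isbn"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

books = [
    {"isbn": "0-306-40615-2", "title": "Example A"},       # ISBN-10 form
    {"isbn": "9780306406157", "title": "Example A (dup)"}, # same book, ISBN-13
    {"isbn": "9781861972712", "title": "Example B"},
]
print([b["title"] for b in dedupe_by_isbn(books)])  # → ['Example A', 'Example B']
```

The point of the sketch is that metadata like an ISBN gives a stable key: two files with different names but the same normalized ISBN are the same work, which is exactly the kind of organization the court cited as evidence of deliberate curation.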
Why This Matters
For people outside the legal world, this is the first time a U.S. court has greenlit a class-action lawsuit specifically accusing an AI company of building its models using pirated content at this scale. This ruling sends a strong signal to the industry that:
Copyright still matters—even in the age of AI.
AI companies must be able to prove where their training data came from.
If they relied on stolen content, they could owe massive damages.
What Comes Next
Plaintiffs will submit a finalized list of pirated books used by Anthropic.
Notice will go out to copyright owners—authors, publishers, and rights holders—who can then decide to join or opt out.
A jury trial could follow, focusing on whether Anthropic violated copyright law and how much it owes.
Damages could be massive: Each work copied could lead to statutory damages (up to $150,000 per work in willful cases), although only one recovery is allowed per book, regardless of how many copies were made.
Broader Implications
This case will likely become a test case for other lawsuits against AI companies (like OpenAI or Meta) accused of using copyrighted material without licenses.
Expect the following shifts:
Regulators may now act faster, since the courts are opening the door.
AI companies will face increased pressure to show how their training data is collected, pushing them toward licensed datasets.
Authors and publishers are empowered to protect their work, either by joining this lawsuit or negotiating licensing deals proactively.
Transparency in AI training will become a competitive and legal requirement.
Key Lessons & Recommendations
For Regulators:
Enforce metadata transparency rules for training datasets.
Support copyright registration systems that integrate with AI audit logs.
For AI Developers:
Avoid datasets sourced from pirated material—no matter how tempting or “publicly available” they seem.
Develop internal processes to track provenance and licensing of all training inputs.
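What a provenance-tracking process could look like in practice can be sketched minimally. This is an assumption-laden illustration of the recommendation above, not any company's actual pipeline; the record fields and helper names are invented for the example.

```python
# Minimal sketch of a training-data provenance ledger -- an illustration of
# the recommendation above, not any company's actual pipeline.
import hashlib
from dataclasses import dataclass

@dataclass
class ProvenanceRecord:
    source_url: str   # where the file was obtained
    license_id: str   # e.g. "CC-BY-4.0" or a license contract ID; "" if unknown
    sha256: str       # content hash, tying the record to the exact bytes used

def register(content: bytes, source_url: str, license_id: str) -> ProvenanceRecord:
    """Record origin and licensing for one training input."""
    return ProvenanceRecord(source_url, license_id, hashlib.sha256(content).hexdigest())

def unlicensed(records: list[ProvenanceRecord]) -> list[ProvenanceRecord]:
    """Flag inputs with no documented license before training begins."""
    return [r for r in records if not r.license_id]

ledger = [
    register(b"licensed text", "https://example.com/a", "commercial-license-001"),
    register(b"mystery text", "https://example.com/b", ""),  # provenance gap
]
print(len(unlicensed(ledger)))  # → 1 flagged record
```

The design choice worth noting is the content hash: it lets an auditor later prove which exact bytes a license record covers, which is the kind of evidence this ruling suggests AI companies will need to produce.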
For Authors and Publishers:
Register copyrights promptly—this case only includes registered works.
Monitor your titles in AI-generated content or output to detect misuse.
Consider joining collective actions or rights management initiatives to negotiate AI licenses.
Conclusion
This ruling is not just about pirated books. It’s about whether generative AI companies can build billion-dollar models by feeding on stolen work—and whether the legal system is ready to step in. The court’s answer is clear: they can be held accountable. And the industry is now on notice.
