ChatGPT about Anthropic's filing: "Training fair use is strongest when paired with demonstrable, continuously improved technical and policy measures to prevent substitute outputs."

And with remedies targeted at leakage rather than at the existence of the model. Training can be protected more broadly, but only if the deployed system is not effectively a piracy kiosk in practice.

The “Transformative” Training Defense Meets the Lyrics Problem: Anthropic’s Bid to Win the Fair-Use War Early

by ChatGPT-5.2

Anthropic’s summary-judgment filing in the music publishers’ lyrics case is a carefully constructed attempt to split the dispute into two different legal universes: inputs (training) and outputs (what Claude returns to users). Its core move is to persuade the court that training on lyrics is categorically fair use as a matter of law, while outputs are (a) rare, (b) often provoked by plaintiffs or jailbreaking, and (c) legally mis-framed by the publishers as “direct” infringement by Anthropic rather than user-instigated conduct that must be analyzed, if at all, under secondary liability standards.

1) What Anthropic is arguing

A. Training on lyrics is “transformative” fair use (inputs theory).

Anthropic’s headline argument is that copying lyrics into a training corpus serves a fundamentally different purpose than the purpose of lyrics themselves. Lyrics are created for artistic expression; Anthropic says it uses them (along with “billions” of other copyrighted works) to train Claude to understand language and concepts, producing a general-purpose system that can reason, code, summarize, and create. It frames this as “transformative” under the Supreme Court’s modern fair-use lens: the use “adds something new” and is not a substitute for the original. Anthropic leans hard on two district-court decisions that have treated LLM training as extremely transformative and thus strongly protected.

B. Market harm must be “substitution,” not “competition,” and licensing claims can’t be circular.

Anthropic tries to neutralize the publishers’ most intuitive attack—“you copied without paying and now we lose a licensing market”—by insisting that fourth-factor harm can’t be defined as “we would have licensed this if you had asked.” Otherwise every fair-use case would collapse into “pay me.” It also attacks the publishers’ theory that AI-assisted creation of new lyrics “dilutes” the market for existing lyrics, arguing that copyright doesn’t protect against competition from new works that don’t copy protected expression.

C. Outputs claims fail on liability architecture (outputs theory).

Anthropic argues the publishers are trying to “shoehorn” outputs into direct infringement, but direct liability in the Ninth Circuit requires volitional conduct—the party who “instigated” the copying. Here, Anthropic says, that’s the user who prompted Claude. If the publishers want to pursue Anthropic, they need a secondary liability theory (contributory/vicarious), and Anthropic argues those fail because (1) Claude has enormous non-infringing uses, (2) there is no evidence Anthropic encouraged infringement or built a service “tailored” to infringement, and (3) many alleged infringements were generated by the publishers’ own agents attempting to jailbreak or trick the system.

D. The DMCA claim fails because there’s no actionable removal of CMI tied to infringement.

On the DMCA theory (removal of copyright management information), Anthropic argues (i) it never possessed the copyrighted “work” in full because the asserted works are musical compositions while what was ingested was lyrics scraped from the web; and (ii) even if something like CMI were missing, the DMCA requires intent/knowledge tied to enabling or concealing infringement—something incompatible with copying that is fair use.

E. Even if the court dislikes Anthropic’s big fair-use theory, publishers still don’t deserve summary judgment.

Anthropic also positions itself defensively: even if the judge is not ready to bless training fair use outright, publishers still can’t win on summary judgment because market harm is disputed, many outputs are arguably fair use (commentary/parody/criticism), and there are disputes about establishing the authoritative lyric text for each work.

2) Are Anthropic’s arguments robust?

On inputs (training): fairly strong—in this specific procedural posture and venue.

Anthropic’s training fair-use argument is robust in one very practical sense: it is aligned with a growing line of reasoning in U.S. cases that treat large-scale copying for new computational functions as transformative, especially where the use is not a market substitute for enjoying the original expression (search, indexing, plagiarism detection, etc.). The motion is built to reassure the court that what’s being created is not a lyric database but a general tool with overwhelmingly non-substitutive uses. That is exactly the kind of framing that tends to do well under factor one in “technology-enabling” fair-use cases.

Where the argument is less robust is where it quietly depends on contested factual and normative premises:

  • “Not a substitute” is persuasive at the level of “training is different from consumption,” but courts can become uneasy if they believe the secondary use is effectively a production system that can replace demand for the originals at scale. Anthropic tries to quarantine that anxiety by separating inputs from outputs. The court may or may not accept that separation cleanly.

  • “Lyrics are freely available online” helps on factor two and sometimes on the equities, but it can also prompt skepticism: “free online” does not mean “free to copy for any purpose,” and it doesn’t resolve whether training creates a market substitute for licensing in the AI context.

  • The licensing-market debate is where the case will actually be fought. Anthropic is right that “you could have licensed it” can’t automatically defeat fair use. But publishers will argue there is (or should be) a distinct market for AI training rights—precisely because training can power substitute outputs at scale. Courts are still feeling their way through whether and how to recognize that market without making fair use meaningless.

On outputs: strong on doctrine, but vulnerable on “what actually happened.”

Anthropic’s liability-architecture argument—direct vs secondary liability and the need for volitional conduct—is doctrinally serious. Courts generally resist collapsing platform/service providers into direct infringers when user prompts are the immediate cause of copying. But the vulnerability is evidentiary and behavioral: if plaintiffs can show Claude predictably produced substantial verbatim lyrics under normal use (not exotic jailbreaking), or if they can show guardrails were knowingly inadequate for a long time, a court might be more receptive to secondary liability theories or to narrowing remedies.

On the DMCA: strong as a cleanup claim.

DMCA §1202 claims often fail when plaintiffs cannot identify concrete instances of CMI removal tied to infringement with the required mental state. Anthropic’s “double scienter” emphasis is a classic—and often effective—way to dispose of DMCA add-ons.

3) Do I, ChatGPT, agree? And what is the “objective legal truth” (or should be)?

There isn’t a single “objective truth” yet because U.S. fair use here is not a mechanical test; it is an elastic balancing standard being applied to a novel industrial practice. That said, there are two “truths” worth separating—one descriptive (what current doctrine tends to do) and one normative (what the law should do, if courts want coherence and legitimacy).

Descriptive legal truth (what current doctrine is most likely to do):

  1. Training-only claims are most likely to succeed as fair use when framed as non-substitutive, transformative technological development, especially if outputs are separately constrained and the record supports that regurgitation is uncommon.

  2. Outputs that reproduce verbatim (or near verbatim) lyrics are unlikely to be excused categorically, but the right doctrinal lane will usually be: user instigation + secondary liability standards + a fact-heavy look at guardrails, inducement, and practical control.

  3. “AI dilution” as a market-harm theory based on competition from new works is legally weak, because copyright does not protect against competition from independently created expression.

Normative legal truth (what the law should be to avoid incoherence):

  • Courts should separate (a) training that functions like analysis/learning from (b) system behavior that functions like on-demand distribution of copyrighted text. Training can be protected more broadly, but only if the deployed system is not effectively a piracy kiosk in practice.

  • If courts bless training fair use too broadly without insisting on meaningful anti-regurgitation measures, they risk creating a perverse equilibrium: “copy everything now, fix leakage later.” A more legitimate standard is: training fair use is strongest when paired with demonstrable, continuously improved technical and policy measures to prevent substitute outputs (and with remedies targeted at leakage rather than at the existence of the model).
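To make the "anti-regurgitation measures" idea concrete: one crude technical proxy for a "substitute output" is a long verbatim run of tokens shared between a model's output and a protected text. The sketch below is purely illustrative—it is not Anthropic's actual guardrail, and the function names, the token-level n-gram approach, and the threshold of eight tokens are all assumptions chosen for the example.

```python
# Hypothetical sketch of a leakage check: flag an output that reproduces
# a long contiguous run of tokens from any protected reference text.
# This is a crude proxy for "verbatim regurgitation", not a real product filter.

def ngrams(tokens, n):
    """Set of all contiguous n-token runs in the sequence."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def leaks_verbatim(output, protected_texts, max_run=8):
    """True if `output` shares more than `max_run` consecutive tokens
    (i.e., at least one (max_run + 1)-gram) with any protected text."""
    out_grams = ngrams(output.lower().split(), max_run + 1)
    if not out_grams:
        return False  # output too short to contain a flaggable run
    return any(
        out_grams & ngrams(ref.lower().split(), max_run + 1)
        for ref in protected_texts
    )
```

A real system would need far more than this (normalization, paraphrase detection, a licensed reference corpus), but even a toy version shows why the normative standard above is auditable: a court or auditor can ask whether such checks exist, what thresholds they use, and whether they are tightened over time.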

4) Likely litigation outcomes

Most plausible near-term outcome: partial win for Anthropic (training), continued fight on outputs.

A common judicial path is:

  • grant (or strongly signal) summary judgment for Anthropic on training fair use, because it’s the cleanest doctrinal question and because courts are wary of letting copyright become a veto on general-purpose computation;

  • deny summary judgment (or narrow the scope) on outputs, because outputs are messy, fact-dependent, and tied to user prompts, guardrails, and what counts as “verbatim” vs “fair-use commentary.”

Second plausible outcome: no categorical ruling on training; court punts to trial / narrows issues.

If the judge thinks the “AI licensing market” question is too entangled with factual disputes, she could deny summary judgment on training and force a trial record (or at least a more developed evidentiary showing) before deciding whether the market harm factor is legally cognizable in this context.

Longer-term likely outcome: appellate pressure and eventual circuit guidance.

Whatever happens at summary judgment, this class of cases is heading toward appellate resolution because the stakes are systemic and lower courts are not perfectly aligned. Expect any decisive ruling on training fair use to be appealed, with the Ninth Circuit eventually forced to articulate a clearer rule for (1) transformation in AI training, and (2) how to treat asserted markets for AI training licenses without turning factor four into a rightsholder veto.

Settlement remains a live possibility, but likely after a major ruling.

Because training fair use is the keystone, a strong win for either side will reset bargaining power dramatically. A partial ruling (training fair use recognized; outputs remain constrained) is the kind of outcome that often catalyzes settlement: the developer accepts operational constraints and targeted damages exposure; the rightsholders pivot toward licensing for product features and distribution rather than trying to block training itself.
