GEMA v. OpenAI: The allegation was that these lyrics had been used to train ChatGPT without a license and that ChatGPT subsequently reproduced them nearly verbatim for users.
The court accepted GEMA’s core argument that this constituted unlicensed Vervielfältigung (reproduction) and Wiedergabe (making available/communication to the public) under German copyright law.
GEMA’s Opening Victory Against OpenAI—What Happened, Why It Matters, and How AI Makers Should Respond
by ChatGPT-5
The judgment of the Munich Regional Court (Landgericht München I) in the dispute GEMA v. OpenAI marks a significant and symbolic moment in the rapidly evolving conflict between copyright law and generative AI. Although this is just a first-instance ruling, it signals the willingness of European courts—particularly in Germany, with its strong tradition of protecting authors’ rights—to intervene decisively when they see unauthorized use of protected works in the training or output of AI systems.
What occurred
GEMA, the German collecting society responsible for managing rights and remuneration for songwriters and composers, brought a lawsuit against OpenAI after observing that ChatGPT reproduced substantial parts of copyrighted song lyrics when prompted by users. The case centered on nine iconic German songs, including “Männer” by Herbert Grönemeyer, “Über den Wolken” by Reinhard Mey, and the children’s classic “In der Weihnachtsbäckerei.” The allegation was that these lyrics had been used to train ChatGPT without a license and that ChatGPT subsequently reproduced them nearly verbatim for users.
The court—specifically the 42nd Civil Chamber—accepted GEMA’s core argument that this constituted unlicensed Vervielfältigung (reproduction) and Wiedergabe (making available/communication to the public) under German copyright law. OpenAI was therefore found liable and ordered to pay damages.
Important to note: the judgment is not yet final and is likely to be appealed. Higher courts, possibly the Federal Court of Justice (BGH) or even the CJEU, may ultimately have to clarify fundamental issues.
What the grievances were
GEMA presented three main grievances:
Unauthorised training use:
GEMA argued that OpenAI trained ChatGPT on full copyrighted texts without obtaining a license. Training itself, GEMA claimed, required explicit permission.
Near-verbatim reproduction of lyrics:
Evidence showed that ChatGPT could output lyrics that were “almost identical” to the originals. This was deemed proof that OpenAI had ingested and stored those works in ways amounting to unlawful copying.
Commercial exploitation without remuneration:
GEMA did not oppose generative AI in principle. Rather, it demanded that rights holders be paid licensing fees for any use of their works, including training and output.
OpenAI countered with a technical argument: ChatGPT’s outputs were the product of a “sequenziell-analytisch, iterativ-probabilistische Synthese” (a sequential-analytical, iterative-probabilistic synthesis), essentially a probabilistic statistical reconstruction rather than stored copies of original works.
The court, however, gave little weight to this explanation, focusing instead on the empirical fact that the system reproduced protected lyrics almost verbatim.
The outcome
The court ruled in favour of GEMA:
OpenAI cannot use copyrighted lyrics without a license.
OpenAI is liable for damages.
The ruling treats generative AI output as a form of reproduction, not abstract statistical synthesis.
The case will likely move through multiple appeals.
The decision fits within a broader landscape of litigation internationally, where authors, musicians, journalists, and other creators are challenging unlicensed training. The article notes that lawsuits are underway both in Europe and the US, and that there is little established case law because generative AI does not fit neatly into existing categories of reproduction or making available.
Do I agree with the judgment?
On balance, yes—given the current state of the evidence and the behavior of the AI outputs, the ruling is justified.
Empirical reality matters.
If a system emits nearly identical copyrighted text in response to straightforward prompts, courts reasonably infer that training involved specific reproductions of the work.
Technical explanations are insufficient without safeguards.
OpenAI’s argument about “iterativ-probabilistische Synthese” does not resolve the central issue: end-users received protected lyrics without a license.
Germany’s legal tradition gives weight to authors’ rights.
Courts in Germany reliably prioritize creators’ moral and economic rights. Under that lens, the ruling aligns with decades of jurisprudence.
The power imbalance is real.
GEMA represents individual songwriters who cannot effectively negotiate or enforce rights on their own against a multi-billion-dollar AI company. The court’s decision helps rebalance that asymmetry.
Where I would add nuance: future cases must distinguish between training and output reproduction, because the legal basis and technical mechanisms differ. Not every generative model output should be treated as a copy, but when verbatim reproduction occurs, infringement is evident and should be actionable.
How AI makers can prevent this from happening in the first place
Secure training licenses proactively.
Instead of scraping music lyrics (or allowing datasets containing them), AI developers should negotiate collective licensing agreements with rights-management organizations, similar to how Spotify or YouTube handle music rights.
Implement output filters and reproducibility controls.
Block near-verbatim reproduction of copyrighted texts.
Watermark or fingerprint outputs that are too close to training data.
Use robust similarity-detection layers to prevent leakage.
Maintain transparent datasets.
Courts repeatedly punish opacity. AI companies must:
document provenance
disclose sources
allow auditing under NDA
Collaborate with rights holders on opt-in and opt-out mechanisms.
Allow creators to submit or exclude their works.
Provide clear reporting.
Pay for inclusion, like any other licensing arrangement.
Shift from “everything on the internet is free to ingest” to responsible curation.
The era of indiscriminate scraping is over. AI’s business model must evolve.
Develop technical architectures where memorization is minimized.
Techniques that can reduce the risk of verbatim reproduction include:
retrieval-augmented generation (with rights-cleared databases)
differential privacy
synthetic corpora
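To make the output-filter idea above concrete, here is a minimal sketch of one possible similarity check: it measures how many of a candidate output's word 5-grams also appear in a protected reference text and flags the output when the share is high. This is an illustration only, not a description of how OpenAI or any other provider actually filters output. The function names, the n-gram length, and the 0.5 threshold are all assumptions chosen for the example, and a production system would need a far larger reference index and more robust matching.

```python
import re
from typing import Iterable


def word_ngrams(text: str, n: int = 5) -> set:
    """Return the set of lowercase word n-grams in `text`."""
    words = re.findall(r"\w+", text.lower())
    # Texts shorter than n words yield an empty set.
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_ratio(candidate: str, reference: str, n: int = 5) -> float:
    """Share of the candidate's n-grams that also occur in the reference."""
    cand = word_ngrams(candidate, n)
    if not cand:
        return 0.0
    ref = word_ngrams(reference, n)
    return len(cand & ref) / len(cand)


def is_near_verbatim(
    candidate: str,
    protected_texts: Iterable[str],
    n: int = 5,
    threshold: float = 0.5,  # hypothetical cut-off, to be tuned on real data
) -> bool:
    """Flag the candidate if it overlaps heavily with any protected text."""
    return any(
        overlap_ratio(candidate, ref, n) >= threshold for ref in protected_texts
    )
```

A filter of this kind would sit between the model and the user: if `is_near_verbatim` returns True for a draft response, the system could withhold or shorten it. The placeholder texts passed in would come from a rights-cleared index supplied by rights holders, which is itself an assumption about how such a collaboration might work.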
Conclusion
GEMA’s opening victory is both a wake-up call and an inevitable milestone. Copyright systems have weathered every technological shift (radio, CDs, streaming), and generative AI is no exception. Until AI companies embrace licensing, transparency, and technical safeguards, courts will continue to intervene. This case, rooted in nine iconic songs, may ultimately reshape global norms for how AI is trained, audited, and regulated. And while appeals will follow, the direction of travel is clear: rights holders expect to be compensated, and courts are increasingly willing to enforce that expectation.
