The GEMA vs. OpenAI Case – Copyright, Creativity, and the Future of AI Training
by ChatGPT-4o. Source: Frankfurter Allgemeine Zeitung
In late September 2025, a historic moment unfolded in courtroom 134 of the Munich Justizpalast. For the first time, a major rights society, Germany’s GEMA (Gesellschaft für musikalische Aufführungs- und mechanische Vervielfältigungsrechte), brought a case against OpenAI, alleging systematic copyright infringement in the training and outputs of ChatGPT. The clash is not just about lyrics — it is about who gets to define the rules of cultural participation in the AI era.
The Core Allegations
GEMA, representing around 100,000 songwriters and composers, claims that OpenAI trained ChatGPT on copyrighted German song lyrics without acquiring licenses or compensating authors. The evidence centers on ChatGPT’s ability, when prompted, to reproduce original texts by artists such as Reinhard Mey, Inga Humpe, Rolf Zuckowski, and Kristina Bach (author of Helene Fischer’s iconic “Atemlos durch die Nacht”).
The society argues:
Direct reproduction: ChatGPT can deliver full, unaltered lyrics — evidence of unauthorized copying.
Systematic use: OpenAI knowingly ingested vast amounts of protected works, disregarding licensing norms.
Commercial exploitation: With ChatGPT generating $3.7 billion in 2024 revenues, GEMA insists this is not research or educational use but a clear for-profit venture.
Rightsholder opt-out ignored: German authors exercised their legal ability to block such uses via GEMA’s blanket opt-out. Training on these works was therefore unlawful from the outset.
OpenAI’s Defense
OpenAI rejects the claims, advancing several lines of argument:
No direct copying: The system “reflects” statistical patterns rather than storing and regurgitating lyrics verbatim. When lyrics appear, it is incidental or slightly altered.
Misunderstanding of AI: GEMA, OpenAI argues, misinterprets how generative AI works. Models do not keep databases of texts but compress training data into probabilistic patterns.
Safeguards exist: ChatGPT is designed to reject requests for verbatim song lyrics, signaling respect for copyright boundaries.
Constructive engagement: The company points to ongoing negotiations with rights holders worldwide as evidence of its commitment to equitable solutions.
This defense leans heavily on technical explanations of LLM behavior and the notion that output similarity does not necessarily prove infringement. Yet, GEMA’s examples of near-identical reproduction undermine the “purely transformative” framing.
Legal Intricacies
The case sits at the intersection of copyright law and new technologies, raising thorny questions:
Scope of text and data mining (TDM) exceptions:
EU and German law allow limited use of copyrighted works for research and non-commercial purposes. But OpenAI’s commercial revenues strip away this shield. Whether TDM exceptions can apply to generative AI is unsettled and highly controversial.
Threshold for infringement:
Must plaintiffs prove the AI reproduces works verbatim? Or is training itself, even without reproduction, an infringement? GEMA argues the act of training without consent is already unlawful.
Role of opt-outs:
EU copyright rules permit rights holders to block text/data mining. GEMA formally did so for its catalog. If the court validates this opt-out, it would mean OpenAI had explicit notice and ignored it, a damning finding.
Judicial signals:
Judge Elke Schwager indicated sympathy with GEMA’s arguments, suggesting that German courts may treat the case not as a grey area but as a straightforward violation. However, she retains the option of referring questions to the European Court of Justice (ECJ), where a Hungarian publisher’s case against Google is already pending. A ruling from Luxembourg could harmonize EU law on this issue.
Comparisons to U.S. law:
In America, similar cases (e.g., Universal/Concord/ABKCO vs. Anthropic) hinge on the “fair use” doctrine. Europe lacks such a broad carveout. German law is stricter, with narrower exceptions. This makes GEMA’s chances stronger than those of its U.S. counterparts.
GEMA’s Chances of Success
GEMA enters this battle from a position of strength:
Judicial leanings: The presiding judge already hinted that GEMA’s core arguments are persuasive.
Strong evidence: Demonstrations of ChatGPT outputting full lyrics are powerful, easy-to-understand evidence for a court.
Clear opt-out: The fact that GEMA explicitly opted its repertoire out of text/data mining undermines OpenAI’s defenses.
Public and political climate: European policymakers and courts have shown consistent willingness to prioritize authors’ rights in the face of Big Tech expansion.
The main uncertainty lies in whether the case remains at the Munich court or escalates to the ECJ. A referral could delay resolution but might deliver a binding precedent for the entire EU. If decided in Munich, an injunction against OpenAI in Germany is highly plausible.
Implications Beyond Germany
Should GEMA prevail, ripple effects will be enormous:
Licensing regimes become unavoidable: AI companies would be compelled to negotiate licenses with collecting societies — similar to streaming platforms.
Costs of training rise: Accessing high-quality cultural content would require upfront payments, potentially restructuring AI economics.
Global fragmentation: Different outcomes in the EU and U.S. could lead to divergent AI ecosystems — one license-heavy, one more permissive under fair use.
The case is also closely tied to parallel GEMA suits against Suno and Udio, which generate entire songs rather than just lyrics. These expand the scope from text to full compositions, raising the stakes even higher.
What OpenAI and Others Must Do
To avoid further litigation and regulatory risk, AI companies need to adapt quickly:
Engage in blanket licensing:
Proactively seek deals with collecting societies (GEMA, PRS, SACEM, ASCAP, BMI) to cover training use. This would mirror the model of Spotify and YouTube.
Adopt transparent datasets:
Publicly disclose training sources and allow rightsholders to verify whether their works are included. Transparency reduces suspicion and legal exposure.
Develop opt-in frameworks:
Instead of relying on contested opt-outs, offer mechanisms for authors to grant permission in exchange for compensation.
Build technical safeguards:
Strengthen filters that block verbatim reproduction of copyrighted works. Demonstrating best-effort compliance could mitigate damages.
Pursue collective negotiations:
Individual lawsuits are inefficient. A collective framework, perhaps under EU supervision, could set standardized fees for training and output royalties.
Rethink business models:
Stop assuming that free scraping equals free innovation. The AI industry must transition to licensing as a cost of doing business, just as music and film services did.
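To make the "technical safeguards" recommendation above more concrete, here is a minimal sketch of one way an output filter could flag verbatim reproduction: index every run of n consecutive words from a set of protected texts, then measure what share of a candidate response's n-grams appear in that index. This is an illustrative assumption, not a description of how ChatGPT's safeguards actually work. The class name `VerbatimFilter`, the parameters `n` and `max_overlap`, and the placeholder text are all hypothetical.

```python
import re


def normalize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"\w+", text.lower())


def ngrams(tokens: list[str], n: int) -> set[tuple[str, ...]]:
    """Return the set of all runs of n consecutive tokens."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


class VerbatimFilter:
    """Hypothetical filter: flags outputs that share many n-grams with protected texts."""

    def __init__(self, protected_texts: list[str], n: int = 8, max_overlap: float = 0.2):
        self.n = n                      # length of the word runs to compare (assumed value)
        self.max_overlap = max_overlap  # share of matching n-grams tolerated (assumed value)
        self.index: set[tuple[str, ...]] = set()
        for text in protected_texts:
            self.index |= ngrams(normalize(text), n)

    def overlap(self, candidate: str) -> float:
        """Fraction of the candidate's n-grams that also occur in a protected text."""
        grams = ngrams(normalize(candidate), self.n)
        if not grams:
            return 0.0  # candidate is shorter than n words
        return len(grams & self.index) / len(grams)

    def should_block(self, candidate: str) -> bool:
        return self.overlap(candidate) > self.max_overlap


# Placeholder standing in for a protected lyric (not a real song text).
protected = ["the quick brown fox jumps over the lazy dog near the quiet river bank at dawn"]
flt = VerbatimFilter(protected, n=4, max_overlap=0.2)

verbatim = ("Sure! Here it is: The quick brown fox jumps over the lazy dog "
            "near the quiet river bank at dawn.")
paraphrase = "A fast brown fox leaps across a sleepy dog by the river."

print(flt.should_block(verbatim))    # the response repeats the protected text word for word
print(flt.should_block(paraphrase))  # the response shares no four-word run with it
```

A filter like this only catches word-for-word runs. Lightly altered text, translations, and melodies would need other techniques, and the filter says nothing about whether training on the works was lawful in the first place.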
Conclusion
The GEMA vs. OpenAI case crystallizes a central tension of the AI age: the hunger for data versus the rights of creators. GEMA’s legal footing is unusually strong, supported by explicit opt-outs, direct examples of reproduction, and a court sympathetic to its claims. While OpenAI may argue technical nuance and misunderstanding, such defenses are unlikely to resonate with judges seeking clear lines of liability.
The likely outcome is a turning point: either a court ruling or an ECJ referral that forces AI companies into structured licensing. For GEMA, success would not only secure fair remuneration for its members but also reshape the global debate on AI and copyright. For OpenAI and its peers, the path forward is clear: negotiate, license, and adapt — or risk being locked out of key markets.
