A recent copyright lawsuit involving Anthropic’s Claude AI has taken an unexpected turn — not because of a groundbreaking legal precedent, but because of a hallucinated citation generated by the AI model itself.
Anthropic’s lawyers were forced to issue a public apology after Claude fabricated a nonexistent legal precedent in a brief submitted as part of the company’s defense. The fabricated citation purported to support the argument that AI training on copyrighted lyrics falls under fair use. However, the court and opposing counsel quickly discovered that the cited case didn’t exist, a striking example of how AI-generated content, when used without verification, can derail serious legal arguments.
The Actual Copyright Question: Can AI Train on Lyrics?
At the heart of the lawsuit is a vital and unresolved question: Does using copyrighted song lyrics to train an AI model constitute copyright infringement?
From a legal standpoint, copyright is about copying and distribution — the exclusive right of the creator to control reproduction and public sharing of their work. The plaintiffs argue that by feeding copyrighted lyrics into its large language model, Anthropic is engaging in unauthorized copying, and thus violating copyright law, even if the lyrics are never reproduced verbatim.
Anthropic, like others in the AI field, argues that training is not the same as copying for distribution. The claim is that models don’t memorize and regurgitate copyrighted works; they learn statistical patterns. Much as a human songwriter might absorb the feel of Dylan or Beyoncé and then write something original, an AI model learns structure, meter, rhyme, and tone; it does not store and reproduce the lyrics themselves unless prompted in ways that coax it to do so.
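To make the “statistical patterns” claim concrete, here is a deliberately tiny sketch, nothing like a production LLM: a bigram model in Python whose trained artifact is just a table of word-transition counts. The sample lines and the `generate` helper are invented for illustration. The point is that what training retains is statistics about which words tend to follow which, not a stored copy of the input text.

```python
from collections import Counter, defaultdict
import random

# Toy bigram "model": invented sample lines, purely illustrative.
training_lines = [
    "the night is young and the song is long",
    "the song is sweet and the night is cold",
]

# Training: record only how often each word follows another.
counts = defaultdict(Counter)
for line in training_lines:
    words = line.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1

# The trained artifact is `counts`, a table of transition statistics;
# the original lines are no longer needed and could be discarded.

def generate(start: str, length: int = 8) -> str:
    """Sample a new sequence from the learned statistics."""
    word, out = start, [start]
    for _ in range(length):
        if word not in counts:
            break
        followers = list(counts[word])
        weights = list(counts[word].values())
        word = random.choices(followers, weights=weights)[0]
        out.append(word)
    return " ".join(out)

print(generate("the"))  # e.g. "the night is sweet and the song is long"
```

Worth noting: with a corpus this small, the sketch will happily emit near-verbatim runs of its training lines, which is precisely the memorization concern behind the caveat above about models being coaxed into reproducing lyrics. Whether and when that happens at real scale is exactly what this litigation is probing.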
FREEARTISTS.ORG’s position is blunt:
“Training does not equate to copyright. Training isn’t copying and distributing.”
That’s a crucial distinction. If training a model involves ingesting material without creating a usable copy for distribution or display, is it really an infringement? Or is it closer to reading and learning, acts that are not, in themselves, restricted by copyright law?
How the Hallucination Muddies the Waters
Rather than helping resolve that question, Claude’s fabricated citation has become a distraction. The court is now bogged down with questions about the credibility of the AI’s legal output and whether Anthropic’s legal team performed due diligence. It draws attention away from the real issue: how do we define and regulate the use of copyrighted works in model training — especially when models become capable of mimicking or even reconstructing parts of those works?
This legal misstep could unfortunately delay or derail a substantive ruling on whether AI training on lyrics constitutes fair use or infringement. The music industry and AI developers alike are watching closely, because the outcome of this case could shape how models are built, how datasets are sourced, and how future creativity is regulated.
Why This Matters
If the court ultimately rules that training on copyrighted material without a license is infringement, it could:
- Require all AI developers to license massive datasets, including song lyrics
- Put small or open-source models at a severe disadvantage
- Threaten academic and non-commercial research
On the other hand, a ruling in favor of Anthropic could:
- Affirm that learning from information is not the same as copying it
- Reinforce a broader interpretation of fair use
- Trigger new licensing frameworks, where outputs — not training — are the point of concern
In short, this case was poised to address the foundational debate over copyright in the age of AI, but it’s now veering off course because of a machine-generated fiction. If we’re going to meaningfully navigate the ethics and legality of AI training, we’ll need to keep the focus on intent, outcome, and use, not just the inputs — and certainly not on AI’s made-up legal history.