Anthropic wins ruling on AI training in copyright lawsuit but must face
trial on pirated books
[June 25, 2025]
By MATT O'BRIEN
In a test case for the artificial intelligence industry, a federal judge
has ruled that AI company Anthropic didn’t break the law by training its
chatbot Claude on millions of copyrighted books.
But the company is still on the hook and must now go to trial over how
it acquired those books by downloading them from online “shadow
libraries” of pirated copies.
U.S. District Judge William Alsup of San Francisco said in a ruling
filed late Monday that the AI system's distilling from thousands of
written works to be able to produce its own passages of text qualified
as “fair use” under U.S. copyright law because it was “quintessentially
transformative.”
“Like any reader aspiring to be a writer, Anthropic’s (AI large language
models) trained upon works not to race ahead and replicate or supplant
them — but to turn a hard corner and create something different,” Alsup
wrote.
But while dismissing a key claim made by the group of authors who sued
the company for copyright infringement last year, Alsup also said
Anthropic must still go to trial in December over its alleged theft of
their works.
“Anthropic had no entitlement to use pirated copies for its central
library,” Alsup wrote.

A trio of writers — Andrea Bartz, Charles Graeber and Kirk Wallace
Johnson — alleged in their lawsuit last summer that Anthropic's
practices amounted to “large-scale theft,” and that the San
Francisco-based company “seeks to profit from strip-mining the human
expression and ingenuity behind each one of those works.”
Books are known to be important sources of the data — in essence,
billions of words carefully strung together — that are needed to build
large language models. In the race to outdo each other in developing the
most advanced AI chatbots, a number of tech companies have turned to
online repositories of stolen books that they can get for free.
Documents disclosed in San Francisco's federal court showed Anthropic
employees' internal concerns about the legality of their use of pirate
sites. The company later shifted its approach and hired Tom Turvey, the
former Google executive in charge of Google Books, a searchable library
of digitized books that successfully weathered years of copyright
battles.

The Anthropic website and mobile phone app are shown in this photo,
in New York, July 5, 2024. (AP Photo/Richard Drew, File)
With his help, Anthropic began
buying books in bulk, tearing off the bindings and scanning each
page before feeding the digitized versions into its AI model,
according to court documents. But that didn't undo the earlier
piracy, according to the judge.
“That Anthropic later bought a copy of a book it earlier stole off
the internet will not absolve it of liability for the theft but it
may affect the extent of statutory damages,” Alsup wrote.
The ruling could set a precedent for similar lawsuits that have
piled up against Anthropic competitor OpenAI, maker of ChatGPT, as
well as against Meta Platforms, the parent company of Facebook and
Instagram.
Anthropic — founded by ex-OpenAI leaders in 2021 — has marketed
itself as the more responsible and safety-focused developer of
generative AI models that can compose emails, summarize documents
and interact with people in a natural way.
But the lawsuit filed last year alleged that Anthropic’s actions
“have made a mockery of its lofty goals” by building its AI product
on pirated writings.
Anthropic said Tuesday it was pleased that the judge recognized that
AI training was transformative and consistent with “copyright’s
purpose in enabling creativity and fostering scientific progress.”
Its statement didn't address the piracy claims.
The authors' attorneys declined comment.
All contents © copyright 2025 Associated Press. All rights reserved