Five Publishers and Scott Turow Sue Meta and Zuckerberg Over Llama Training on Pirated Books
Elsevier, Hachette, and three other major publishers filed a class action in New York alleging Meta torrented pirated books to train Llama, with Zuckerberg personally approving the strategy.
Editor's Note
Correction: The article's 'What We Know' section states 'According to The Next Web, court records indicate that Meta employees torrented roughly 82 terabytes of pirated material.' In The Next Web source, the 82 terabytes figure (citing Tom's Hardware) refers to piracy volume established in the prior Kadrey v. Meta case filed in 2023, not to the new Elsevier et al. lawsuit filed on May 5, 2026. The new publisher complaint, per Variety, alleges Meta and Zuckerberg authorized the torrenting of over 267 terabytes of pirated material.
Overview
Five major publishing houses and bestselling thriller writer Scott Turow filed a proposed class action lawsuit in federal court in New York on May 5, 2026, accusing Meta Platforms and its chief executive Mark Zuckerberg of committing what the complaint describes as one of the most massive copyright infringements in history. According to Variety, the plaintiffs allege that Meta “illegally torrented millions of copyrighted books and journal articles” and “downloaded unauthorized web scrapes” from pirate libraries to build its Llama large language models. The complaint further claims Zuckerberg “personally authorized and actively encouraged the infringement.”
What We Know
The plaintiffs — Elsevier Inc., Cengage Learning, Inc., Hachette Book Group, Inc., Macmillan Publishing Group, LLC, and McGraw Hill LLC — along with author Scott Turow filed the case, captioned Elsevier Inc. et al. v. Meta Platforms, Inc. and Mark Zuckerberg, in the U.S. District Court for the Southern District of New York, as reported by Hachette Book Group.
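According to The Next Web, which cites Tom’s Hardware, court records in the earlier Kadrey v. Meta case, filed in 2023, indicate that Meta employees torrented roughly 82 terabytes of pirated material, obtaining books from repositories including LibGen, Z-Library, and Anna’s Archive. The new complaint, per Variety, alleges that Meta and Zuckerberg authorized the torrenting of more than 267 terabytes of pirated material. Mark Zuckerberg personally approved using LibGen, despite internal warnings that it was a “data set we know to be pirated.”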
The complaint alleges a deliberate strategic choice. According to Variety, Meta had briefly explored a licensed route, with internal discussions between January and April 2023 about increasing its “dataset licensing” budget to as much as $200 million. Then, in April 2023, Meta “abruptly stopped its licensing strategy,” and an internal note warned that “if we license once single book, we won’t be able to lean into the fair use strategy.” A December 13, 2023, internal memo cited by Variety identified LibGen as “a dataset we know to be pirated.”
The complaint also alleges that Meta stripped copyright management information from the acquired works in order to “conceal its training sources,” per Variety. According to CBS News, the lawsuit further claims AI outputs in some cases reproduce original works verbatim and mirror authors’ personal writing styles.
The publishers are seeking monetary damages and injunctive relief including, notably, the destruction of infringing training data copies, according to Hachette Book Group.
The Licensing Paradox
The publishers’ complaint draws a sharp contrast between Meta’s conduct toward book publishers and its behavior toward news organizations. According to The Next Web, since 2023 Meta has signed licensing deals with Reuters, CNN, Fox News, People Inc., and USA Today, demonstrating that a licensing market already exists, one that Meta selectively bypassed when it came to books and academic journals.
The plaintiffs argue that this selective licensing approach, combined with evidence of Zuckerberg’s direct involvement in the decision to stop pursuing book licensing, distinguishes the new case from previous lawsuits. The Hachette Book Group press release quoted Hachette CEO David Shelley as saying that “Copyright is the bedrock of all creative industries” and that Meta “chose not to compensate rights holders.”
Legal Landscape and Prior Rulings
The new case arrives in a complicated legal environment for AI copyright claims. A previous lawsuit by a group of authors, including Richard Kadrey, Sarah Silverman, and Junot Díaz, raised similar allegations against Meta in the Northern District of California (case No. 3:23-CV-03417). According to Norton Rose Fulbright, on June 25, 2025, the court granted Meta’s motion for partial summary judgment on fair use grounds, determining that LLM training constitutes fair use regardless of whether materials came from legitimate or illegitimate sources. The ruling, however, affected only the rights of the 13 plaintiff authors, and proceedings in that case continued into late 2025 with amended complaints adding contributory infringement claims.
Variety notes that the Kadrey decision rejected similar claims from 13 authors on fair use grounds. The publisher group’s case, however, is structured to address the weaknesses the court identified: it covers the entire catalogs of five major publishers; it can offer direct evidence of market harm, particularly for academic and educational texts, where lost revenue can be traced to students using AI tools as a substitute for the texts; and it benefits from the documented licensing decisions Meta made elsewhere.
Meanwhile, as CBS News reports, the broader AI copyright landscape shifted significantly when Anthropic agreed to a $1.5 billion settlement with authors in the Bartz v. Anthropic case, described as the largest copyright infringement payout in U.S. history. According to the Authors Guild, the settlement covers rightsholders of approximately 500,000 titles, with a final approval hearing scheduled for May 14, 2026.
This newsroom previously reported on the escalating discovery battles in AI copyright cases, including a federal court order compelling OpenAI to produce over 100 million ChatGPT conversation logs.
Meta’s Response
Meta rejected the lawsuit’s framing. A Meta spokesperson, as quoted by Variety and CBS News, stated that “courts have rightly found that training AI on copyrighted material can qualify as fair use” and that the company will “fight this lawsuit aggressively.”
What We Don’t Know
The complaint does not disclose a specific dollar amount being sought in damages. No trial date has been set, and the case must first proceed through class certification — a process that The Next Web indicates could take 18 to 24 months before reaching summary judgment or trial.
Whether the publishers’ case will survive the fair-use rulings that blocked earlier author lawsuits remains a central open question. The core legal theory — that Meta’s selective licensing of news content while torrenting books demonstrates willfulness and the existence of a viable licensing market — has not yet been tested by a court.