Occasional Digest

Anthropic’s $1.5-billion settlement signals new era for AI and artists

Chatbot builder Anthropic agreed to pay $1.5 billion to authors in a landmark copyright settlement that could redefine how artificial intelligence companies compensate creators.

The San Francisco-based startup is ready to pay authors and publishers to settle a lawsuit that accused the company of illegally using their work to train its chatbot.

Anthropic developed an AI assistant named Claude that can generate text, images, code and more. Writers, artists and other creative professionals have raised concerns that Anthropic and other tech companies are training their AI systems on creative work without permission or fair compensation.

As part of the settlement, which the judge still needs to approve, Anthropic agreed to pay authors $3,000 per work for an estimated 500,000 books. It’s the largest known settlement in a copyright case, signaling to other tech companies facing copyright infringement allegations that they might eventually have to pay rights holders as well.

Meta and OpenAI, the maker of ChatGPT, have also been sued over alleged copyright infringement. Walt Disney Co. and Universal Pictures have sued AI company Midjourney, which the studios allege trained its image generation models on their copyrighted materials.

“It will provide meaningful compensation for each class work and sets a precedent requiring AI companies to pay copyright owners,” said Justin Nelson, a lawyer for the authors, in a statement. “This settlement sends a powerful message to AI companies and creators alike that taking copyrighted works from these pirate websites is wrong.”

Last year, authors Andrea Bartz, Charles Graeber and Kirk Wallace Johnson sued Anthropic, alleging that the company committed “large-scale theft” and trained its chatbot on pirated copies of copyrighted books.

U.S. District Judge William Alsup of San Francisco ruled in June that Anthropic’s use of the books to train the AI models constituted “fair use,” so it wasn’t illegal. But the judge also ruled that the startup had improperly downloaded millions of books through online libraries.

Fair use is a legal doctrine in U.S. copyright law that allows for the limited use of copyrighted materials without permission in certain cases, such as teaching, criticism and news reporting. AI companies have pointed to that doctrine as a defense when sued over alleged copyright violations.

Anthropic, founded by former OpenAI employees and backed by Amazon, pirated at least 7 million books from Books3, Library Genesis and Pirate Library Mirror, online libraries containing unauthorized copies of copyrighted books, to train its software, according to the judge.

The company also bought millions of print copies in bulk, stripped the books’ bindings, cut their pages and scanned them into digital, machine-readable form, a practice Alsup found to be within the bounds of fair use, according to his ruling.

In a subsequent order, Alsup pointed to potential damages for the copyright owners of books downloaded from the shadow libraries LibGen and PiLiMi by Anthropic.

Although the settlement was massive and unprecedented, it could have been much worse: had Anthropic been charged the maximum penalty for each of the millions of works it used to train its AI, the bill could have exceeded $1 trillion, some calculations suggest.

Anthropic disagreed with the ruling and didn’t admit wrongdoing.

“Today’s settlement, if approved, will resolve the plaintiffs’ remaining legacy claims,” said Aparna Sridhar, deputy general counsel for Anthropic, in a statement. “We remain committed to developing safe AI systems that help people and organizations extend their capabilities, advance scientific discovery, and solve complex problems.”

The Anthropic dispute is one of many cases in which authors, artists and other content creators are demanding that the companies behind generative AI compensate them for using their work to train AI systems.

Training involves feeding enormous quantities of data — including social media posts, photos, music, computer code, video and more — into AI systems so they can discern and mimic patterns of language, images, sound and conversation.

Some tech companies have prevailed in copyright lawsuits filed against them.

In June, a judge dismissed a lawsuit authors filed against Facebook parent company Meta, which also developed an AI assistant, alleging that the company stole their work to train its AI systems. U.S. District Judge Vince Chhabria noted that the lawsuit was tossed because the plaintiffs “made the wrong arguments,” but the ruling didn’t “stand for the proposition that Meta’s use of copyrighted materials to train its language models is lawful.”

Trade groups representing publishers praised the Anthropic settlement on Friday, noting it sends a big signal to tech companies that are developing powerful artificial intelligence tools.

“Beyond the monetary terms, the proposed settlement provides enormous value in sending the message that Artificial Intelligence companies cannot unlawfully acquire content from shadow libraries or other pirate sources as the building blocks for their models,” said Maria Pallante, president and chief executive of the Association of American Publishers in a statement.

The Associated Press contributed to this report.