Source: Reuters
Recently, a group of authors filed a class action lawsuit in a California federal court against artificial intelligence company Anthropic. Writers Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson allege that the company used pirated versions of their books and hundreds of thousands of other works to train its AI-powered chatbot, Claude. The chatbot generates text in response to user requests, a capability that has caused concern in the creative community.
The complaint, filed last Monday, alleges that Anthropic used an open-source dataset known as “The Pile” to train its Claude family of chatbots. Within this dataset is “Books3,” a vast library of pirated eBooks that includes works by Stephen King, Michael Pollan, and thousands of other authors. In early August, Anthropic confirmed to Vox that it used “The Pile” to train Claude.
Although Books3 was removed from the dataset in August last year, the authors claim that the original version of the dataset is still available.
The complaint asserts that the company’s business model seeks to profit from exploiting human creative work. The plaintiffs are therefore seeking damages for those affected and want the firm barred from using the copyrighted content to train Claude.
This case is not unique; it joins other high-profile lawsuits from copyright holders, including visual artists, media organizations, and record labels. These cases have a common denominator: the use of copyrighted material to train generative artificial intelligence systems.
This is not the first lawsuit Anthropic has faced. In October last year, Universal Music Group (UMG), Concord Publishing, and ABKCO Music & Records filed a lawsuit against the AI firm for using “lyrics from numerous musical compositions” to train Claude.