The Free Software Foundation has escalated its dispute with Anthropic beyond financial compensation, demanding that the AI developer release its large language models as free software when trained on materials licensed under the GNU Free Documentation License.
The FSF received a settlement notice as part of the copyright infringement lawsuit Bartz v. Anthropic, in which Anthropic agreed in September to create a $1.5 billion fund to compensate authors whose works it had used to train its models without seeking or securing permission. The settlement stemmed from allegations that Anthropic infringed copyright by downloading works from the Library Genesis and Pirate Library Mirror datasets for the purpose of training large language models.

The FSF's objection centres on principle rather than simple financial compensation. Among the works the FSF holds the copyright to is Sam Williams's 'Free as in Freedom: Richard Stallman's Crusade for Free Software', which was found in datasets Anthropic used as training inputs. The book was published by O'Reilly and by the FSF under the GNU Free Documentation License, a free license that permits use of the work for any purpose without payment.
Rather than simply accepting the settlement payout, the FSF is pushing for something fundamentally different. It argues that the right response is to protect computing freedom by sharing the complete training inputs with every user of the LLM, together with the complete model, the training configuration settings, and the accompanying software source code. The foundation urges Anthropic and other LLM developers that train models on huge datasets downloaded from the internet to provide those LLMs to their users in freedom.
The position reflects a core tension between copyright law and free-software philosophy. In the litigation, Anthropic prevailed on a key point: Judge William Alsup ruled that using the books to train LLMs was fair use, but left for trial the question of whether downloading them for that purpose was lawful. The settlement allowed both parties to avoid the risks of proceeding to trial.
The FSF's leverage in this dispute is limited, however. The foundation has noted it lacks the resources for a protracted legal battle over the issue, acknowledging in an update to its original statement that if it were to participate in a lawsuit such as Bartz v. Anthropic and its copyright and license were found to have been violated, it would certainly request user freedom as compensation. This framing suggests the FSF's demands in the current situation represent an aspirational goal rather than a binding requirement.
The broader context matters here. The $1.5 billion settlement in Bartz v. Anthropic is not just the end of a single dispute but the beginning of a new phase, one in which the AI industry will increasingly have to rely on an organised system of content licensing rather than informal, often unlawful scraping from the internet. The shift to expensive, regulated licensing carries a real risk that only the largest companies will be able to afford the necessary licenses and potential settlements, raising the barrier to entry for smaller startups that lack the capital to pay for licenses and bear legal risk.
The FSF's intervention reflects a genuine philosophical divide over what compensation should look like in the AI training context. Rather than focusing solely on monetary damages, the foundation argues that when an AI model is built using materials specifically licensed to ensure freedom and openness, the resulting model should embody those same principles. Whether this argument gains traction with other copyright holders, policymakers, or courts remains uncertain.