Mitch Stoltz's BestNetTech Profile


Posted on BestNetTech - 21 May 2025 @ 12:46pm

The U.S. Copyright Office’s Draft Report On AI Training Errs On Fair Use

Within the next decade, generative AI could join computers and electricity as one of the most transformational technologies in history, with all of the promise and peril that implies. Governments’ responses to GenAI—including new legal precedents—need to thoughtfully address real-world harms without destroying the public benefits GenAI can offer. Unfortunately, the U.S. Copyright Office’s rushed draft report on AI training misses the mark.

The Report Bungles Fair Use

Released amidst a set of controversial job terminations, the Copyright Office’s report covers a wide range of issues with varying degrees of nuance. But on the core legal question—whether using copyrighted works to train GenAI is a fair use—it stumbles badly. The report misapplies long-settled fair use principles and ultimately puts a thumb on the scale in favor of copyright owners at the expense of creativity and innovation.

To work effectively, today’s GenAI systems need to be trained on very large collections of human-created works—probably millions of them. At this scale, locating copyright holders and getting their permission is daunting for even the biggest and wealthiest AI companies, and impossible for smaller competitors. If training makes fair use of copyrighted works, however, then no permission is needed.

Right now, courts are considering dozens of lawsuits that raise the question of fair use for GenAI training. Federal District Judge Vince Chhabria is poised to rule on this question, after hearing oral arguments in Kadrey v. Meta Platforms. The Third Circuit Court of Appeals is expected to consider a similar fair use issue in Thomson Reuters v. Ross Intelligence. Courts are well-equipped to resolve this pivotal issue by applying existing law to specific uses and AI technologies.

Courts Should Reject the Copyright Office’s Fair Use Analysis

The report’s fair use discussion contains some fundamental errors that place a thumb on the scale in favor of rightsholders. Though the report is non-binding, it could influence courts, including in cases like Kadrey, where plaintiffs have already filed a copy of the report and urged the court to defer to its analysis.   

Courts need only accept the Copyright Office’s draft conclusions, however, if they are persuasive. They are not.   

The Office’s fair use analysis is not one the courts should follow. It repeatedly conflates the use of works for training models—a necessary step in the process of building a GenAI model—with the use of the model to create substantially similar works. It also misapplies basic fair use principles and embraces a novel theory of market harm that has never been endorsed by any court.

The first problem is the Copyright Office’s transformative use analysis. Highly transformative uses—those that serve a different purpose than that of the original work—are very likely to be fair. Courts routinely hold that using copyrighted works to build new software and technology—including search engines, video games, and mobile apps—is a highly transformative use because it serves a new and distinct purpose. Here, the original works were created for various purposes and using them to train large language models is surely very different.

The report attempts to sidestep that conclusion by repeatedly ignoring the actual use in question, training, and focusing instead on how the model may ultimately be used. If the model is ultimately used primarily to create a class of works that are similar to the original works on which it was trained, the Office argues, then the intermediate copying can’t be considered transformative. This fundamentally misunderstands transformative use, which should turn on whether a model itself is a new creation with its own distinct purpose, not whether any of its potential uses might affect demand for a work on which it was trained. The latter is a dubious standard that runs contrary to decades of precedent.

The Copyright Office’s transformative use analysis also suggests that the fair use analysis should consider whether works were obtained in “bad faith,” and whether developers respected the right “to control” the use of copyrighted works.  But the Supreme Court is skeptical that bad faith has any role to play in the fair use analysis and has made clear that fair use is not a privilege reserved for the well-behaved. And rightsholders don’t have the right to control fair uses—that’s kind of the point.

Finally, the Office adopts a novel and badly misguided theory of “market harm.” Traditionally, the fair use analysis requires courts to consider the effects of the use on the market for the work in question. The Copyright Office suggests instead that courts should consider overall effects of the use of the models to produce generally similar works. By this logic, if a model was trained on a Bridgerton novel—among millions of other works—and was later used by a third party to produce romance novels, that might harm series author Julia Quinn’s bottom line.

This market dilution theory has four fundamental problems. First, like the transformative use analysis, it conflates training with outputs. Second, it’s not supported by any relevant precedent. Third, it’s based entirely on speculation that Bridgerton fans will buy random “romance novels” instead of works produced by a bestselling author they know and love.  This relies on breathtaking assumptions that lack evidence, including that all works in the same genre are good substitutes for each other—regardless of their quality, originality, or acclaim. Lastly, even if competition from other, unique works might reduce sales, it isn’t the type of market harm that weighs against fair use.

Nor is lost revenue from licenses for fair uses a type of market harm that the law should recognize. Prioritizing private licensing market “solutions” over user rights would dramatically expand the market power of major media companies and chill the creativity and innovation that copyright is intended to promote. Indeed, the fair use doctrine exists in part to create breathing room for technological innovation, from the phonograph record to the videocassette recorder to the internet itself. Without fair use, crushing copyright liability could stunt the development of AI technology.

We’re still digesting this report, but our initial review suggests that, on balance, the Copyright Office’s approach to fair use for GenAI training isn’t a dispassionate report on how existing copyright law applies to this new and revolutionary technology. It’s a policy judgment about the value of GenAI technology for future creativity, by an office that has no business making new, free-floating policy decisions.

The courts should not follow the Copyright Office’s speculations about GenAI. They should follow precedent.

Reposted from the EFF’s Deeplinks blog.

Posted on BestNetTech - 17 April 2024 @ 12:08pm

The Motion Picture Association Doesn’t Get To Decide Who The First Amendment Protects

Twelve years ago, internet users spoke up with one voice to reject a law that would build censorship into the internet at a fundamental level. This week, the Motion Picture Association (MPA), a group that represents six giant movie and TV studios, announced that it hoped we’d all forgotten how dangerous this idea was. The MPA is wrong. We remember, and the internet remembers.

What the MPA wants is the power to block entire websites, everywhere in the U.S., using the same tools as repressive regimes like China and Russia. To the MPA, instances of possible copyright infringement should be played like a trump card to shut off our access to entire websites, regardless of the other legal speech hosted there. It is not simply calling for the ability to take down instances of infringement, a power it already has without even having to ask a judge, but for the keys to the internet. Building new architectures of censorship would hurt everyone and would not help artists.

The bills known as SOPA/PIPA would have created a new, rapid path for copyright holders like the major studios to use court orders against sites they accuse of infringing copyright. Internet service providers (ISPs) receiving one of those orders would have to block all of their customers from accessing the identified websites. The orders would also apply to domain name registries and registrars, and potentially other companies and organizations that make up the internet’s basic infrastructure. To comply, all of those would have to build new infrastructure dedicated to site-blocking, inviting over-blocking and all kinds of abuse that would censor lawful and important speech.

In other words, the right to choose what websites you visit would be taken away from you and given to giant media companies and ISPs. And the very shape of the internet would have to be changed to allow it.

In 2012, it seemed like SOPA/PIPA, backed by major corporations used to getting what they want from Congress, was on the fast track to becoming law. But a grassroots movement of diverse internet communities came together to fight it. Digital rights groups like EFF, Public Knowledge, and many more joined with editor communities from sites like Reddit and Wikipedia to speak up. Newly formed grassroots groups like Demand Progress and Fight for the Future added their voices to those calling out the dangers of this new form of censorship. In the final days of the campaign, giant tech companies like Google and Facebook (now Meta) joined in opposition as well.

What resulted was one of the biggest protests ever seen against a piece of legislation. Congress was flooded with calls and emails from ordinary people concerned about this steamroller of censorship. Members of Congress raced one another to withdraw their support for the bills. The bills died, and so did site blocking legislation in the US. It was, all told, a success story for the public interest.

Even the MPA, one of the biggest forces behind SOPA/PIPA, claimed to have moved on. But we never believed it, and they proved us right time and time again. The MPA backed site-blocking laws in other countries. Rightsholders continued to ask US courts for site-blocking orders, often winning them without a new law. Even the lobbying of Congress for a new law never really went away. It’s just that today, with MPA president Charles Rivkin openly calling on Congress “to enact judicial site-blocking legislation here in the United States,” the MPA is taking its mask off.

Things have changed since 2012. Tech platforms that were once seen as innovators have become behemoths, part of the establishment rather than underdogs. The Silicon Valley-based video streamer Netflix illustrated this when it joined MPA in 2019. And the entertainment companies have also tried to pivot into being tech companies. Somehow, they are adopting each other’s worst aspects.

But it’s important not to let those changes hide the fact that those hurt by this proposal are not Big Tech but regular internet users. Internet platforms big and small are still where ordinary users and creators find their voice, connect with audiences, and participate in politics and culture, mostly in legal—and legally protected—ways. Filmmakers who can’t get a distribution deal from a giant movie house still reach audiences on YouTube. Culture critics still reach audiences through zines and newsletters. The typical users of these platforms don’t have the giant megaphones of major studios, record labels, or publishers. Site-blocking legislation, whether called SOPA/PIPA, “no fault injunctions,” or by any other name, still threatens the free expression of all of these citizens and creators.

No matter what the MPA wants to claim, this does not help artists. Artists want their work seen, not locked away for a tax write-off. They wanted a fair deal, not nearly five months of strikes. They want studios to make more small and midsize films and to take a chance on new voices. They have been incredibly clear about what they want, and this is not it.

Even if Rivkin’s claim of an “unflinching commitment to the First Amendment” were credible coming from a group that seems to think it has a monopoly on free expression (and which just tried to consign the future of its own artists to the gig economy), a site-blocking law would not be used only by Hollywood studios. Anyone with a copyright and the means to hire a lawyer could wield the hammer of site-blocking. And here’s the thing: we already know that copyright claims are used as tools of censorship.

The notice-and-takedown system created by the Digital Millennium Copyright Act, for example, is abused time and again by people who claim to be enforcing their copyrights, and also by folks who simply want to make speech they don’t like disappear from the internet. Even without a site-blocking law, major record labels and US Immigration and Customs Enforcement shut down a popular hip hop music blog and kept it off the internet for over a year without ever showing that it infringed copyright. And unscrupulous characters use accusations of infringement to extort money from website owners, or even force them into carrying spam links.

This censorious abuse, whether intentional or accidental, is far more damaging when it targets the internet’s infrastructure. Blocking entire websites or groups of websites is imprecise, inevitably bringing down lawful speech along with whatever was targeted. For example, suits by Microsoft intended to shut down malicious botnets caused thousands of legitimate users to lose access to the domain names they depended on. There is, in short, no effective safeguard on a new censorship power that would be the internet’s version of police seizing printing presses.

Even if this didn’t endanger free expression on its own, once new tools exist, they can be used for more than copyright. Just as malfunctioning copyright filters were adapted into the malfunctioning filters used for “adult content” on Tumblr, so too can site-blocking tools be repurposed. The major companies of a single industry should not get to dictate the future of free speech online.

Why the MPA is announcing this now is anyone’s guess. They might think no one cares anymore. They’re wrong. Internet users rejected site blocking in 2012 and they reject it today.

Republished from the EFF’s Deeplinks blog.