Search Engines, AI, And The Long Fight Over Fair Use
Long before generative AI, copyright holders warned that new technologies for reading and analyzing information would destroy creativity. Internet search engines, they argued, were infringement machines—tools that copied copyrighted works at scale without permission. As they had with earlier information technologies like the photocopier and the VCR, copyright owners sued.
Courts disagreed. They recognized that copying works in order to understand, index, and locate information is a classic fair use—and a necessary condition for a free and open internet.
Today, the same argument is being recycled against AI. The real question is whether copyright owners should be allowed to control how others analyze, reuse, and build on existing works.
Fair Use Protects Analysis—Even When It’s Automated
U.S. courts have long recognized that copying for purposes of analysis, indexing, and learning is a classic fair use. That principle didn’t originate with artificial intelligence. It doesn’t disappear just because the processes are performed by a machine.
Copying works in order to understand them, extract information from them, or make them searchable is transformative and lawful. That’s why search engines can index the web, libraries can make digital indexes, and researchers can analyze large collections of text and data without negotiating licenses from millions of rightsholders. These uses don’t substitute for the original works; they enable new forms of knowledge and expression.
Training AI models fits squarely within that tradition. An AI system learns by analyzing patterns across many works. The purpose of that copying is not to reproduce or replace the original texts, but to extract statistical relationships that allow the AI system to generate new outputs. That is the hallmark of a transformative use.
Attacking AI training on copyright grounds misunderstands what’s at stake. If copyright law is expanded to require permission for analyzing or learning from existing works, the damage won’t be limited to generative AI tools. It could threaten long-standing practices in machine learning and text-and-data mining that underpin research in science, medicine, and technology.
Researchers already rely on fair use to analyze massive datasets such as scientific literature. Requiring licenses for these uses would often be impractical or impossible, and it would advantage only the largest companies with the money to negotiate blanket deals. Fair use exists to prevent copyright from becoming a barrier to understanding the world. The law has protected learning before. It should continue to do so now, even when that learning is automated.
A Road Forward For AI Training And Fair Use
One court has already shown how these cases should be analyzed. In Bartz v. Anthropic, the court found that using copyrighted works to train an AI model is a highly transformative use. Training is a kind of studying how language works; it is not about reproducing or supplanting the original books. Any harm to the market for the original works was speculative.
The court in Bartz rejected the idea that an AI model might infringe because, in some abstract sense, its output competes with existing works. While EFF disagrees with other parts of the decision, the court’s ruling on AI training and fair use offers a good approach. Courts should focus on whether training is transformative and non-substitutive, not on fear-based speculation about how a new tool could affect someone’s market share.
AI Can Create Problems, But Expanding Copyright Is the Wrong Fix
Workers’ concerns about automation and displacement are real and should not be ignored. But copyright is the wrong tool to address them. Managing economic transitions and protecting workers during turbulent times may be core functions of government, but copyright law doesn’t help with that task in the slightest. Expanding copyright control over learning and analysis won’t stop new forms of worker automation—it never has. But it will distort copyright law and undermine free expression.
Broad licensing mandates may also do harm by entrenching the current biggest incumbent companies. Only the largest tech firms can afford to negotiate massive licensing deals covering millions of works. Smaller developers, research teams, nonprofits, and open-source projects will all get locked out. Copyright expansion won’t restrain Big Tech—it will give it a new advantage.
Fair Use Still Matters
Learning from prior work is foundational to free expression. Rightsholders cannot be allowed to control it. Courts have rejected that move before, and they should do so again.
Search, indexing, and analysis didn’t destroy creativity. Nor did the photocopier, nor the VCR. They expanded speech, access to knowledge, and participation in culture. Artificial intelligence raises hard new questions, but fair use remains the right starting point for thinking about training.
Republished from the EFF’s Deeplinks blog.
Re: Re: Cold Comfort, but…
I should add, this is a medical device, not software.
Re: Cold Comfort, but…
NYU did just file its first patent lawsuit (the first from what I can tell). https://dockets.justia.com/docket/delaware/dedce/1:2021cv00813/75653
Congrats and good luck on the next 20!
I started reading BestNetTech in 2007 when I had to start covering intellectual property for a legal newspaper, but really knew nothing about it. It was the best crash course I could have found, and such an important and intelligent counter-point to other points of view I was hearing. Congratulations to Mike and the whole team, and here's to another 20.
History of Zimmerman / Eon-Net
Zimmerman was actually first sanctioned for patent litigation behavior back in 2006. However, those sanctions were overturned by a different Federal Circuit panel.
The meaning of "independent invention"
Mike, thanks for the writeup, I appreciate your thoughts and the comments here from other viewpoints as well. To Lonnie Holder (and I honestly ask this as someone just hunting for the right words to describe patent disputes): Isn't any patent defendant who has not been accused of copying an "independent inventor"? We know that 1) they have (or had) a product of some kind on the market, and 2) they are not accused of copying it. Anyone who creates and markets a product of some kind that isn't exactly identical to another product is an inventor on some level, right? And since copying, at least, is held in low esteem by society, shouldn't their invention be considered independent until someone at least alleges otherwise?
camera phones in courtrooms
Two years ago I was a reporter covering crime in Seattle and I was one of the last ones without a camera phone. At that time they would let you into the courtroom in the county jail building with a phone, but not with a camera; if you had a camera in your phone, as most reporters did, you had to leave it up front. It was a big advantage to me not having a camera phone then.
Bad idea
I agree with Hulser that the issue isn't so much ethics as just a foolish idea from a business perspective.
If I were a reporter at the paper behind this stunt I would be upset! I'd feel like I'm being undermined by my own boss. The potential damage is to the newspaper's reputation and the trust of readers. If a newspaper is willing to create a fake ad and say "just kidding!" in fine print, you have to wonder if the next "experiment" will be: "What happens when we write a fake story?"
The problem isn't harm to consumers; the problem is the newspaper harming itself. Newspapers are in the fact-verification business; their marketing departments need to be cognizant of that.
I'm going to ask around, but at first glance I'm not sure this decision will prevent the type of mass-defendant lawsuit described in Mike's link, unless the original manufacturer has a license (as Intel did).
In many cases, including the 92-defendant case Mike linked to, the manufacturer does _not_ have a license, and is also an alleged infringer. It's just more lucrative to go after the retailer clients than the manufacturer of the device. Not sure that Quanta v. LG will stop that.
Amazon owes the country something
Targeted taxes suck; they're unfair and inefficient. But the problem is that chambers of commerce and other business lobbyists fight the FAIR taxes, too. Eventually something comes down the pipe that affects one company or industry more than another because it's politically effective (but still exceedingly difficult) to split the business lobby.
I'd like to see the businesses that oppose illogical taxes on business talk about what they ARE willing to pay to be socially responsible members of society. We don't hear too much of that.