Let’s Not Flip Sides On IP Maximalism Because Of AI
from the copyright-is-the-wrong-tool dept
Copyright policy is a sticky, tricky thing, and there are battles that have been fought for decades between public and corporate interests. Typically, it’s the corporate interests that win — especially the content industry. We’ve seen power, and copyrights, collect among a small group of content companies because of this. But there is one significant win that the public interest has been able to defend all these years: Fair Use.
Fair use’s importance has only grown over the years. Put simply, fair use allows people limited use of copyrighted material without permission. Fair use’s foundations are in commentary, criticism, and parody. However, fair use has arguably filled in important gaps to allow us to basically exist on social media. That’s because there are open questions about what is and isn’t copyright infringement, and things as simple as retweeting or linking could theoretically get us in trouble. Fair use also allows a lot of art to exist, because a lot of art critiques or comments on older art. On the flip side, when fair use was ruled not to cover music sampling, it basically killed a lot of creative sampling in hip hop music. Now popular sample-based music is relatively tame and tends to use the same library of samples.
Fair use (probably) also protects the creator industry. Many people make a living streaming video games or making content around playing video games. All of that could violate copyright laws. We don’t know the extent of the risk here, because it hasn’t been fully tested, but we do know that videogame makers have claimed videogame streaming content as copyrighted material. We also know that in Japan, which doesn’t have fair use, a streamer got two years in jail for posting Let’s Play videos. A lot of creators also make “react” content, which also relies on fair use protection.
Blowing up Fair Use
Considering the importance of fair use, and the historically bad behavior of the content industry towards ordinary people, it’s surprising that a lot of public interest advocates want to blow it up to hurt AI companies. This is unfortunate, but not particularly surprising. Content industry lobbying has inflated copyright protections into a pretty big sledgehammer, and when you really want to smash something you often look for a sledgehammer. For example, copyright and right of publicity (a somewhat related state-level IP regime) were the first tools people turned to to protect victims when revenge porn first became a big problem.
Similarly, some public interest advocates are turning to copyright to stop AI from being trained on content without permission. However, that use is almost certainly a fair use (if it’s copyright infringement at all), and that’s a good thing. The ability of people to use computers to analyze content without permission is extremely useful, and it would be bad to weaken or destroy fair use just to stop companies from doing that in a socially problematic way. The best way to stop bad things is with policy purposefully made to address the whole problem. And these uses of copyright law often play into the hands of powerful interests — the copyright industry would love the chance to turn the public interest advocacy community against itself in order to kill fair use.
I’m not saying that there aren’t issues with AI that need to be addressed, especially worker exploitation. AI art generators can be especially infuriating for artists: they use a lot while giving back little. In fact, these generators are arguably being built to replace artists rather than to provide artists with new tools. It can be attractive to throw anything in the way to slow it down. But copyright, especially copyright maximalism, has done a terrible job of preventing artist exploitation.
Porting “on a computer” to copyright
One of the biggest public interest fights in patent law has been against “on a computer” software patents, which clogged up the system and led to a number of patent infringement suits against small businesses over silly claimed inventions. The basic problem is this: patents were initially allowed on claims that amounted to doing something already known, but on a computer. These “on a computer” patents have been greatly restricted through Supreme Court rulings (which special interests would like to overturn). However, the bad effects of software patents still exist today, as do patent trolls seeking to exploit them.
This current fight over copyright in training data reminds me of the same problem. For example, if a writer wanted to study romance novels to find out what is popular, it would be perfectly acceptable under copyright policy for them to read and analyze a lot of popular romance novels and to use that analysis to take the most successful parts of those novels to create a new novel. It is also perfectly acceptable under copyright law for an artist to study a particular artist and replicate that artist’s style in their own works. But using an AI to do that analysis, doing it “on a computer,” is now suspect.
This is shortsighted for a number of reasons, but one I’d like to highlight is how this shrinking of fair use is difficult to contain. We are talking about an area in which the question of whether loading files into RAM is “copying” under copyright law (and therefore needs permission or is a violation) is an actual policy debate that public interest advocates have to fight. If using content as training data becomes a copyright violation, what’s the limiting principle? What kinds of computer analysis would no longer be protected under fair use?
I should also point out that IP maximalism is the easiest way to build oligopolies. Big companies will be able to figure out how to navigate the maze of rights necessary to build a model, and existing models will likely be grandfathered in (with a few lawsuits to get through). However, it will be impossible for any new company or new open source model to be created. Dealing with rights at scale is a problem so significant that even the rightsholder industry has trouble tracking them. And rightsholders have withheld information about rights to leverage better deals, exploiting the risk (and high cost) of accidentally infringing someone’s rights.
Matthew Lane is a public interest advocate in DC focusing on tech and IP policy. This post was originally published to his Substack.
Filed Under: ai, copyright, copyright maximalism, fair use


Comments on “Let’s Not Flip Sides On IP Maximalism Because Of AI”
Why should it be “fair use” for a corporation worth billions to co-opt the work of creators who never agreed to have their work used to train AI models and who haven’t and won’t be compensated for that use, as the law currently stands?
People like to compare training an AI model to a human reading and learning from the same work, but these simply aren’t the same thing. For one, AI models aren’t humans; they’re a product being created largely for the profit of corporations. They certainly don’t have the same inherent rights we assign to another human, and we’re a long way off from even considering that possibility.
More importantly, AI models don’t learn like humans. Just look at the recent work where researchers have been able to get AI models to disgorge entire sections of the works they’ve been trained on: the very same works we’re supposed to believe explicitly aren’t being copied in whole or in part as these models are trained from them.
Having some respect for creators in light of AI developments absolutely isn’t “blowing up” fair use. I think this is one area where BestNetTech’s take will not age well.
Re:
Fair use does not require compensation. Agreement is never part of the equation in fair use. The whole point of fair use is to be able to use copyrighted materials without the copyright holder’s permission.
No copyright on the art? Then the artist is out of luck!
Re:
For the same reason that students can learn from books etc. that they buy, borrow, or see in an art gallery or museum. It is the same reason that music genres exist: people can learn ways of doing things from other people’s works.
Those claiming it is unfair are asking for free money from computer analysis, which is not the same as copying their works for profit.
Further, people trying to make a living have to compete with everybody on the planet who publishes their works via the Internet. Indeed, the idea that creative works are rare and valuable comes from a pre-Internet world, where publishers selected a few works from the many submitted for publication. Those who won that lottery have an overinflated idea of how rare creativity is, and of how much their work should be worth.
Re: Re:
When I last looked, the books were not free, and coding books have explicit permissions defined for the information and the code included in them. So it’s not that I can go to a bookstore, get the book, use it the way I want and just put it back.
Also, some of the things we publish for free on the internet may be free as in beer, but not free as in speech. My blog posts are licensed CC-BY-NC. You can’t just parse and transform/derive them and strip the license away. You can only share them as-is, with the license attached, which none of these AI systems do.
Setting aside the shenanigans of parsing GPL code, removing the license, and offering its derivations to any party who pays for API access: tons of source-available commercial code repositories are also parsed and offered to unsuspecting people.
Will corporations say that, “oh this is our source-available licensed codebase parsed by an AI and offered to you, so this is fair use. You can use our copyrighted algorithms for free because it’s fair use”?
Haha. Of. Course. NOT.
Corporations just bend the available things in a way to profit themselves, while sucking the public dry. They cry “Fair Use!” because it’s profitable for them, not because it’s a cut and dry case of fair use.
Re: Re: Re:
Now that’s a crazy take on fair use if I’ve ever seen one.
You do realize that it jibes really well with Marxist thought, and there’s only one outcome left if you think like that…
Re: Re: Re:
Only in that you cannot just copy and paste their example code into a product. By reading those books you have learnt new things about coding that you can practice, and the authors cannot impose costs on you for practicing what you learn from their books.
Re:
Are you saying that a human who has read a book can’t reproduce sections from it with some trial and error and the correct prompting? How about a savant with perfect recall?
And I’m a bit interested in what you mean by “entire sections”? Section can refer to a couple of sentences, a paragraph, a page or even a chapter. So how much text could an LLM output with the right prompting that corresponded to a book?
This tells me you don’t have even an inkling of how LLMs actually work, because the statement above is proof that you think they “copy” the input.
Re: Re:
Not sure about a book, but LLMs have already been shown to reconstruct images in their training sets:
https://www.usenix.org/system/files/usenixsecurity23-carlini.pdf
https://arxiv.org/pdf/2212.03860.pdf
As far as I’m aware, there’s no real reason you couldn’t do something similar with text. It might be a bit harder, or there might be constraints on images that aren’t present for text, which could make it even easier to force with text.
There are things you can do to reduce this, but it is a known problem.
I mean, they kind of are, in a sense. It depends on what you mean by “copy”. The data set is basically used to build a giant parameter space. That parameter space encodes information from the original database (with some tricks to try to avoid falling into a local minimum by copying a piece exactly).
It’s not a literal copy as we typically think of it, but the data is encoded in the transformation. Which is why you can do things like pull out something from your training set. Because (a lot of) that information is encoded in the parameter space. It is in some sense a form of lossy copying.
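To make that concrete, here’s a toy sketch (my own illustrative example, nothing like how a real LLM is actually trained): when a model has at least as many parameters as training examples, the training targets can be recovered exactly from the learned parameters alone, which is memorization in its most extreme form.

```python
# Toy demonstration: training data recoverable from model parameters.
import numpy as np

# "Training set": five input/output pairs standing in for protected content.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 1.0, 4.0, 1.0, 5.0])

# Fit a degree-4 polynomial: five parameters for five data points, so the
# fit interpolates the training data exactly (total memorization).
coeffs = np.polyfit(x, y, deg=4)

# "Disgorge" the training outputs using only the parameters and the inputs.
reconstructed = np.polyval(coeffs, x)
print(np.allclose(reconstructed, y))  # the data lives in the weights
```

Real models have far more data than parameters per example and regularization that discourages this, which is why the copying is lossy rather than exact — but the mechanism of information flowing into the parameters is the same.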
Re: Re: Re:
I mean, I can recreate art I’ve seen with a pen and paper, but that doesn’t make learning how to do it copyright infringement. The infringement happens when I actually do it.
Re: Re: Re:
If it’s lossy copying then that defeats the point of the copying, doesn’t it?
If the lossy copy was being sold or treated as an equivalent of the original, then sure, there might be a point in claiming that it’s some kind of infringement, but it’s not.
Re: Re: Re:2
Not necessarily. If I want to watch (insert movie not yet available commercially here), I’m not going to complain much if a 1080i .mp4 is all I can get, especially if the price is free.
Re: Re: Re:3
And is a 1080i.mp4 format being made available to consumers for a price?
Re: Re: Re:4
Not in my region, as I’ve already said.
Re: Re: Re:5
That’s the point.
If a product isn’t available to consumers to begin with, it’s pretty ballsy of the rightsholders to complain that something that was never offered was stolen.
Re: Re: Re:
And how much time was spent on creating a prompt to get the desirable output?
It’s one thing to simply ask for Da Vinci’s Madonna and the AI outputs a good facsimile, it’s entirely different if you have to spend time to craft a prompt through trial and error that produces the facsimile.
Re: Re: Re:
Somewhat debatable, as the examples I have seen are images where there are thousands of similar images available, such as football images. Ask an AI to create an image of a player kicking a ball, or the moon on the horizon, or a sunset, and you should not be surprised if it looks very similar to one or more of the many existing images of those subjects.
Re: Re: Re:
It would be much more accurate to say that humans have used AI to reconstruct images. They have to provide inputs, in an iterative process, to get the result they want from an AI.
Re: Re: Re:
“Not sure about a book, but humans have already been shown to reconstruct images in their school and college art rooms.”
That’s part of education. Training is a form of education for both humans and machines. Your point?
Re:
So you want to nickel and dime human learning as well?
You’re worse than the Middle Ages, and they promoted free or at least cheap education, just not in the languages the rich would learn…
Re:
The same argument was made against remixes, and… suffice to say, that was not a take that aged well.
You can, in fact, have respect for both creators and fair use. Exceptions to copyright matter to creators more than they initially realize – because imagine what happens if a large corporation worth billions then comes around and sues individual content creators because the corporation thinks that the individual creators’ work might resemble something in their gigantic catalog. Even if the claim was baseless, without exceptions to copyright, how long do you think the creators will last?
Re: Re:
Please don’t give the RIAA, MPAA and Nintendo ideas.
Re: Re: Re:
Don’t need to. They’ve been spending the last couple decades doing the exact same thing.
Which is why content creators need to realize what kind of power they’ll be handing to large corporations if they keep cheering on copyright as their savior against AI.
Re: Re:
Even if the claim was baseless, without exceptions to copyright, how long do you think the creators will last?
Here, that’s not even the biggest problem. The problem is that authors and artists are expecting to solve the problem by involving copyright, on things that aren’t covered by copyright to begin with.
An author or artist has no copyright claim over content that they had no personal involvement or decision in creating. Another person mimicking their art or writing style has not committed copyright infringement because styles are not protected by copyright.
Asking for copyright to be involved has the same energy as game devs DMCAing reviews or Milorad Truklja/Thomas Goolnik suing a website mentioning their name.
Re: youre right and youre wrong
“For one, AI models aren’t humans,”
True, but AI are tools that a human is using.
If that human had the right to study a copyrighted work and learn from it then he also has the right to use an AI to study that work and learn from it. The AI has no rights and cannot break the law. It is the human using the AI that the law considers.
Re:
If I obtain a copy of an ebook that I believe to be legal (whether purchased or Public Domain), what difference does it make whether I read it myself or use my ereader’s TTS to read it to me? Similarly, if a digital copy of a work is obtained to train AI by people who have no reason to think it’s not legit, what difference does it make whether it’s typed into the computer or directly read by it? In fact, the second position has more legality than the first because no copying actually occurred.
Re: Re:
Something I said a while back was that those who complain about AI are using the same reasoning as people who file a patent for a well-known process, but on a computer.
A computer that creates AI training sets isn’t using some magical process; a person could also create a training set manually with just pen and paper plus a bunch of books (they may not complete it within their lifetime, though), and everything a person is allowed to do to acquire “reading material” applies equally to supplying a training set with data. So many of the complaints about AI boil down to something as simple as “it’s automated,” and this disconnect in the detractors’ thinking will ultimately make lawsuits hinging on that factor alone fail.
This comment has been flagged by the community. Click here to show it.
More anti-capitalist bullshit from BestNetTech. Surprise, Surprise (not).
Re:
Why is it good for the workers to be exploited?
Re:
You’re in favour of worker exploitation, then?
This comment has been flagged by the community. Click here to show it.
Re: Re:
Absolutely, especially when I’m a shareholder and the employees are soon to be replaced with AI or automation anyway.
Re: Re: Re:
Ironically, it’s the CEOs and managers whose work is more apt to be replaced with artificial intelligence. They’re just arbitrary decision makers taking credit for the work of people lower in the hierarchy. AI can do that easily.
Re:
Idk, worker exploitation is bad but most of the so-called “exploitation” I’ve seen complained of is like, not paying extortion money for scraping or putting clip artists out of a job. ie, not real problems.
The other thing about AI, especially when it comes to copyright is that none of the arguments are all that new.
In fact, most of them were done in some variation about 150 years ago when photography came along.
Re:
The difference is that case law has become much more complicated in the ensuing 150 years, and the Copyright Act has been updated multiple times.
As a result, the arguments are all old, but their application to the current copyright landscape is novel enough that people who weren’t alive last time around have a hard time seeing the potential simplicity, and instead want to torture the existing laws some more.
Re: Re:
The most significant change is automatic copyright. It used to be that if you wanted a copyright in your drawing or photograph, you’d have to write “copyright” on it at the very least; maybe even register and pay a fee. Few people did, so most stuff was not copyrighted. In the USA, that didn’t fully change till 1989.
Re:
And it removed the employment of most portrait and landscape artists. That did not stop people from painting portraits and landscapes as a hobby, with a few managing to make a living by being well above average in ability.
Re: Re:
People are entitled to make money doing the same thing they did before a massive technological change, apparently, to hear certain people talk about it. Great example of the unfortunate conservatism that can accompany labor advocacy.
Re: Re: Re:
The buggy whip manufacturing industry should be granted government subsidies until sufficient market recovery occurs through forced patronage!
Well it’s not like anything changed really. The rich and powerful are going to do what they want and the law won’t matter.
Big companies to everyone else. Try and sue us you fucking poor losers to prove we did anything wrong!
Re:
The rich and powerful are not a monolith. They have a shared class interest but the goals of Disney et al are at odds in many ways with the goals of Google et al. The copyright cartels would love to expand restrictions and hamstring tech companies whose business models rely on fair use and safe harbor. In the end, regulations will always benefit the biggest players, and whichever AI company can claw its way to the top will do what it wants, but I don’t think we’re there yet.
Copyright reform
There would have been much less “infringing” if copyright monopoly wasn’t so damn long. Just saying.
“It is also perfectly acceptable under copyright law for an artist to study a particular artist and replicate that artists style in their own works. But using an AI to do that analysis, doing it “on a computer,” is now suspect.”
So does that mean I’m in trouble because I copied the style of a research paper when writing my own on a tablet?