Tori Noble's BestNetTech Profile

Posted on BestNetTech - 22 August 2025 @ 11:58am

President Trump’s War On “Woke AI” Is A Civil Liberties Nightmare

The White House’s recently unveiled “AI Action Plan” wages war on so-called “woke AI”—including large language models (LLMs) that provide information inconsistent with the administration’s views on climate change, gender, and other issues. It also targets measures designed to mitigate the generation of racially and gender-biased content, and even hate speech. The reproduction of this bias is a pernicious problem that AI developers have struggled to solve for over a decade.

A new executive order called “Preventing Woke AI in the Federal Government,” released alongside the AI Action Plan, seeks to strong-arm AI companies into modifying their models to conform with the Trump Administration’s ideological agenda.

The executive order requires AI companies that receive federal contracts to prove that their LLMs are free from purported “ideological biases” like “diversity, equity, and inclusion.” This heavy-handed mandate will not make models more accurate or “trustworthy,” as the Trump Administration claims; it is a blatant attempt to censor the development of LLMs and restrict them as a tool of expression and information access. While the First Amendment permits the government to choose to purchase only services that reflect government viewpoints, the government may not use that power to influence what services and information are available to the public. Lucrative government contracts can push commercial companies to implement features (or biases) that they wouldn’t otherwise, and those changes often trickle down to users. That would affect the 60 percent of Americans who get information from LLMs, and it would force developers to roll back efforts to reduce biases—making the models much less accurate, and far more likely to cause harm, especially in the hands of the government.

Less Accuracy, More Bias and Discrimination

It’s no secret that AI models—including generative AI—tend to discriminate against racial and gender minorities. AI models use machine learning to identify and reproduce patterns in the data they are “trained” on. If the training data reflects biases against racial, ethnic, and gender minorities—which it often does—then the AI model will “learn” to discriminate against those groups. In other words, garbage in, garbage out. Models also often reflect the biases of the people who train, test, and evaluate them.
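
To make the “garbage in, garbage out” point concrete, here is a minimal, hypothetical sketch (not from the original post, and using made-up data): a toy classifier trained on historically biased approval records simply learns and reproduces that bias, scoring otherwise identical applicants differently based on group membership.

```python
# Minimal illustration of bias reproduction in machine learning.
# All data here is synthetic and the "approval" scenario is hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical features: a qualification score and a sensitive group label (0 or 1).
score = rng.normal(size=n)
group = rng.integers(0, 2, size=n)

# Biased historical labels: equally qualified members of group 1
# were approved far less often in the past.
p_approve = 1 / (1 + np.exp(-(score - 1.5 * group)))
label = rng.random(n) < p_approve

model = LogisticRegression().fit(np.column_stack([score, group]), label)

# At prediction time, the model penalizes group 1 even at identical scores,
# because the pattern it "learned" is the bias baked into its training data.
same_score = np.array([[0.0, 0], [0.0, 1]])
print(model.predict_proba(same_score)[:, 1])  # group 1 gets a lower approval probability
```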

This is true across different types of AI. For example, “predictive policing” tools trained on arrest data that reflects overpolicing of Black neighborhoods frequently recommend heightened levels of policing in those neighborhoods, often based on inaccurate predictions that crime will occur there. Generative AI models are also implicated. LLMs already recommend more criminal convictions, harsher sentences, and less prestigious jobs for people of color. Although people of color account for less than half of the U.S. prison population, 80 percent of Stable Diffusion’s AI-generated images of inmates have darker skin. Over 90 percent of AI-generated images of judges were men; in real life, 34 percent of judges are women.

These models aren’t just biased—they’re fundamentally incorrect. Race and gender aren’t objective criteria for deciding who gets hired or convicted of a crime. Those discriminatory decisions reflect trends in the training data that could be caused by bias or chance—not some “objective” reality. Setting fairness aside, biased models are just worse models: they make more mistakes, more often. Efforts to reduce bias-induced errors will ultimately make models more accurate, not less.

Biased LLMs Cause Serious Harm—Especially in the Hands of the Government

But inaccuracy is far from the only problem. When government agencies start using biased AI to make decisions, real people suffer. Government officials routinely make decisions that impact people’s personal freedom and access to financial resources, healthcare, housing, and more. The White House’s AI Action Plan calls for a massive increase in agencies’ use of LLMs and other AI—while all but requiring the use of biased models that automate systemic, historical injustice. Using AI simply to entrench the way things have always been done squanders the promise of this new technology.

We need strong safeguards to prevent government agencies from procuring biased, harmful AI tools. In a series of executive orders, as well as his AI Action Plan, the Trump Administration has rolled back the already-feeble Biden-era AI safeguards. This makes AI-enabled civil rights abuses far more likely, putting everyone’s rights at risk. 

And the Administration could easily exploit the new rules to pressure companies to make publicly available models worse, too. Healthcare companies, landlords, and other private actors increasingly use AI to make high-impact decisions about people, so more biased commercial models would also cause harm.

We have argued against using machine learning to make predictive policing decisions or other punitive judgments for just these reasons, and will continue to protect your right not to be subject to biased government determinations influenced by machine learning.

Originally published to the EFF’s Deeplinks blog.

Posted on BestNetTech - 21 May 2025 @ 12:46pm

The U.S. Copyright Office’s Draft Report On AI Training Errs On Fair Use

Within the next decade, generative AI could join computers and electricity as one of the most transformational technologies in history, with all of the promise and peril that implies. Governments’ responses to GenAI—including new legal precedents—need to thoughtfully address real-world harms without destroying the public benefits GenAI can offer. Unfortunately, the U.S. Copyright Office’s rushed draft report on AI training misses the mark.

The Report Bungles Fair Use

Released amidst a set of controversial job terminations, the Copyright Office’s report covers a wide range of issues with varying degrees of nuance. But on the core legal question—whether using copyrighted works to train GenAI is a fair use—it stumbles badly. The report misapplies long-settled fair use principles and ultimately puts a thumb on the scale in favor of copyright owners at the expense of creativity and innovation.

To work effectively, today’s GenAI systems need to be trained on very large collections of human-created works—probably millions of them. At this scale, locating copyright holders and getting their permission is daunting for even the biggest and wealthiest AI companies, and impossible for smaller competitors. If training makes fair use of copyrighted works, however, then no permission is needed.

Right now, courts are considering dozens of lawsuits that raise the question of fair use for GenAI training. Federal District Judge Vince Chhabria is poised to rule on this question, after hearing oral arguments in Kadrey v. Meta Platforms. The Third Circuit Court of Appeals is expected to consider a similar fair use issue in Thomson Reuters v. Ross Intelligence. Courts are well-equipped to resolve this pivotal issue by applying existing law to specific uses and AI technologies.

Courts Should Reject the Copyright Office’s Fair Use Analysis

The report’s fair use discussion contains some fundamental errors that place a thumb on the scale in favor of rightsholders. Though the report is non-binding, it could influence courts, including in cases like Kadrey, where plaintiffs have already filed a copy of the report and urged the court to defer to its analysis.   

Courts need not defer to the Copyright Office’s draft conclusions, however, unless those conclusions are persuasive. They are not.

The Office’s fair use analysis is not one the courts should follow. It repeatedly conflates the use of works for training models—a necessary step in the process of building a GenAI model—with the use of the model to create substantially similar works. It also misapplies basic fair use principles and embraces a novel theory of market harm that has never been endorsed by any court.

The first problem is the Copyright Office’s transformative use analysis. Highly transformative uses—those that serve a different purpose than that of the original work—are very likely to be fair. Courts routinely hold that using copyrighted works to build new software and technology—including search engines, video games, and mobile apps—is a highly transformative use because it serves a new and distinct purpose. Here, the original works were created for various purposes and using them to train large language models is surely very different.

The report attempts to sidestep that conclusion by repeatedly ignoring the actual use in question—training—and focusing instead on how the model may be ultimately used. If the model is ultimately used primarily to create a class of works that are similar to the original works on which it was trained, the Office argues, then the intermediate copying can’t be considered transformative. This fundamentally misunderstands transformative use, which should turn on whether a model itself is a new creation with its own distinct purpose, not whether any of its potential uses might affect demand for a work on which it was trained—a dubious standard that runs contrary to decades of precedent.

The Copyright Office’s transformative use analysis also suggests that the fair use analysis should consider whether works were obtained in “bad faith,” and whether developers respected the right “to control” the use of copyrighted works.  But the Supreme Court is skeptical that bad faith has any role to play in the fair use analysis and has made clear that fair use is not a privilege reserved for the well-behaved. And rightsholders don’t have the right to control fair uses—that’s kind of the point.

Finally, the Office adopts a novel and badly misguided theory of “market harm.” Traditionally, the fair use analysis requires courts to consider the effects of the use on the market for the work in question. The Copyright Office suggests instead that courts should consider the overall effect of using models to produce works broadly similar to existing ones. By this logic, if a model was trained on a Bridgerton novel—among millions of other works—and was later used by a third party to produce romance novels, that might harm series author Julia Quinn’s bottom line.

This market dilution theory has four fundamental problems. First, like the transformative use analysis, it conflates training with outputs. Second, it’s not supported by any relevant precedent. Third, it’s based entirely on speculation that Bridgerton fans will buy random “romance novels” instead of works produced by a bestselling author they know and love.  This relies on breathtaking assumptions that lack evidence, including that all works in the same genre are good substitutes for each other—regardless of their quality, originality, or acclaim. Lastly, even if competition from other, unique works might reduce sales, it isn’t the type of market harm that weighs against fair use.

Nor is lost revenue from licenses for fair uses a type of market harm that the law should recognize. Prioritizing private licensing market “solutions” over user rights would dramatically expand the market power of major media companies and chill the creativity and innovation that copyright is intended to promote. Indeed, the fair use doctrine exists in part to create breathing room for technological innovation, from the phonograph record to the videocassette recorder to the internet itself. Without fair use, crushing copyright liability could stunt the development of AI technology.

We’re still digesting this report, but our initial review suggests that, on balance, the Copyright Office’s approach to fair use for GenAI training isn’t a dispassionate analysis of how existing copyright law applies to this new and revolutionary technology. It’s a policy judgment about the value of GenAI technology for future creativity, made by an office that has no business making new, free-floating policy decisions.

The courts should not follow the Copyright Office’s speculations about GenAI. They should follow precedent.

Reposted from the EFF’s Deeplinks blog.

Posted on BestNetTech - 28 February 2025 @ 10:43am

AI And Copyright: Expanding Copyright Hurts Everyone—Here’s What to Do Instead

You shouldn’t need a permission slip to read a webpage—whether you do it with your own eyes, or use software to help. AI is a category of general-purpose tools with myriad beneficial uses. Requiring developers to license the materials needed to create this technology threatens the development of more innovative and inclusive AI models, as well as important uses of AI as a tool for expression and scientific research.  

Threats to Socially Valuable Research and Innovation 

Requiring researchers to license fair uses of AI training data could make socially valuable research based on machine learning (ML) and even text and data mining (TDM) prohibitively complicated and expensive, if not impossible. Researchers have relied on fair use to conduct TDM research for a decade, leading to important advancements in myriad fields. However, licensing the vast quantity of works that high-quality TDM research requires is frequently cost-prohibitive and practically infeasible.  

Fair use protects ML and TDM research for good reason. Without fair use, copyright would hinder important scientific advancements that benefit all of us. Empirical studies back this up: research using TDM methodologies is more common in countries that protect TDM research from copyright control; in countries that don’t, copyright restrictions stymie beneficial research. It’s easy to see why: it would be impossible to identify and negotiate with millions of different copyright owners to analyze, say, text from the internet.

The stakes are high, because ML is critical to helping us interpret the world around us. It’s being used by researchers to understand everything from space nebulae to the proteins in our bodies. When the task requires crunching a huge amount of data, such as the data generated by the world’s telescopes, ML helps rapidly sift through the information to identify features of potential interest to researchers. For example, scientists are using AlphaFold, a deep learning tool, to understand biological processes and develop drugs that target disease-causing malfunctions in those processes. The developers released an open-source version of AlphaFold, making it available to researchers around the world. Other developers have already iterated upon AlphaFold to build transformative new tools.  
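As a rough illustration of that “sift through huge data” workflow, here is a hypothetical sketch (synthetic data only, not tied to any real telescope survey or to AlphaFold): an off-the-shelf anomaly detector scans a large batch of measurements and surfaces the rare points worth a researcher’s attention.

```python
# Hypothetical, illustrative only: flag unusual observations in a large
# synthetic dataset so researchers can focus on "features of potential interest".
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Simulated "ordinary" observations: brightness and spectral index per source.
ordinary = rng.normal(loc=[1.0, 0.0], scale=[0.1, 0.2], size=(100_000, 2))

# A handful of simulated outliers that a researcher might want to examine.
unusual = rng.normal(loc=[3.0, 2.0], scale=[0.3, 0.3], size=(50, 2))
observations = np.vstack([ordinary, unusual])

# Train an isolation forest and flag roughly the rarest 0.1% of observations.
detector = IsolationForest(contamination=0.001, random_state=0).fit(observations)
flags = detector.predict(observations)  # -1 marks candidates for human follow-up

print("Flagged for follow-up:", int((flags == -1).sum()), "of", len(observations))
```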

Threats to Competition 

Requiring AI developers to get authorization from rightsholders before training models on copyrighted works would limit competition to companies that have their own trove of training data, or the means to strike a deal with such a company. This would result in all the usual harms of limited competition—higher costs, worse service, and heightened security risks—as well as reducing the variety of expression used to train such tools and the expression allowed to users seeking to express themselves with the aid of AI. As the Federal Trade Commission recently explained, if a handful of companies control AI training data, “they may be able to leverage their control to dampen or distort competition in generative AI markets” and “wield outsized influence over a significant swath of economic activity.” 

Legacy gatekeepers have already used copyright to stifle access to information and the creation of new tools for understanding it. Consider, for example, Thomson Reuters v. Ross Intelligence, widely considered to be the first lawsuit over AI training rights ever filed. Ross Intelligence sought to disrupt the legal research duopoly of Westlaw and LexisNexis by offering a new AI-based system. The startup attempted to license the right to train its model on Westlaw’s summaries of public domain judicial opinions and its method for organizing cases. Westlaw refused to grant the license and sued its tiny rival for copyright infringement. Ultimately, the lawsuit forced the startup out of business, eliminating a would-be competitor that might have helped increase access to the law.  

Similarly, shortly after Getty Images—a billion-dollar stock images company that owns hundreds of millions of images—filed a copyright lawsuit asking the court to order the “destruction” of Stable Diffusion over purported copyright violations in the training process, Getty introduced its own AI image generator trained on its own library of images.  

Requiring developers to license AI training materials benefits tech monopolists as well. For giant tech companies that can afford to pay, pricey licensing deals offer a way to lock in their dominant positions in the generative AI market by creating prohibitive barriers to entry. To develop a “foundation model” that can be used to build generative AI systems like ChatGPT and Stable Diffusion, developers need to “train” the model on billions or even trillions of works, often copied from the open internet without permission from copyright holders. There’s no feasible way to identify all of those rightsholders—let alone execute deals with each of them. Even if these deals were possible, licensing that much content at the prices developers are currently paying would be prohibitively expensive for most would-be competitors.  

We should not assume that the same companies who built this world can fix the problems they helped create; if we want AI models that don’t replicate existing social and political biases, we need to make it possible for new players to build them. 

Nor is pro-monopoly regulation through copyright likely to provide any meaningful economic support for vulnerable artists and creators. Notwithstanding the highly publicized demands of musicians, authors, actors, and other creative professionals, imposing a licensing requirement is unlikely to protect the jobs or incomes of the underpaid working artists that media and entertainment behemoths have exploited for decades. Because of the imbalance in bargaining power between creators and publishing gatekeepers, trying to help creators by giving them new rights under copyright law is, as EFF Special Advisor Cory Doctorow has written, like trying to help a bullied kid by giving them more lunch money for the bully to take.  

Entertainment companies’ historical practices bear out this concern. For example, in the late 2000s to mid-2010s, music publishers and recording companies struck multimillion-dollar direct licensing deals with music streaming companies and video sharing platforms. Google reportedly paid more than $400 million to a single music label, and Spotify gave the major record labels a combined 18 percent ownership interest in its now-$100 billion company. Yet music labels and publishers frequently fail to share these payments with artists, and artists rarely benefit from these equity arrangements. There is no reason to believe that the same companies will treat their artists more fairly once they control AI.

Threats to Free Expression 

Generative AI tools like text and image generators are powerful engines of expression. Creating content—particularly images and videos—is time intensive. It frequently requires tools and skills that many internet users lack. Generative AI significantly expedites content creation and reduces the need for artistic ability and expensive photographic or video technology. This facilitates the creation of art that simply would not have existed and allows people to express themselves in ways they couldn’t without AI.  

Some art forms historically practiced within the African American community—such as hip hop and collage—have a rich tradition of remixing to create new artworks that can be more than the sum of their parts. As professor and digital artist Nettrice Gaskins has explained, generative AI is a valuable tool for creating these kinds of art. Limiting the works that may be used to train AI would limit its utility as an artistic tool, and compound the harm that copyright law has already inflicted on historically Black art forms. 

Generative AI has the power to democratize speech and content creation, much as the internet did. Before the internet, a small number of large publishers controlled the channels of speech distribution, deciding which material reached audiences’ ears. The internet changed that by allowing anyone with a laptop and a Wi-Fi connection to reach billions of people around the world. Generative AI magnifies those benefits by letting ordinary internet users generate text in a matter of seconds and easily create graphics, images, animation, and videos that, just a few years ago, only the most sophisticated studios had the capability to produce. Legacy gatekeepers want to expand copyright so they can reverse this progress. Don’t let them: everyone deserves the right to use technology to express themselves, and AI is no exception.

Threats to Fair Use 

In all of these situations, fair use—the ability to use copyrighted material without permission or payment in certain circumstances—often provides the best counter to restrictions imposed by rightsholders. But, as we explained in the first post in this series, fair use is under attack by the copyright creep. Publishers’ recent attempts to impose a new licensing regime for AI training rights—despite lacking any recognized legal right to control AI training—threaten to undermine the public’s fair use rights.

By undermining fair use, the AI copyright creep makes all these other dangers more acute. Fair use is often what researchers and educators rely on to make their academic assessments and to gather data. Fair use allows competitors to build on existing work to offer better alternatives. And fair use lets anyone comment on, or criticize, copyrighted material.  

When gatekeepers make the argument against fair use and in favor of expansive copyright—in court, to lawmakers, and to the public—they are looking to cement their own power, and undermine ours.  

A Better Way Forward 

AI also poses real harms that demand real solutions.

Many creators and white-collar professionals increasingly believe that generative AI threatens their jobs. Many people also worry that it enables serious forms of abuse, such as AI-generated nonconsensual intimate imagery, including of children. Privacy concerns abound, as does consternation over misinformation and disinformation. And it’s already harming the environment.  

Expanding copyright will not mitigate these harms, and we shouldn’t forfeit free speech and innovation to chase snake oil “solutions” that won’t work.  

We need solutions that address the roots of these problems, like inadequate protections for labor rights and personal privacy. Targeted, issue-specific policies are far more likely to succeed in resolving the problems society faces. Take competition, for example. Proponents of copyright expansion argue that treating AI development like the fair use that it is would only enrich a handful of tech behemoths. But imposing onerous new copyright licensing requirements to train models would lock in the market advantages enjoyed by Big Tech and Big Media—the only companies that own large content libraries or can afford to license enough material to build a deep learning model—profiting entrenched incumbents at the public’s expense. What neither Big Tech nor Big Media will say is that stronger antitrust rules and enforcement would be a much better solution. 

What’s more, looking beyond copyright future-proofs the protections. Stronger environmental protections, comprehensive privacy laws, worker protections, and media literacy will create an ecosystem where we will have defenses against any new technology that might cause harm in those areas, not just generative AI. 

Expanding copyright, on the other hand, threatens socially beneficial uses of AI—for example, to conduct scientific research and generate new creative expression—without meaningfully addressing the harms.  

Originally posted to the EFF’s Deeplinks blog.