Latest news with #USCopyrightOffice


Arabian Post
a day ago
- Business
AI Copyright Quietly Redrawing Legal Lines
Twelve consolidated copyright suits filed by US authors and news outlets against OpenAI and Microsoft have landed in the Southern District of New York, elevating the question of whether training AI on copyrighted works stays within the bounds of lawful fair use. The judicial panel cited shared legal and technical claims involving unauthorised use of copyrighted material, notably books and newspapers, as justification for centralised proceedings.

The US Copyright Office added its authoritative voice in May, questioning whether AI training on copyrighted texts can be deemed fair use, particularly in commercial contexts. The Office clarified that while transformative use may be permissible in research, mass replication or competition with original works likely exceeds established boundaries. Its report highlighted that the crux lies in purpose, source, market impact and safeguards on outputs — variables that may render AI models liable under copyright law.

A pivotal case involving Thomson Reuters and Ross Intelligence offers early legal clarity: a court ruled that Ross improperly used Westlaw content, rejecting its fair use defence. The judgement centred on the need for AI systems to 'add something new' and avoid wholesale copying, reinforcing the rights of content owners. This ruling is being cited alongside the US Copyright Office's latest guidance as foundational in shaping how courts may assess generative AI.

Legal practitioners are now navigating uncharted terrain. Lawyers such as Brenda Sharton of Dechert and Andy Gass of Latham & Watkins are at the cutting edge in helping judges understand core AI mechanics — from training data ingestion to output generation — while balancing copyright protection and technological progress. Their work emphasises that this wave of litigation may not be resolvable in a single sweeping judgment, but will evolve incrementally.

At the heart of many discussions lies the core condition for copyright protection: human authorship. The US Copyright Office reaffirmed in a February report that merely issuing a prompt does not satisfy the originality requirement. It stated that current systems offer insufficient control for human authors to claim sole credit, and that copyright should be considered case by case, grounded in the minimum-creativity standard of Feist. Critics argue this stance lacks clarity, as no threshold for the requisite level of human input has been defined.

Jurisdictions elsewhere are taking divergent approaches. China's Beijing Internet Court recently ruled in Li v Liu that an AI-generated image was copyrightable because the plaintiff had provided substantial prompts and adjustments — around 30 prompts and over 120 negative prompts — demonstrating skill, judgment and aesthetic choice. In the United Kingdom, the Copyright, Designs and Patents Act 1988 attributes authorship of a computer-generated work to the person who undertakes the 'arrangements necessary' for its creation, hinting that both programmers and users may qualify as authors depending on context.

In contrast, India's legal framework remains unsettled. Courts have emphasised human creativity in ruling on computer-generated works, as seen in Rupendra Kashyap v Jiwan Publishing and Navigators Logistics Ltd v Kashif Qureshi. ANI, India's largest news agency, has brought a high-profile case against OpenAI, with hearings held on 19 November 2024 and 28 January 2025.
The Delhi High Court has appointed an amicus curiae to navigate this untested area of copyright, with Indian lawyers emphasising that the outcome could shape licensing practices and data-mining norms. India reserves copyright protection for creations exhibiting a 'minimal degree of creativity' under Supreme Court rulings such as Eastern Book Co v Modak. In February 2025, experts noted that determining whether AI training qualifies as fair dealing, and whether generative AI outputs amount to derivative works, will be pivotal. Currently, scraping content for AI training falls outside any clear exemption under Indian law, though the Delhi case could catalyse policy reform.

Amid these legal battles, signs point toward statutory intervention. In the US, the Generative AI Copyright Disclosure Act would require developers to notify the Copyright Office of copyrighted works used in training models at least 30 days before public release. While UK policymakers are consulting on a specialised code of practice, India lacks similar formal mechanisms.

The evolving legal framework confronts a fundamental philosophical and commercial dilemma: making space for generative AI's potential for innovation without undermining creators' rights. AI developers contend that mass text and data mining is what fuels advanced models, while authors and journalists argue such training must be controlled to safeguard original expression. Courts appear poised to strike a balance by scrutinising the nuances of human input, purpose and impact — not by enacting sweeping exclusions.


New York Post
29-05-2025
- Business
The dirty secret Big Tech doesn't want you to know: AI runs on theft
Artificial intelligence is one of the fastest-growing, most exciting industries in America. Many in the tech industry are confident that as AI continues to improve, it will become increasingly important in our everyday lives. But its growth has come at a cost — and if we're not careful, AI's expansion could end up crippling other critical sectors of the American economy.

Big Tech's dirty secret is that the success of its AI tools has been built almost entirely on theft. These companies are scraping enormous amounts of copyrighted content, without permission or compensation, to fuel their AI products — and in the process dangerously undermining content creators' businesses.

Instead of paying for access to copyrighted material — everything from magazine columns to President Trump's own book 'The Art of the Deal' — most AI companies have made the conscious choice to steal it. They argue that all content, even content registered for protection with the US Copyright Office, should be considered 'fair use' when used to build and operate AI models.

To gather the data that powers their large language models, Big Tech companies have consistently bypassed paywalls, ignored websites' directives asking users not to copy material, and worse. Meta, for instance, used the illegal Russia-based pirate site LibGen to copy the contents of at least 7.5 million books to train its Llama AI model — an egregiously unlawful, copyright-violating workaround.

Relying on questionable sources for AI training poses a variety of serious problems, some of them touching US national security. Recently, an online data watchdog found that many of the most popular AI chatbots have absorbed millions of articles designed to spread Russian propaganda and outright falsehoods. Now infected by a Russian disinformation network known as 'Pravda,' these chatbots, including Grok, ChatGPT and Gemini, mimic Kremlin talking points when asked about certain topics — and spread false narratives about 33% of the time.

Content creators, meanwhile, face existential problems. In addition to seeing their content stolen for training purposes, publishers are now forced to watch as Big Tech companies make billions using that stolen content in ways that directly compete with publishers' business models.

With retrieval-augmented generation, in which an AI model retrieves outside sources before responding to user inquiries, many AI products now give users real-time information pulled directly from recently published news articles. Those same AI companies run ads against that content — generating revenue that should belong to those who invested in its creation.

A user who gets all the information contained within an article directly through an AI chatbot has almost no reason to click through to the original text — much less to buy a subscription, if the item is behind a paywall. The data on this is clear: AI chatbots drive referral traffic at a 96% lower rate than traditional Google search, itself an already shrunken relic of the past. For every 1,000 people using an AI chatbot, fewer than four will click through to see the original source of the information they're reading.

As AI replaces traditional search for many users, this drop in referral traffic will cut deeply into both subscription and advertising revenue for publishers — depriving them of the funds they need to produce the content consumers (and AI companies) rely on.
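As a rough illustration of the retrieval-augmented generation pattern described above, the flow looks something like the following. This is a minimal, hypothetical sketch, not any vendor's actual pipeline; the `Article` type, `search_news` retriever and `llm_answer` call are all invented stand-ins.

```python
# Hypothetical sketch of retrieval-augmented generation (RAG):
# the model consults recently published articles before answering.
from dataclasses import dataclass

@dataclass
class Article:
    title: str
    url: str
    text: str

def search_news(query: str, index: list[Article], k: int = 3) -> list[Article]:
    """Toy retriever: rank articles by crude keyword overlap with the query."""
    terms = set(query.lower().split())
    scored = sorted(index, key=lambda a: -len(terms & set(a.text.lower().split())))
    return scored[:k]

def llm_answer(prompt: str) -> str:
    """Placeholder: a real system would call a hosted language model here."""
    return f"(model response conditioned on {prompt.count('[')} retrieved articles)"

def answer(query: str, index: list[Article]) -> str:
    """Assemble retrieved article text into the prompt sent to the model."""
    sources = search_news(query, index)
    context = "\n\n".join(f"[{a.title}]({a.url})\n{a.text}" for a in sources)
    prompt = f"Answer using only these sources:\n{context}\n\nQuestion: {query}"
    return llm_answer(prompt)

if __name__ == "__main__":
    index = [
        Article("Court ruling", "https://example.com/a",
                "court rules on fair use in AI training"),
        Article("Sports recap", "https://example.com/b",
                "local team wins the game"),
    ]
    print(answer("What did the court say about fair use?", index))
```

The shape of the pipeline is what drives the economics the piece describes: the publisher's text is consumed inside the prompt, while the reader sees only the synthesized answer, so clicking through to the source becomes optional.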
AI companies are lobbying to legitimize this behavior, but Washington should take care. Tilting the scales in Big Tech's favor will undermine centuries of intellectual-property protections that have paid tremendous dividends for the United States, giving us countless advancements — and a competitive edge on the world stage. Blessing the theft of American content would instantly erode our country's foundation as an innovation powerhouse.

The news media industry supports a balanced approach. Many publications and journalists, in fact, now use AI to better serve their customers. But it's important to develop AI products responsibly and in coordination with the creators of the content they use, with the long-term sustainability of both AI companies and creators in mind. If AI companies' theft drives creators out of business, everyone ends up worse off.

To protect their work, over a dozen members of the News/Media Alliance recently sued Cohere, Inc., a growing AI company, for unauthorized use of their content. They joined a number of other publishers, including News Corp and The New York Times, that are suing various AI companies to enforce their rights.

Some in Big Tech are clearly beginning to recognize the problem with unfettered content theft. Over the last year we've seen a rapid proliferation of licensing agreements, in which AI companies pay publishers to use their content. A News/Media Alliance collective is currently licensing content at scale.

But without reinforced legal protections against content theft, bad actors will continue to exploit publishers and creators — undermining America's creative industries to further tech's own commercial interests.

Danielle Coffey is president and CEO of the News/Media Alliance, which represents more than 2,200 publishers nationwide.
Yahoo
13-05-2025
- Business
The US Copyright Chief Was Fired After Raising Red Flags About AI Abuse
On Friday, the US Copyright Office released a draft of a report finding that AI companies broke the law while training AI. The next day, the agency's head, Shira Perlmutter, was fired — and the alarm bells are blaring.

The report's findings were pretty straightforward. Basically, it explained that using large language models (LLMs) trained on copyrighted data for tasks like "research and analysis" is probably fine, as "the outputs are unlikely to substitute for expressive works used in training." But that changes when copyrighted materials (books, for example) are used for commercial applications — particularly when those applications compete in the same market as the original works funneled into the models for training. Other examples: using an AI trained on copyrighted journalism to build a news-generation tool, or using copyrighted artworks to create art to sell. That type of use likely breaches fair use protections, according to the report, and "goes beyond established fair use boundaries."

The report's findings seem to strike a clear blow to frontier AI companies, which have generally taken the stance that everything ever published by anyone else should also be theirs. OpenAI is fighting multiple copyright lawsuits, including a high-profile case brought by The New York Times, and has lobbied the Trump Administration to redefine copyright law to benefit AI companies; Meta CEO Mark Zuckerberg has taken the stance that others' content isn't really worth enough for his company to bother compensating people for it; Twitter founder Jack Dorsey and Twitter-buyer-and-rebrander Elon Musk agreed recently that we should "delete all IP law." Musk is heavily invested in his own AI company, xAI.

Clearly, an official report saying otherwise, issued by the US government's own copyright agency, stands at odds with these companies and the interests of their leaders. And without a clear explanation for Perlmutter's firing in the interim, it's hard to imagine that issues around AI and copyright — a clear thorn in the side of much of Silicon Valley and, by extension, many of Washington's top funders — didn't play a role.

As The Register noted, after the report was published, legal experts were quick to catch how odd it was for the Copyright Office to release it as a pre-print draft. "A straight-ticket loss for the AI companies," Blake E. Reid, a tech law professor at the University of Colorado Boulder, said in a Bluesky post about the report's findings. "Also, the 'Pre-Publication' status is very strange and conspicuously timed relative to the firing of the Librarian of Congress," Reid added, referencing the sudden removal last week of now-former Librarian of Congress Carla Hayden, who was fired on loose allegations related to the Trump Administration's nonsensical war on "DEI" policies. "I continue to wonder (speculatively!)," Reid continued, "if a purge at the Copyright Office is incoming and they felt the need to rush this out." Reid's prediction was made before the removal of Perlmutter, who was named to her position in 2020.

To make matters even more bizarre, Wired reported that two men claiming to be officials from Musk's DOGE squad were blocked on Monday while attempting to enter the Copyright Office's building in DC. A source "identified the men as Brian Nieves, who claimed he was the new deputy librarian, and Paul Perkins, who said he was the new acting director of the Copyright Office, as well as acting Registrar," according to the report.
The White House has yet to say why Perlmutter was fired, or whether her firing had anything to do with Musk and DOGE. It wouldn't be the first time, though, that recent changes within the government have benefited Musk and his companies.


CNET
12-05-2025
- Business
Copyright Office Punts on AI and Fair Use, One of the Biggest Questions Surrounding Gen AI
If you've been hoping for clarity from the US Copyright Office on AI training -- whether AI companies can use copyrighted materials under fair use or creators can claim infringement -- prepare to be disappointed. The Office released its third report on Friday, and it's not the major win tech companies hoped for, nor the full block some creators sought.

The US Copyright Office set out in 2023 to release a series of reports offering guidance for creators on the myriad legal and ethical issues raised by AI-generated content from software such as ChatGPT, Gemini, Meta AI and Dall-E. In previous reports, the Copyright Office ruled that entirely AI-generated content can't be copyrighted, while AI-edited content could still be eligible. These reports aren't law, but they give us a picture of how the agency is handling copyright protections in the age of AI.

The third report, available now, isn't the final version; it's a "prepublication" draft. Still, there won't be any major changes to the Copyright Office's analysis and conclusions in the final report, according to its website, so it gives us a good sense of the guidance the agency will offer for future claims.

The 108-page report deals primarily with copyright concerns around the training of AI models -- specifically, whether AI companies have legal footing to claim a fair-use exception, which would let them use copyrighted content without licensing it or compensating the copyright holders.

In short, the Copyright Office didn't rule out the possibility of a fair-use case for companies using copyrighted material for AI training. But the report spells out in detail a couple of important factors that would count as strikes against a fair-use defense. So it's also possible that an AI company's use of copyrighted material without the author's permission could be grounds for a copyright infringement claim. It depends on the AI model, how it's used and what it produces.

"On one end of the spectrum, uses for purposes of noncommercial research or analysis that do not enable portions of the works to be reproduced in the outputs are likely to be fair," the report says. "On the other end, the copying of expressive works from pirate sources in order to generate unrestricted content that competes in the marketplace … is unlikely to qualify as fair use. Many uses, however, will fall somewhere in between."

The Copyright Office, which is part of the Library of Congress, is the subject of current political controversy. CBS News reports that the department's head, Shira Perlmutter, known as the Register of Copyrights, was fired by President Donald Trump this past weekend. This was a few days after Trump fired the Librarian of Congress, Carla Hayden, on Thursday. Hayden was the first woman and first African American to hold the position.

Here's what the Copyright Office wrote on fair use and what you need to know about why the legal web of AI and copyright continues to grow.

Why does fair use matter?

Tech companies have been pushing hard for a fair-use exception. If they're granted one, it won't matter that they hold and use copyrighted work in their training datasets. The question of potential copyright infringement is at the center of more than 30 lawsuits, including notable ones like The New York Times v. OpenAI and Ortiz v. Stability AI.
(Disclosure: Ziff Davis, CNET's parent company, in April filed a lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.)

The Copyright Office said in its report that these cases should continue through the judicial system: "It is for the courts to weigh the statutory factors together," since there's "no mechanical computation or easy formula" for deciding fair use.

Writers, actors and other creators have pushed back equally hard against fair use. In an open letter signed by more than 400 of Hollywood's biggest celebrities, the creators ask the administration's Office of Science and Technology Policy not to allow fair use. They wrote: "[America's] success stems directly from our fundamental respect for IP and copyright that rewards creative risk-taking by talented and hardworking Americans from every state and territory."

For now, it seems we have only a few more answers than before. The big questions around whether specific companies like OpenAI have violated copyright law will have to wait to be adjudicated in court.

Guidance on deciding fair use

Fair use is part of the 1976 Copyright Act. The provision grants people who are not the original authors the right to use copyrighted works in specific cases, such as education, reporting and parody. There are four main factors to consider in a fair-use case: one, the purpose of the use; two, the nature of the work; three, the amount and substantiality of the portion used; and four, the effect of the use on the market. The Copyright Office's report analyzes all four factors in the context of AI training.

One important aspect is transformativeness -- whether AI chatbots and image generators are creating outputs that are substantially different from the original training content. The report seems to indicate that AI chatbots used for deep research are sufficiently transformative, but image generators that produce outputs too similar in style or aesthetic to existing work might not be. The report says guardrails that prevent the replication of protected works -- like image generators refusing to create popular logos -- would be evidence that AI companies are trying to avoid infringement. That's despite the office citing research showing those guardrails aren't always effective, as OpenAI's Studio Ghibli image trend clearly demonstrated.

The report argues that AI-generated content clearly affects the market, the fourth factor. It mentions the possibility of lost sales, markets diluted through oversaturation, and lost licensing opportunities for existing data markets. However, it also notes the potential public benefit from the development of AI products.

Licensing, a popular alternative to suing among publishers and owners of content catalogs, is also highlighted as one possible path around copyright concerns. Many publishers, including the Financial Times and Axel Springer brands, have struck multimillion-dollar deals with AI companies, giving AI developers access to their high-quality, human-generated content. There are concerns that if licensing becomes the sole way to obtain this data, it will favor the big tech companies that can afford to pay, boxing out smaller developers. The Copyright Office writes that those concerns shouldn't affect fair-use analyses and are best dealt with by antitrust laws and the agencies that enforce them, like the Federal Trade Commission.
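To make the guardrails idea above concrete, here is a minimal, hypothetical sketch of the kind of output filter the report describes: a generation request is declined when it appears to target a protected work. The blocklist, the `generate_image` stub and the matching rule are invented for illustration; real products rely on trained classifiers and similarity checks rather than keyword lists.

```python
# Hypothetical sketch of an output guardrail: refuse generation requests
# that name protected marks or characters. Real systems use trained
# classifiers and image-similarity checks, not a simple blocklist.

PROTECTED_TERMS = {"acme corp logo", "famous mouse character"}  # invented examples

def violates_guardrail(prompt: str) -> bool:
    """Crude check: does the prompt mention a blocklisted protected work?"""
    lowered = prompt.lower()
    return any(term in lowered for term in PROTECTED_TERMS)

def generate_image(prompt: str) -> str:
    """Stub standing in for a real image-generation call."""
    if violates_guardrail(prompt):
        return "Request declined: the prompt appears to target a protected work."
    return f"<image generated for: {prompt}>"

print(generate_image("a watercolor landscape at dusk"))
print(generate_image("draw the Acme Corp logo on a T-shirt"))
```

The report's point is that the presence of such guardrails weighs in a developer's favor, even though, as the Studio Ghibli episode showed, simple filters of this sort are easy to evade.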