Outcry Grows Over AI Companies and Who Controls Internet's Content

AI companies, including Microsoft-backed OpenAI and Google, have built generative-AI systems like ChatGPT by scraping vast amounts of information from the internet to train their algorithms. That data includes work from authors, artists, and internet publishers.

– Thousands of authors, including Margaret Atwood and James Patterson, signed an open letter demanding that top AI companies obtain permission and compensate writers for using their works to train AI models.

– Comedian Sarah Silverman and other authors filed lawsuits against OpenAI and Facebook-parent Meta Platforms, alleging that the companies' AI models were trained on illegal copies of the authors' copyrighted books found on the internet.

– News publishers are also concerned about the unlicensed use of their content for AI training, and some have been exploring ways to be compensated by tech companies for this use.

– The Associated Press and OpenAI announced a licensing deal, and Reddit has begun charging for some access to its content in response to concerns over AI companies scraping data.

– Elon Musk has blamed AI companies' scraping of "vast amounts of data" for Twitter's decision to limit the number of tweets some users can view.

– The tension highlights a broader rethinking of the value of writing and other online content, and of that content's relationship with the tech companies investing in AI technologies.

– AI models often use books as training data, but the companies haven't disclosed all the books used, and some authors suspect their copyrighted works were used without permission.

– OpenAI and Google say they use "publicly available" information, a category that in practice can include paywalled and pirated content, and they have expressed willingness to discuss compensation with content creators.

The legal challenges could lead to new limits or increased costs for accessing data, impacting the business equation for AI tools.

– Courts may require licensing or retroactive payments for copyrighted materials used to train AI models.

– The Authors Guild has been in discussions with tech CEOs about possible payment for training already done and about licensing deals for authors, but the issue requires cooperation from all AI firms.

– There are also concerns that AI systems could replace workers in professions such as screenwriting, journalism, and novel writing, potentially leading to further debate over job displacement and new job creation.

– The future of AI technologies may depend on continued access to fresh data, which could affect the market for human-created content.

– AI companies have pointed to the legal doctrine of fair use, arguing that free access to information is essential for AI systems that, they say, learn much as people do.

– Generative AI has faced other legal challenges in the past, including claims that AI tools reproduced licensed code without credit and that website scraping violated privacy rights and copyrights.

– The use of AI in this context raises complex questions about the ownership of data, copyright protection, and fair compensation for content creators.