Here’s Proof You Can Train an AI Model Without Slurping Copyrighted Content

In 2023, OpenAI told the UK parliament that it was “impossible” to train leading AI models without using copyrighted materials. It’s a popular stance in the AI world, where OpenAI and other leading players have used materials slurped up online to train the models powering chatbots and image generators, triggering a wave of lawsuits alleging copyright infringement.

Two announcements Wednesday offer evidence that large language models can in fact be trained without the permissionless use of copyrighted materials.

A group of researchers backed by the French government have released what is thought to be the largest AI training dataset composed entirely of text that is in the public domain. And

→ Continue reading at WIRED

Similar Articles

Advertisment

Most Popular