In June 2023, it was reported that OpenAI, the company behind ChatGPT, had used YouTube data to train some of its AI models. The Information, a technology news publication, cited an anonymous source for this information. OpenAI has not confirmed or denied the report, but if it is true, it would raise some concerns. YouTube's terms of service prohibit using its data for commercial purposes, and OpenAI is a for-profit company. Additionally, using YouTube data without permission could violate copyright law.
What effect this report will have on OpenAI or ChatGPT is unknown. It does, however, draw attention to the possible moral and legal issues associated with utilising massive datasets to train AI algorithms.
The report's other details are as follows:
The data OpenAI is accused of using was scraped from YouTube, meaning it was collected without the consent of either YouTube or the content creators.
The data was reportedly used to train some of OpenAI's most powerful AI models, such as GPT-3 and Dactyl.
Neither the amount of YouTube data OpenAI used nor the way it was used has been made public.
There have been a variety of responses to the report. Several observers have voiced concerns about the possible legal and moral repercussions of OpenAI's activity.
