OpenAI Might Have Used Millions Of YouTube Videos To Train Its AI Model: All Details

Apr 8, 2024 6:02 PM

OpenAI may have used more than a million hours of transcribed YouTube videos to train GPT-4, its latest artificial intelligence (AI) model, according to a report. The report further states that the ChatGPT maker turned to YouTube because it had exhausted its supply of text data for training its AI models. The allegation, if true, could create new problems for the AI firm, which is already fighting multiple lawsuits over its use of copyrighted data. Notably, a report last month highlighted that its GPT Store contained mini chatbots that violated the company's guidelines.

In a report, The New York Times claimed that after running out of unique text sources to train its AI models, the company developed an automatic speech recognition tool called Whisper to transcribe YouTube videos, and then trained its models on the resulting data. OpenAI launched Whisper publicly in September 2022, and the AI firm said it was trained on 680,000 hours of “multilingual and multitask supervised data collected from the web”.

Citing unnamed sources familiar with the matter, the report further alleges that OpenAI employees discussed whether using YouTube's data could breach the platform's guidelines and land them in legal trouble. Notably, Google prohibits the use of YouTube videos for applications that are independent of the platform.

Read on Ndtv ›
