OpenAI Might Have Used Millions Of YouTube Videos To Train Its AI Model: All Details

Apr 8, 2024 6:02 PM

OpenAI might have used more than a million hours of transcribed YouTube video to train GPT-4, its latest artificial intelligence (AI) model, a report claims. The report further states that the ChatGPT maker was forced to turn to YouTube for data because it had exhausted its supply of text for training its AI models. The allegation, if true, could create new problems for the AI firm, which is already fighting multiple lawsuits over its use of copyrighted data. Notably, a report last month highlighted that its GPT Store contained mini chatbots that violated the company's guidelines.

In a report, The New York Times claimed that after running out of sources of unique text to train its AI models, the company developed an automatic speech recognition tool called Whisper, used it to transcribe YouTube videos, and trained its models on the resulting data. OpenAI launched Whisper publicly in September 2022, and the AI firm said it was trained on 680,000 hours of “multilingual and multitask supervised data collected from the web”.

The report further alleges, citing unnamed sources familiar with the matter, that OpenAI employees discussed whether using YouTube's data could breach the platform's guidelines and land them in legal trouble. Notably, Google prohibits the use of YouTube videos for applications that are independent of the platform.

Read on Ndtv ›

