Tuesday, April 22, 2025

OpenAI used over a million hours of YouTube videos to train its AI model: Report

Shillong, April 7: Sam Altman-run OpenAI transcribed more than a million hours of YouTube videos to train its AI model called GPT-4, a report has claimed.

The New York Times reported that OpenAI knew this was legally questionable but “believed it to be fair use”.

“OpenAI president Greg Brockman was personally involved in collecting videos that were used,” according to the report.

An OpenAI spokesperson told The Verge that the company uses “numerous sources including publicly available data and partnerships for non-public data” to maintain its global research competitiveness.

Google, which owns YouTube, said it has “seen unconfirmed reports” of OpenAI’s activity.

“Both our robots.txt files and Terms of Service prohibit unauthorised scraping or downloading of YouTube content,” the tech giant maintained.

Last year, The Information was the first to report that OpenAI, which is now backed by Microsoft, trained its AI models on data scraped from Google-owned YouTube.

According to that report, OpenAI “secretly used data from the site (YouTube) to train some of its artificial intelligence models”.

YouTube is the single biggest and richest source of imagery, audio and text transcripts on the web. (IANS)
