How to train your data

The Vergecast

Published: June 25, 2026
Duration: 26:41
Summary source: description
Last updated: Jul 5, 2026

Discusses OpenAI, Anthropic, and Google AI.

Summary

Training data is the raw material of the AI industry. Claude, ChatGPT, Gemini, and the rest are built on top of oceans of stuff. What is that stuff? Books. Blog posts. YouTube videos. Reddit comments. All of it and more, in virtually incomprehensible quantities. Alex Reisner, a staff writer at The Atlantic who has been investigating training data, explain…

Themes

openai
anthropic
google-ai

