[State of Post-Training] From GPT-4.1 to 5.1: RLVR, Agent & Token Efficiency — Josh McGrath, OpenAI

Latent Space: The AI Engineer Podcast

Published: December 31, 2025
Duration: 27:33
Summary source: description
Last updated: May 31, 2026

Discusses openai, safety-alignment.

Summary

From pre-training data curation to shipping GPT-4o, o1, o3, and now GPT-5 thinking and the shopping model, Josh McGrath has lived through the full arc of OpenAI's post-training evolution—from the PPO vs DPO debates of 2023 to today's RLVR era, where the real innovation isn't optimization methods but data quality, signal trust, and token efficiency. We sat…

Themes

openai
safety-alignment
