
Why Video Agent models are next — Ethan He, xAI Grok Imagine
Latent Space: The AI Engineer Podcast
- Published: June 1, 2026
- Duration: 1h 43m
- Summary source: description
- Last updated: Jun 10, 2026
Discusses AI.
Summary
We’re announcing AIEWF speakers this week! Take the AI Engineering Survey! Today’s guest Ethan first joined us for the LS Paper Club as the lead on NVIDIA Cosmos World Model, but then joined xAI and built Grok Imagine in 3 months. He comes back on Latent Space with some nuclear hot takes: that Video Models primarily get their intelligence from LLMs, not fro…
Show notes
We’re announcing AIEWF speakers this week! Take the AI Engineering Survey! Today’s guest Ethan first joined us for the LS Paper Club as the lead on NVIDIA Cosmos World Model, but then joined xAI and built Grok Imagine in 3 months. He comes back on Latent Space with some nuclear hot takes: that Video Models primarily get their intelligence from LLMs, not from training on video data, and that the next frontier for truly interactive, real-time, long-horizon world models is to work on LLMs (perhaps Int
Themes
- AI