Why Video Agent models are next — Ethan He, xAI Grok Imagine

Latent Space: The AI Engineer Podcast

Published: June 1, 2026
Duration: 1h 43m
Summary source: description
Last updated: June 10, 2026

Discusses AI.

Summary

We’re announcing AIEWF speakers this week! Take the AI Engineering Survey! Today’s guest Ethan first joined us for the LS Paper Club as the lead on NVIDIA Cosmos World Model, but then joined xAI and built Grok Imagine in 3 months. He comes back on Latent Space with some nuclear hot takes: that Video Models primarily get their intelligence from LLMs, not fro…

Show notes

We’re announcing AIEWF speakers this week! Take the AI Engineering Survey! Today’s guest Ethan first joined us for the LS Paper Club as the lead on NVIDIA Cosmos World Model, but then joined xAI and built Grok Imagine in 3 months. He comes back on Latent Space with some nuclear hot takes: that Video Models primarily get their intelligence from LLMs, not from training on video data, and that the next frontier for truly interactive, realtime, long-horizon world models is to work on LLMs (perhaps Int

Themes

  • AI