
Chengzu Li
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Imagine while Reasoning in Space: Multimodal Visualization-of-Thought with Chengzu Li
- Published: March 10, 2025
- Duration: 42:11
- Summary source: description
- Last updated: Jul 5, 2026
Discusses multimodal reasoning.
Summary
Today, we're joined by Chengzu Li, a PhD student at the University of Cambridge, to discuss his recent paper, “Imagine while Reasoning in Space: Multimodal Visualization-of-Thought.” We explore the motivations behind MVoT, its connection to prior work like TopViewRS, and its relation to cognitive science principles such as dual coding theory. We dig into the…
Show notes
Today, we're joined by Chengzu Li, a PhD student at the University of Cambridge, to discuss his recent paper, “Imagine while Reasoning in Space: Multimodal Visualization-of-Thought.” We explore the motivations behind MVoT, its connection to prior work like TopViewRS, and its relation to cognitive science principles such as dual coding theory. We dig into the MVoT framework along with its various task environments: maze, mini-behavior, and frozen lake. We explore token discrepancy loss, a technique d
Themes
- multimodal