Chengzu Li

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Imagine while Reasoning in Space: Multimodal Visualization-of-Thought with Chengzu Li

Published
March 10, 2025
Duration
42:11
Summary source
description
Last updated
Jul 5, 2026

Discusses multimodal reasoning.

Summary

Today, we're joined by Chengzu Li, PhD student at the University of Cambridge, to discuss his recent paper, “Imagine while Reasoning in Space: Multimodal Visualization-of-Thought.” We explore the motivations behind MVoT, its connection to prior work like TopViewRS, and its relation to cognitive science principles such as dual coding theory. We dig into the…

Show notes

Today, we're joined by Chengzu Li, PhD student at the University of Cambridge, to discuss his recent paper, “Imagine while Reasoning in Space: Multimodal Visualization-of-Thought.” We explore the motivations behind MVoT, its connection to prior work like TopViewRS, and its relation to cognitive science principles such as dual coding theory. We dig into the MVoT framework along with its various task environments: maze, mini-behavior, and frozen lake. We explore token discrepancy loss, a technique d

Themes

  • multimodal