Maohao Shen

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Teaching LLMs to Self-Reflect with Reinforcement Learning with Maohao Shen

Published: April 8, 2025
Duration: 51:45
Last updated: Jun 7, 2026

Discusses LLMs.

Summary

Today, we're joined by Maohao Shen, PhD student at MIT, to discuss his paper, “Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search.” We dig into how Satori leverages reinforcement learning to improve language model reasoning, enabling model self-reflection, self-correction, and exploration of alternat…

Themes

llm
