Charles Martin

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Grokking, Generalization Collapse, and the Dynamics of Training Deep Neural Networks with Charles Martin

Published
June 5, 2025
Duration
1h 25m
Last updated
Jul 5, 2026

Show notes

Today, we're joined by Charles Martin, founder of Calculation Consulting, to discuss WeightWatcher, an open-source tool for analyzing and improving deep neural networks (DNNs) based on principles from theoretical physics. We explore the foundations of the Heavy-Tailed Self-Regularization (HTSR) theory that underpins it, which combines random matrix theory and renormalization group ideas to uncover deep insights about model training dynamics. Charles walks us through WeightWatcher’s ability to d