Results for "Interpretability"

3 results

Episodes

The First Mechanistic Interpretability Frontier Lab — Myra Deng & Mark Bissell of Goodfire AI
Latent Space: The AI Engineer Podcast · Feb 6, 2026
From Palantir and Two Sigma to building Goodfire into the poster-child for actionable mechanistic interpretability, Mark Bissell (Member of Technical Staff) and Myra Deng (Head of Product) are trying to turn “peeking ins…
ai
Exploring the Biology of LLMs with Circuit Tracing with Emmanuel Ameisen
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence) · Emmanuel Ameisen · Apr 14, 2025
In this episode, Emmanuel Ameisen, a research engineer at Anthropic, returns to discuss two recent papers: "Circuit Tracing: Revealing Language Model Computational Graphs" and "On the Biology of a Large Language Model." …
llm, anthropic, neural-nets
#452 – Dario Amodei: Anthropic CEO on Claude, AGI & the Future of AI & Humanity
Lex Fridman Podcast · Nov 11, 2024
Dario Amodei is the CEO of Anthropic, the company that created Claude. Amanda Askell is an AI researcher working on Claude’s character and personality. Chris Olah is an AI researcher working on mechanistic interpretabili…
anthropic, society, culture