Episodes
-
35: The Theorem Machine
Recent advances in foundational models have yielded reasoning systems capable of achieving a gold-medal standard at the International Mathematical Olympiad. We introduce Aletheia, a math research agent that iteratively generates, verifies, and revises solution...
-
34: Spinning to Zero
TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate closes a gap that has been open since Claude Shannon defined the theoretical floor for lossy compression in 1948. For nearly eighty years, practical vector quantization methods fell expon...
-
32: The Green Gambit
Nvidia committed $26 billion over five years to building open-weight AI models. This episode examines the strategy: open weights as hardware lock-in, the Nemotron Coalition, NemoClaw agent runtime, the Vera Rubin and Feynman hardware roadmaps, and what it mean...
-
30: The Megatron Problem
Every competitive frontier model going forward is sparse — a Mixture-of-Experts architecture where each token activates only a fraction of the total parameters. That decoupling of parameter count from per-token compute sounds like a free lunch. The engineering...
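The decoupling the blurb describes can be seen in a minimal top-k routing sketch. This is a toy illustration with made-up dimensions, not any production MoE layer: total parameters grow with the number of experts, while each token only touches the `top_k` experts its router selects.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 8, 4, 2

# One tiny linear "expert" per slot. Total parameter count scales with
# n_experts, but each token is multiplied through only top_k of them.
experts = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts)) * 0.1

def moe_forward(x):
    logits = x @ router                    # routing scores, one per expert
    top = np.argsort(logits)[-top_k:]      # indices of the top_k experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over the selected experts only
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(d_model)
out = moe_forward(token)
print(out.shape)  # (8,)
```

Doubling `n_experts` here doubles the parameter count but leaves per-token FLOPs unchanged, which is the "free lunch" the episode then complicates.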
-
29: In Lockstep
Every LLM-based text-to-speech system shipping today carries a structural flaw: text tokens and audio frames move at incompatible speeds inside the same model, forcing engineers to trade off reliability, quality, and inference cost. Hume AI's TADA: A Gene...

-
27: The Bitter Lesson
Rich Sutton published a 1200-word essay in 2019 arguing that 70 years of AI research proved one thing: general methods leveraging computation always beat human-curated knowledge in the long run. Most researchers disagreed. Then the last five years happened. No...
-
25: The Window
The economics of vulnerability discovery just broke. In twenty minutes, Claude Opus 4.6 found a novel use-after-free memory bug in Firefox — one of the most audited codebases on the internet, backed by millions of CPU hours of continuous fuzzing. That single r...
-
24: From Shadows to Worlds
Language models can quote a bicycle's manual and still miss a broken chain. Beyond Language Modeling: An Exploration of Multimodal Pretraining argues that this is structural, not incidental: text is a lossy compression of reality, and models trained only ...
-
23: Saguaro: The Algorithm That Doesn't Wait
Speculative decoding already beats autoregressive generation — but it still has a sequential bottleneck: verification must finish before drafting restarts. Saguaro (Speculative Speculative Decoding) breaks that dependency by pre-speculating for likely verifica...
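The sequential bottleneck the blurb names is easiest to see in the vanilla loop. This is a toy sketch with stand-in draft and verify functions (not Saguaro, and not any real model): note that the next draft cannot begin until the previous verification returns.

```python
import random

random.seed(0)

def draft(prefix, k=4):
    # Stand-in for a small drafter model: propose k candidate tokens.
    return [random.randint(0, 9) for _ in range(k)]

def verify(prefix, candidates):
    # Stand-in for the large model's verification: accept each candidate
    # with 80% probability; on the first rejection, substitute a corrected
    # token and stop (mimicking rejection sampling's accepted prefix).
    accepted = []
    for tok in candidates:
        if random.random() < 0.8:
            accepted.append(tok)
        else:
            accepted.append(random.randint(0, 9))  # corrected token
            break
    return accepted

seq = []
while len(seq) < 16:
    candidates = draft(seq)         # drafting...
    seq += verify(seq, candidates)  # ...blocks until verification finishes
print(len(seq))
```

Saguaro's claim, per the blurb, is that drafting for likely verification outcomes can start before `verify` returns, removing the strict draft-then-verify alternation sketched here.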
-
22: Qwen's Best Day Was Its Last
On the night Alibaba shipped Qwen3.5 — a 397-billion-parameter sparse mixture-of-experts model with 17B active parameters, a 1M-token context window, and a small-model family the open-source community had been waiting for — they fired the person who built it. ...
-
21: dLLM: Diffusion Gets a Framework
Every major language model in production today — GPT, Claude, Gemini, Llama — generates text the same way: left to right, one token at a time. That sequential assumption has been so productive for so long that most researchers treat it as fixed. A team at UC B...
-
20: DualPath: Breaking the Storage Wall
As AI agents run for hundreds of turns with ninety-five percent KV-cache hit rates, the bottleneck shifts from compute to storage I/O. DualPath from Peking University, Tsinghua, and DeepSeek exploits idle decode-engine storage NICs to load KV-cache via RDMA, a...
-
19: Agents of Chaos
Someone finally ran a proper pentest on autonomous AI agents. Natalie Shapira, David Bau, and thirty researchers deployed LLM agents with persistent memory, email, Discord, and shell access, then spent two weeks red-teaming them. Eleven failure modes, every one...
-
18: The $20K Arms That Changed Robotics
The most important robotics breakthrough of the last three years was not a new algorithm or a bigger model. It was making the hardware cheap enough to collect enough data. We trace the ALOHA lineage from a twenty thousand dollar bimanual teleoperation rig in a...
-
17: The Math That Proves You're Human
World ID's proof-of-personhood system went from a centralized iris database to a quantum-secure, open-source cryptographic protocol where no single entity holds biometric data. We walk through the Daugman iris code, Shamir Secret Sharing, Secure Multi-Party Co...
-
16: H-Neurons: The Neurons That Make AI Lie
A team at Tsinghua University claims to have identified the specific neurons that predict when a large language model is about to hallucinate. Fewer than 0.1% of MLP neurons, identified via sparse logistic regression, generalize across domains and even detect f...
-
15: The Age Reversal Trial: Sinclair, Hype, and the Eye of the Storm
The FDA has cleared the first-ever human trial of a therapy designed to partially reverse cellular aging. Life Biosciences' ER-100, an epigenetic reprogramming treatment using a subset of Yamanaka factors delivered via AAV vector, will be injected into the eye...
-
14: Writing Data in Glass — Microsoft Project Silica and the 10,000-Year Storage Problem
Microsoft Research published a complete system for writing data into borosilicate glass using femtosecond lasers. A palm-sized square holds nearly 5TB and survives for over 10,000 years. This episode traces the 30-year journey from Eric Mazur to Project Silica...
-
13: Fast KV Compaction via Attention Matching
MIT researchers propose compressing LLM context in latent space rather than token space. Using closed-form linear algebra instead of gradient descent, Attention Matching achieves 50x KV cache compression in seconds — dramatically outperforming summarization on...
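"Closed-form linear algebra instead of gradient descent" can be made concrete with a generic low-rank sketch. This is not the paper's Attention Matching objective, just an illustrative stand-in using an SVD over a toy KV cache: the sequence axis is compressed in one closed-form step, and a query then attends to the compressed cache.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_head, rank = 512, 64, 16   # toy sizes; rank sets the compression

K = rng.standard_normal((seq_len, d_head))
V = rng.standard_normal((seq_len, d_head))

# One closed-form step: take the top-`rank` left singular directions of the
# stacked cache [K | V], replacing 512 cached rows with 16 compressed ones
# (32x fewer entries along the sequence axis, no gradient descent).
U, S, Vt = np.linalg.svd(np.concatenate([K, V], axis=1), full_matrices=False)
B = U[:, :rank]                 # (seq_len, rank) basis for the cache subspace
K_c, V_c = B.T @ K, B.T @ V     # compressed caches, (rank, d_head) each

# A query attends to the compressed cache exactly as it would the full one.
q = rng.standard_normal(d_head)
scores = K_c @ q
attn = np.exp(scores - scores.max())
attn /= attn.sum()
out = attn @ V_c
print(out.shape)  # (64,)
```

The paper's contribution, per the blurb, is choosing the compression to match the original attention outputs rather than an arbitrary subspace as here, but the computational shape (a single linear-algebra solve over the cache) is the same.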
-
12: Kolmogorov Complexity — Sunday Greatest Hits
The only full textbook on Ilya Sutskever's famous reading list. Why did a deep learning pioneer tell John Carmack to study algorithmic randomness? Because compression is intelligence — and this book is the mathematical foundation for that claim. We cover Kolmo...
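The "compression is intelligence" claim has a standard computable proxy from the Kolmogorov-complexity literature: Normalized Compression Distance, with a real compressor standing in for the uncomputable K(x). A minimal sketch using zlib:

```python
import zlib

def C(s: bytes) -> int:
    # Compressed length as a computable stand-in for Kolmogorov complexity.
    return len(zlib.compress(s, 9))

def ncd(a: bytes, b: bytes) -> float:
    # Normalized Compression Distance: small when a and b share structure,
    # because compressing them together costs little more than one alone.
    return (C(a + b) - min(C(a), C(b))) / max(C(a), C(b))

x = b"the quick brown fox jumps over the lazy dog " * 20
y = b"the quick brown fox leaps over the lazy cat " * 20
z = bytes(range(256)) * 4   # structurally unrelated data

print(ncd(x, y) < ncd(x, z))  # related strings compress better together
```

The distance needs no features or training, only a compressor, which is the practical face of the book's thesis that compression length measures shared structure.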
-
11: DreamZero — World Action Models are Zero-shot Policies
NVIDIA introduces DreamZero, a 14-billion-parameter World Action Model that jointly predicts future video and robot actions from a video diffusion backbone. Unlike Vision-Language-Action models that fail on physically novel tasks, DreamZero achieves over 2x im...
-
10: DeepMind Dispatch #1: From Autonomous Mathematicians to AI Musicians
Our first DeepMind Dispatch covers three papers: Aletheia — a system that generates and verifies mathematical proofs autonomously; advances in Hutter optimization for large-scale model training; and Lyria 3, DeepMind's latest music generation model. We break d...
-
9: BitDance: Scaling Autoregressive Generative Models with Binary Tokens
We present BitDance, a scalable autoregressive (AR) image generator that predicts binary visual tokens instead of codebook indices. With high-entropy binary latents, BitDance lets each token represent up to 2^256 states, yielding a compact yet highly expressiv...
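The capacity difference between a codebook index and a binary latent is simple arithmetic, sketched here with toy values (the 256-dim latent size comes from the blurb's 2^256 figure; everything else is illustrative): a K-entry codebook gives each token K possible values, while a 256-dimensional bit-vector makes the token itself one of 2^256 states.

```python
# A conventional VQ token is an index into a fixed codebook, so a K-entry
# codebook allows only K values per token. A binary latent makes the token
# a bit-vector, so capacity grows exponentially in its dimension.
bits = [1, 0, 1, 1] + [0] * 252          # toy 256-dim binary latent

def to_state_id(bits):
    # Interpret the bit-vector as one integer out of 2**len(bits) states.
    state = 0
    for b in bits:
        state = (state << 1) | b
    return state

codebook_states = 65536                  # e.g. a large 2^16-entry codebook
binary_states = 2 ** len(bits)           # 2^256 states for the same token
print(len(str(binary_states)))           # 2^256 has 78 decimal digits
```

That exponential per-token capacity is what lets an autoregressive model keep tokens compact while remaining expressive.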
-
8: SkillRL: Don't Give Agents Memories, Give Them Skills
SkillRL from UNC Chapel Hill achieves 89.9% on ALFWorld with a 7B model — beating GPT-4o by 41.9 points. The secret: distilling raw experience into compact, reusable skills instead of storing verbose trajectory memories.
-
7: ΔBelief-RL: Rethinking How AI Learns to Act
We explore a bold new framework that rethinks reinforcement learning from the ground up — replacing reward maximization with belief updating, and asking whether AI agents should learn the way scientists do.
-
6: Building a Robot Mind in the Open
Alibaba DAMO Academy built a complete embodied AI system in six months — eyes, hands, imagination, unified brain — and open-sourced everything. Seven model checkpoints, Apache 2.0, zero gating. This is the story of RynnBrain.
-
5: From Blood Sacrifice to Universal Translator
In July 2024, a French nonprofit's open-source voice AI went viral for demanding human sacrifice mid-conversation. Seven months later, the same team used the same architecture to build a real-time speech translator that runs on your phone. This is the story of...
-
4: The Week China Open-Sourced The Frontier
In a 48-hour span, three Chinese AI labs independently released frontier-class open-weight models. Step 3.5 Flash from StepFun delivers frontier intelligence with just 11 billion active parameters. MiniMax M2.5 offers comparable performance at one-twentieth th...
-
3: DreamDojo — Teaching Robots to Dream
Researchers from UC Berkeley, NVIDIA, and UT Austin introduce DreamDojo, a framework that teaches robots physical skills by learning from large-scale human videos. Instead of expensive robot-specific data, DreamDojo distills 5 years of human video into a gener...
-
2: Generative Modeling via Drifting — One-Step Image Generation
Researchers from MIT and Harvard propose Drifting Models, a new paradigm for generative modeling that achieves state-of-the-art image generation in a single forward pass. Instead of iterating at inference time like diffusion models, Drifting Models evolve the ...
-
1: Attention Is All You Need — The Paper That Changed Everything
In our inaugural episode, we dive deep into Attention Is All You Need — the 15-page paper from June 2017 that introduced the Transformer architecture and reshaped all of artificial intelligence. We break down how it works, why the title is a Beatles joke, and ...