PyTorch DevLog
A place for PyTorch developers to share what they’re building — new tools, ongoing projects, design decisions, performance wins, and technical deep dives. Written by the people doing the work, for anyone who wants to follow along.
10 Most Recent Logs
1. mergedog: shepherding approved PRs into pytorch/pytorch CI
Disclosure. This post was drafted by Claude (Anthropic’s coding assistant) with editing from ezyang. mergedog is a small, entirely vibe-coded Python harness that takes one approved pytorch/pytorch PR and shepherds it through CI to the point where a human can comment @pytorchbot merge. The idea is to use LLMs to handle some of the drudgery of landing PRs from external contributors: Pressing …
Read more →

2. How Does the Dispatcher Work? · Dispatcher
I wanted to write about how PT2 does autograd, but that requires understanding eager autograd, which requires understanding the dispatcher. So let’s start there. Let’s Build Ourselves A Dispatcher Let’s pretend we’re building Torch. Let’s start from first principles with the problems we encounter and how to solve them. Problem 1: We want to be able to call operators for each backend. Solution: …
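The teaser's "Problem 1" (calling operators for each backend) can be sketched from first principles with a toy dispatch table. This is purely illustrative and not PyTorch's actual implementation; the names `register` and `dispatch` are hypothetical:

```python
# Toy sketch of per-backend operator dispatch: a table mapping
# (operator name, backend) to a kernel function. Not PyTorch's real code.

KERNELS = {}  # (op_name, backend) -> kernel function

def register(op, backend):
    """Decorator that registers a kernel for an (op, backend) pair."""
    def wrap(fn):
        KERNELS[(op, backend)] = fn
        return fn
    return wrap

def dispatch(op, backend, *args):
    """Look up and call the kernel registered for this op/backend pair."""
    try:
        kernel = KERNELS[(op, backend)]
    except KeyError:
        raise NotImplementedError(f"{op} has no kernel for backend {backend!r}")
    return kernel(*args)

@register("add", "cpu")
def add_cpu(a, b):
    return [x + y for x, y in zip(a, b)]

@register("add", "cuda")
def add_cuda(a, b):
    # Pretend this launches a GPU kernel.
    return [x + y for x, y in zip(a, b)]

print(dispatch("add", "cpu", [1, 2], [3, 4]))  # [4, 6]
```

The real dispatcher layers much more on top of this (dispatch keys, fallbacks, autograd wrapping), but the table lookup is the core idea the post builds from.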
Read more →

3. Unbacked Dynamic Shapes Shouldn't Be Slower — Now They Aren't · Dynamic Shapes
TL;DR – Unbacked dynamic shapes had slowdowns ranging from 20% up to 2x on TorchBench and ~30% regressions on vLLM. We fixed the root causes — now unbacked matches backed across all tested models and configurations. Motivation These regressions were blocking adoption in Frontier workloads like vLLM. Demand for unbacked shapes is growing — just in the past week, multiple users needed them to control recompilations — …
Read more →

4. Reducing Compile-Time Overhead in Unbacked-Symbol-Heavy torch.export Traces · Dynamic Shapes
TL;DR – A regression report revealed that exporting a model with many unbacked (data-dependent) symbols took 264s. Profiling showed the latency was dominated by repeated symbolic reasoning in the shape system. A series of targeted, generally applicable optimizations reduced tracing time to 87s (~3x faster). Background A report indicated a severe slowdown when exporting a model that heavily uses …
Read more →

5. Backed to Unbacked: From Guardable to Guardless Shapes in PyTorch · Dynamic Shapes
TL;DR – We expect unbacked dynamic shapes to become the dominant shape mechanism for Frontier-style workloads due to their better predictability and controllability. However, some blockers remain for their ideal usage, most notably the performance gap, which is a primary focus for the first half of 2026. Origins Recently, unbacked dynamic shapes have become a hot topic. But many people still …
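The guardable-vs-guardless distinction the post names can be illustrated with a toy, framework-agnostic sketch: a backed symbol carries an example value (a "hint") and records a guard when code branches on it, while an unbacked symbol has no hint, so branching on it is a data-dependent error. The `SymInt` class and guard format here are invented for illustration; PyTorch's real symbolic shape machinery is far richer:

```python
# Illustrative-only sketch of backed vs. unbacked symbolic ints.

GUARDS = []  # conditions the compiled artifact would be specialized on

class DataDependentError(RuntimeError):
    pass

class SymInt:
    def __init__(self, name, hint=None):
        self.name = name
        self.hint = hint  # backed symbols carry an example value; unbacked don't

    def __gt__(self, other):
        if self.hint is None:
            # Unbacked: no example value to answer with, so we cannot guard.
            # Branching on this condition is a data-dependent error.
            raise DataDependentError(f"cannot branch on {self.name} > {other}")
        # Backed: answer from the hint and record a guard; a later input
        # that violates it would trigger a recompile.
        GUARDS.append(f"{self.name} > {other}")
        return self.hint > other

backed = SymInt("s0", hint=8)
if backed > 4:                # fine: records the guard "s0 > 4"
    pass

unbacked = SymInt("u0")       # e.g. a size derived from .item() or nonzero()
try:
    unbacked > 4
except DataDependentError as e:
    print("DDE:", e)
```

This is why unbacked shapes are "guardless" and more predictable: nothing silently specializes the compiled code to an example input, but code that branches on the symbol must be given explicit semantics instead.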
Read more →

6. Slaying Framework Data-Dependent Errors Dragon 🐉 · Dynamic Shapes
TL;DR – Framework DDE dragon has been slain! 🐉 We’ve eliminated the vast majority of framework data-dependent errors — reducing user issues by over 85% — and unlocked specialization-free full graph capture that just works. This lays the groundwork for emerging unbacked use cases in vLLM, MoE graphs, and PT2-Frontier. Tackling Data-Dependent Errors Data-dependent errors (DDEs) have long been a …
Read more →

7. Guard-Free Dynamic Shapes · Dynamic Shapes
TL;DR – Data-dependent errors (DDEs) are the dominant barrier to exporting models with dynamic shapes; of the various errors observed during export, they are the most common. We launched an initiative to eliminate them via explicit unbacked semantics — explicitly defining how code should behave when inputs are …
Read more →

Topics
- CI · 1 blog
- Dispatcher — PyTorch dispatcher, dispatch keys, operator registry, and extensibility · 1 blog
- Dynamic Shapes · 5 blogs
- Distributed — FSDP, DTensor, c10d, and distributed training · 0 blogs
- Dynamo · 0 blogs
- Export — torch.export and AOTInductor · 0 blogs
- Inductor · 0 blogs