Slaying Framework Data-Dependent Errors Dragon 🐉

Laith Sakka (@laithsakka) · October 29, 2025 · 5 min read
dynamic_shapes · unbacked · dde · export · guard_free · torch.compile

TL;DR: The framework DDE dragon has been slain! 🐉 We’ve eliminated the vast majority of framework data-dependent errors, reducing user issues by over 85%, and unlocked specialization-free full graph capture that just works. This lays the groundwork for emerging unbacked use cases in vLLM, MoE graphs, and PT2-Frontier.

Tackling Data-Dependent Errors

Data-dependent errors (DDEs) have long been a major pain point for framework export users, as detailed in the previous post. Six months ago, we launched an initiative to eliminate these issues by implementing explicit unbacked semantics — explicitly defining how code should behave when inputs are unbacked.

That work is now complete. We’ve moved into the maintenance phase, and many previously error-prone operations (such as reshaping, slicing and narrowing, selection, contiguity checks, and broadcasting checks) are now fully DDE-free. In total, we addressed 270+ code branches, and the old, complex guard_size_oblivious/size-like mechanism has been fully deprecated.

This marks a major milestone: we can now capture specialization-free graphs much more reliably, providing a smoother and more predictable user experience. The growing number of use cases leveraging unbacked dynamic shapes—like deterministic compilation, vLLM, pre-compile APIs, and PT2-Frontier—highlights the importance of specialization-free graphs.
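To make the idea of explicit unbacked semantics concrete, here is a minimal pure-Python sketch of a broadcasting check. It is a toy model, not PyTorch's implementation: the classes `Unbacked` and `SymbolicCondition` and the two `needs_broadcast_*` functions are hypothetical names invented for illustration, and `guard_or_false` is a simplified reimplementation named after the helper in `torch.fx.experimental.symbolic_shapes`. The assumption is that, when a size is unknown at trace time, the code explicitly picks the general path instead of branching on the unknown value.

```python
class DataDependentError(RuntimeError):
    """Raised when a branch needs a value that is unknown at trace time."""


class SymbolicCondition:
    """A comparison whose truth value is unknown at trace time."""

    def __init__(self, expr):
        self.expr = expr

    def __bool__(self):
        # Branching on an undecidable condition is what produces a DDE.
        raise DataDependentError(f"cannot decide `{self.expr}` while tracing")


class Unbacked:
    """Toy unbacked symbol: a size with no example value to fall back on."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return SymbolicCondition(f"{self.name} == {other}")


def guard_or_false(cond):
    """Return the condition if it is decidable, otherwise assume False."""
    if isinstance(cond, SymbolicCondition):
        return False
    return bool(cond)


def needs_broadcast_naive(size):
    # Branches directly on the comparison: fails for unbacked sizes.
    return bool(size == 1)


def needs_broadcast_explicit(size):
    # Explicit semantics: when unknown, take the general (non-broadcast) path.
    return guard_or_false(size == 1)
```

In this sketch, `needs_broadcast_naive(Unbacked("u0"))` raises `DataDependentError`, while `needs_broadcast_explicit(Unbacked("u0"))` returns `False` and tracing can continue. Concrete integer sizes behave identically in both versions.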

What This Means for Users and Developers

1. Improved User & Developer Experience

2. Reduced Technical Complexity

The previous guard_size_oblivious / size-like system was the first step toward eliminating DDEs, and our work was significantly influenced by it. However, it often made the code’s behavior difficult to reason about and introduced multiple layers of technical overhead to maintain:

Size-like annotation and propagation: Users had to manually call _check_size() to mark size-like dimensions, and the framework then had to correctly propagate those annotations across operations. Any missed annotation or propagation failure broke the system’s guarantees around DDE elimination.

Dependence on symbolic reasoning: The system relied on a hint-free symbolic evaluator to infer relationships among dynamic shapes. DDE elimination depended on the evaluator’s ability to reason correctly about these relationships under certain input constraints; if inference failed or remained incomplete, DDEs would persist.

With explicit unbacked semantics, all of this complexity has been removed: no manual _check_size() calls, no propagation of “size-likeness,” and no reliance on symbolic evaluation. The result is a simpler, more deterministic, and more predictable system that also achieves better DDE elimination.
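The difference between the two schemes can be illustrated with a small toy model. Everything here is hypothetical and simplified: `Sym`, `is_one_old`, and `is_one_new` are invented names, and the rule that an annotated size-like symbol is reasoned about as being at least 2 is an assumption about the old mechanism rather than something stated above. The sketch only aims to show why a missed annotation re-introduced a DDE under the old scheme, while the explicit fallback needs no annotation at all.

```python
class DataDependentError(RuntimeError):
    """Raised when a branch cannot be decided at trace time."""


class Sym:
    """Toy unbacked symbol; `size_like` models the old manual annotation."""

    def __init__(self, name, size_like=False):
        self.name = name
        self.size_like = size_like


def is_one_old(size):
    """Old scheme (toy): decidable only if the symbol was annotated."""
    if isinstance(size, int):
        return size == 1
    if size.size_like:
        # Assumed behaviour: size-like symbols are reasoned about as >= 2.
        return False
    # Annotation missing or lost during propagation: the DDE comes back.
    raise DataDependentError(f"cannot decide `{size.name} == 1`")


def is_one_new(size):
    """Explicit semantics (toy): unknown sizes take the general path."""
    if isinstance(size, int):
        return size == 1
    return False
```

Here `is_one_old(Sym("u0"))` raises because the annotation is missing, whereas `is_one_new(Sym("u0"))` returns `False` regardless of how the symbol was created.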

3. Enabling Sound, Non-Constrained Graphs

In the past, users often had to insert torch._check calls to constrain the graph and avoid DDEs, then manually remove or ignore those checks later to generalize exported graphs. It was a fragile and frustrating workaround.

With unbacked semantics, that’s no longer necessary. Users can now produce fully general, unconstrained graphs directly—without resorting to these manual hacks.
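A toy sketch of why the old workaround was limiting: a check inserted only to avoid a DDE ends up restricting which inputs the exported graph accepts. `ToyExportedGraph` is a hypothetical class written for this illustration and does not mirror any real export API; the constraint `n >= 2` is an arbitrary example.

```python
class ToyExportedGraph:
    """Hypothetical stand-in for an exported graph that sums its input."""

    def __init__(self, constraints=()):
        # Each constraint is (description, predicate over the input length).
        self.constraints = list(constraints)

    def run(self, values):
        n = len(values)
        for description, predicate in self.constraints:
            if not predicate(n):
                raise ValueError(f"input violates constraint: {description}")
        return sum(values)


# Old workaround (toy): a check added to avoid a DDE is baked into the graph.
constrained = ToyExportedGraph([("n >= 2", lambda n: n >= 2)])

# With explicit unbacked semantics (toy): no such constraint is recorded.
unconstrained = ToyExportedGraph()
```

In this sketch, `constrained.run([5])` raises `ValueError` even though summing a single element is perfectly valid, while `unconstrained.run([5])` returns 5. Both graphs agree on inputs that satisfy the constraint.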

What’s Next for Unbacked Dynamic Shapes

Support for unbacked dynamic shapes remains a key theme in our dynamic shapes roadmap, especially as their importance grows with upcoming work such as deterministic compilation, compile-on-one-rank, PT2-Frontier, and vLLM soundness.

There’s still significant work ahead. Our focus remains on advancing support for unbacked shapes while continuing to address urgent user needs in distributed settings. Key remaining areas include:

Thanks!

A big thank you to Brian Hirsh (@bdhirsh) for the long discussions and early-stage guidance that helped shape this project. Similar thanks go to Bob Ren (@bobrenjc93) and Aaron Orenstein (@aorenste) for their support through long discussions and diff reviews.

Pian Pawakapan (@pianpwk) deserves special recognition for addressing DDEs across several operations — notably slicing, stride ordering, expand — and for leading the exportability benchmark, identifying crucial DDE sources along the way.

Finally, Colin Peppler (@colinpeppler) has been instrumental in continuously reporting and tracking user DDE issues, in addition to fixing DDEs across many ops.

References