Toward Agent-Friendly Dynamo: Mirroring CPython Semantics
TL;DR: Dynamo’s ad-hoc CPython support creates fragmented graph breaks that are hard to fix, even for LLM agents. By refactoring Dynamo to mirror CPython’s tp_* slot semantics, we make the system systematically auditable and agent-friendly. This has already lifted CPython test pass rates from 38% to 45% and is proactively eliminating classes of graph breaks in frontier models.
Observations
Working with frontier training frameworks has surfaced some fundamental issues in Dynamo. The issues broadly fell into four categories:
- CPython language gaps: For example, Dynamo supports calling a functools.partial object but did not support hashing it.
- Insufficient exception messages: One frontier framework had an unusual setup where torch.compile was always enabled, even during development. Users expected Dynamo to surface CPython-like errors (e.g., “list is not hashable”), but instead saw generic messages like “found unhashable object.”
- Bugs: For instance, Dynamo assumed a property object is always a function, while one framework used a functools.partial.
- PyTorch composability gaps: Incomplete support for autograd.Function, tensor hooks, FSDP2, etc. This is an important category, but we will not focus on it in this post.
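For reference, the first three categories correspond to behavior that plain CPython already defines. The snippet below is a minimal sketch in ordinary Python with no torch.compile involved; the names scale, Config, and _read are illustrative and do not come from any framework mentioned here.

```python
import functools


def scale(x, factor):
    return x * factor


# Calling a partial: the case the post says Dynamo already supported.
double = functools.partial(scale, factor=2)
called = double(3)

# Hashing a partial: plain CPython allows it, so it can be a dict key.
registry = {double: "double"}
looked_up = registry[double]


# A property whose getter is a partial rather than a plain function.
def _read(offset, obj):
    return obj.base + offset


class Config:
    def __init__(self, base):
        self.base = base

    # property calls fget(instance), so the partial receives (5, instance).
    shifted = property(functools.partial(_read, 5))


shifted_value = Config(10).shifted

# The CPython error text that users expect a compiler to preserve.
try:
    hash([1, 2])
except TypeError as exc:
    cpython_message = str(exc)

print(called, looked_up, shifted_value, cpython_message)
```

A tracer that claims CPython compatibility would need to reproduce each of these results, including the wording of the TypeError.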
From a torch.compile user’s perspective, these issues are indistinguishable. The common feedback was to improve error messages to enable better self-serve debugging, but that’s not the root problem.
The Deeper Issue
Dynamo loosely mirrors parts of CPython, but this mapping is not always consistent in practice. Two structural patterns stand out:

- Partial support (incomplete “green circles”): Support for CPython features is scattered. For example, we support calling a functools.partial, but not hashing it. These partial implementations create subtle and sometimes hard-to-debug graph breaks.
- VariableTracker overreach (the “red circles”): Dynamo models CPython objects through VariableTracker, which ideally should be limited to C-backed types. Over time, we’ve added trackers for Python-level constructs like enum.Enum and frozen dataclasses, often to work around missing fundamentals (e.g., metaclass handling). These trackers require significantly more investment, since there is far more behavior to model, and tend to accumulate band-aid solutions, making the system harder to maintain and extend.
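To see why enum.Enum and frozen dataclasses are Python-level constructs rather than C-backed types, it helps to look at where their behavior lives in plain CPython. The sketch below uses only the standard library; Color and Point are illustrative names, and nothing here reflects Dynamo internals.

```python
import dataclasses
import enum


class Color(enum.Enum):
    RED = 1
    GREEN = 2


# Class-level behavior comes from the metaclass, not from Color itself.
metaclass = type(Color)
by_name = Color["RED"]      # handled by the metaclass __getitem__
member_count = len(Color)   # handled by the metaclass __len__
members = list(Color)       # handled by the metaclass __iter__


@dataclasses.dataclass(frozen=True)
class Point:
    x: int
    y: int


# Frozen-ness is an ordinary generated __setattr__ on the class.
has_generated_setattr = "__setattr__" in vars(Point)
try:
    Point(1, 2).x = 5
except dataclasses.FrozenInstanceError:
    assignment_blocked = True
else:
    assignment_blocked = False

print(metaclass, by_name, member_count, has_generated_setattr, assignment_blocked)
```

Both features are built from generic mechanisms (metaclass dispatch and attribute-setting hooks), which is why modeling those mechanisms once can replace a dedicated tracker per construct.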
The supported Dynamo surface area is “implementation-by-accident,” so we don’t even know which parts are unsupported. If we knew, we would fix those gaps rather than write better error messages.
Why LLM agents don’t fix this (yet)
Given the rise of LLM agents in software engineering, it’s natural to apply them to fixing graph breaks. In theory, this should work well: CPython is well-documented, and the task is largely mechanical — map CPython semantics into Dynamo.
In practice, LLM agents have been quite helpful, but not as effective as one might expect for what appears to be a largely mechanical task. For example, while working through an enum-related graph break in a frontier framework, Claude was able to send a PR quickly, but the PR required too many changes and repeatedly ran into CI issues.

This isn’t a limitation of the agents themselves as much as a reflection of the current state of Dynamo. The reason is that Dynamo does not mirror CPython’s data model. As a result, LLM agents end up building on top of an inconsistent and fragmented foundation, amplifying the very issues they’re meant to solve.
Instead of incrementally patching gaps, we’re exploring a different approach:
Mirror CPython’s structure in Dynamo using LLM agents
In CPython, every type is described by a PyTypeObject struct, which defines behavior via tp_* slots implemented as C function pointers. For example, tuple behavior is defined through these slots, with a slot set to 0 indicating fallback to the base type, PyBaseObject_Type (not lack of support).
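The fallback rule is observable from Python itself. The following is a small sketch that assumes CPython (other interpreters may lay out their types differently); Box is an illustrative class.

```python
# Slots that tuple fills in itself show up in tuple's own namespace.
own_repr = "__repr__" in vars(tuple)
own_hash = "__hash__" in vars(tuple)

# tuple leaves tp_str empty, so str() falls back to the slot on object.
own_str = "__str__" in vars(tuple)
inherits_str = tuple.__str__ is object.__str__


# Slot lookup goes through the type, so an instance attribute is ignored.
class Box:
    pass


box = Box()
box.__len__ = lambda: 3
try:
    len(box)
except TypeError:
    instance_attr_ignored = True
else:
    instance_attr_ignored = False

print(own_repr, own_hash, own_str, inherits_str, instance_attr_ignored)
```

The last check matters for a tracer: operations such as len() and hash() consult the type's slot, not the instance dictionary, so modeling them as ordinary attribute lookups gives the wrong answer.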

Dynamo does not model this protocol consistently today. Equivalent tp_* behavior is implemented in ad hoc ways, sometimes duplicated across multiple places. This makes the system difficult to reason about, audit, and extend.
We are refactoring Dynamo to more closely follow tp_* slot semantics. The goal is not just to fix isolated gaps, but to move from an “implementation-by-accident” model to one that can be systematically audited against CPython. Many tp_* slots, such as tp_traverse, are not relevant for Dynamo, so the goal is not 100% mirroring but mirroring close enough to facilitate LLM agent fixes.
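To make “systematically auditable” concrete, here is a toy model of a slot table with base-type fallback. Everything in it is hypothetical: ToyType, RELEVANT_SLOTS, and audit are invented for illustration and are not Dynamo classes or APIs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Slots this toy model cares about; tp_traverse and similar are left out.
RELEVANT_SLOTS = ("tp_call", "tp_hash", "tp_repr")


@dataclass
class ToyType:
    """Hypothetical stand-in for a per-type slot table (not a Dynamo class)."""

    name: str
    base: Optional["ToyType"] = None
    slots: Dict[str, Callable] = field(default_factory=dict)

    def lookup(self, slot: str) -> Optional[Callable]:
        # An empty slot means "ask the base type", as with 0 in CPython.
        current: Optional[ToyType] = self
        while current is not None:
            if slot in current.slots:
                return current.slots[slot]
            current = current.base
        return None


def audit(toy_type: ToyType) -> List[str]:
    """List relevant slots that no type in the chain implements."""
    return [s for s in RELEVANT_SLOTS if toy_type.lookup(s) is None]


toy_object = ToyType(
    "object",
    slots={"tp_hash": lambda value: id(value), "tp_repr": lambda value: "<object>"},
)
toy_partial = ToyType(
    "partial",
    base=toy_object,
    slots={"tp_call": lambda fn, *args: fn(*args)},
)

print(audit(toy_object))   # slots missing for the base type
print(audit(toy_partial))  # slots missing once inheritance is applied
```

With behavior keyed by slot name, a gap such as “callable but not hashable” becomes a missing table entry that can be listed mechanically instead of a graph break discovered in a user model.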

This was previously too tedious to scale but LLM agents changed that. CPython’s structure, documentation, and comments map well to this task: translating C-level slot behavior into Dynamo’s abstractions.
Execution
For more details on execution, see the OSS-facing document Bridging CPython-Dynamo Gap using Claude.
In short, the effort is progressing well: several tp_* slots have already been implemented in Dynamo, with many more in flight. This work is being driven by a combination of Dynamo pod members and OSS contributors. If you’re interested in contributing, the document includes clear instructions on how to use LLM agents for this workflow — feel free to pick up and own a slot.
Early Wins and Validation — CPython Tests
We now have ~4–6 weeks of this work landed in PyTorch, and we’re already seeing an impact. Several tp_* slot implementations have landed, each eliminating classes of graph breaks rather than individual instances. A few concrete examples:
- copy.deepcopy: Instead of adding ad hoc support, we now trace into deepcopy and model __reduce_ex__. This follows the CPython approach: model the core primitive and let everything else compose naturally (PR).
- enum.Enum: Previously handled via a VariableTracker (a “red circle”). Missing tp_getattro and metaclass handling caused multiple graph breaks. Implementing tp_getattro allowed us to remove EnumVariable, fixing many enum-related gaps (PR, PR).
- Frozen dataclasses: Another “red circle” case. Removing the custom tracker eliminated associated graph breaks (PR).
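The deepcopy case illustrates the “model the primitive” idea in plain Python: for objects without a dedicated copier, copy.deepcopy goes through __reduce_ex__. The sketch below shows only the standard-library behavior, not the Dynamo change; Settings and reduce_calls are illustrative names.

```python
import copy

reduce_calls = []


class Settings:
    def __init__(self, values):
        self.values = values

    def __reduce_ex__(self, protocol):
        # Record the call, then describe how to rebuild the object.
        reduce_calls.append(protocol)
        return (Settings, (self.values,))


original = Settings([1, 2, 3])
clone = copy.deepcopy(original)

# deepcopy used the reduce protocol and deep-copied the constructor args.
print(len(reduce_calls), clone.values, clone.values is original.values)
```

Because deepcopy is itself ordinary Python layered on this protocol, a tracer that handles __reduce_ex__ correctly can trace through deepcopy without a special case for it.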
We’re also seeing validation from CPython–Dynamo tests added by Guilherme (OpenTeams), building on an idea from Richard Zou. These run with fullgraph=True. Each tp_* slot PR leads to multiple unexpected test passes, increasing the total pass rate from 38% to 45%. This is a strong signal that we’re fixing future graph breaks preemptively by aligning with CPython, rather than waiting to encounter them in real models.
Overall, this reinforces the direction: mirroring CPython eliminates classes of graph breaks, instead of patching them one by one.
Where this leads
Directionally, we are preparing Dynamo to be agent-friendly by design. If we get this right, extending CPython support in Dynamo becomes structured, repeatable — and eventually, “boring.” That’s the goal: to make this layer predictable enough that we can focus on harder problems.
What’s not covered
This post focuses on graph breaks arising from CPython gaps. However, there’s an equally large — and largely separate — class of issues around PyTorch composability. I’m still forming clearer opinions in that area, and will share more as that work evolves. Stay tuned!