torch.compiler
Created On: Jul 28, 2023 | Last Updated On: Sep 22, 2025
torch.compiler is a namespace through which some of the internal compiler methods are surfaced for user consumption. The main function and feature in this namespace is torch.compile.
torch.compile is a PyTorch function introduced in PyTorch 2.x that aims to solve the problem of accurate graph capture in PyTorch, and ultimately enables software engineers to run their PyTorch programs faster. torch.compile is written in Python and it marks the transition of PyTorch from C++ to Python.
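For example, wrapping a function in torch.compile is usually a one-line change. A minimal sketch (the function and tensor shapes here are arbitrary placeholders):

```python
import torch

def fn(x):
    # Plain eager-mode PyTorch; torch.compile captures and optimizes it.
    return torch.sin(x) + torch.cos(x)

compiled_fn = torch.compile(fn)  # compilation is lazy: it happens on the first call
print(compiled_fn(torch.randn(8)))
```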
torch.compile leverages the following underlying technologies:

- TorchDynamo (torch._dynamo) is an internal API that uses a CPython feature called the Frame Evaluation API to safely capture PyTorch graphs. Methods that are available externally for PyTorch users are surfaced through the torch.compiler namespace.
- TorchInductor is the default torch.compile deep learning compiler that generates fast code for multiple accelerators and backends. You need to use a backend compiler to make speedups through torch.compile possible. For NVIDIA, AMD and Intel GPUs, it leverages OpenAI Triton as the key building block.
- AOT Autograd captures not only the user-level code, but also backpropagation, which results in capturing the backwards pass "ahead-of-time". This enables acceleration of both the forwards and backwards pass using TorchInductor.
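One way to see what TorchDynamo captures is to pass a custom backend to torch.compile. This is a minimal sketch, not a definitive implementation: inspect_backend is a hypothetical helper, not a PyTorch API. TorchDynamo calls it with the captured torch.fx.GraphModule, and returning gm.forward executes the captured graph without further optimization:

```python
import torch

def inspect_backend(gm: torch.fx.GraphModule, example_inputs):
    # TorchDynamo hands the captured graph to the backend as an FX GraphModule.
    print(gm.graph)    # show the ops that were captured
    return gm.forward  # run the captured graph as-is (no optimization)

@torch.compile(backend=inspect_backend)
def fn(x):
    return torch.relu(x) + 1.0

fn(torch.randn(4))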
To better understand torch.compile's tracing behavior on your code, or to learn more about the internals of torch.compile, please refer to the torch.compile programming model.
Note

In some cases, the terms torch.compile, TorchDynamo, and torch.compiler might be used interchangeably in this documentation.
As mentioned above, to run your workflows faster, torch.compile through TorchDynamo requires a backend that converts the captured graphs into fast machine code. Different backends can result in different optimization gains. The default backend is called TorchInductor, also known as inductor. In addition, TorchDynamo has a list of supported backends developed by our partners, each of which comes with its own optional dependencies; they can be listed by running torch.compiler.list_backends(). A usage sketch follows the tables below.
Some of the most commonly used backends include:
Training & inference backends
Backend |
Description |
---|---|
|
Uses the TorchInductor backend. Read more |
|
CUDA graphs with AOT Autograd. Read more |
|
Uses IPEX on CPU. Read more |
Inference-only backends
Backend |
Description |
---|---|
|
Uses Torch-TensorRT for inference optimizations. Requires |
|
Uses IPEX for inference on CPU. Read more |
|
Uses Apache TVM for inference optimizations. Read more |
|
Uses OpenVINO for inference optimizations. Read more |
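Assuming any optional dependencies are installed, choosing among these backends is just a string argument to torch.compile. A minimal sketch:

```python
import torch

# Backends currently registered with TorchDynamo (partner backends appear
# here only once their packages are installed and imported).
print(torch.compiler.list_backends())

model = torch.nn.Sequential(torch.nn.Linear(16, 16), torch.nn.ReLU())

# "inductor" is the default; any name from the tables above can be swapped in,
# e.g. backend="cudagraphs" (CUDA required) or backend="openvino".
compiled_model = torch.compile(model, backend="inductor")
compiled_model(torch.randn(2, 16))
```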
Read More
Getting Started for PyTorch Users
- Getting Started
- torch.compiler API reference
  - torch.compiler.compile
  - torch.compiler.reset
  - torch.compiler.allow_in_graph
  - torch.compiler.substitute_in_graph
  - torch.compiler.assume_constant_result
  - torch.compiler.list_backends
  - torch.compiler.disable
  - torch.compiler.set_stance
  - torch.compiler.set_enable_guard_collectives
  - torch.compiler.cudagraph_mark_step_begin
  - torch.compiler.is_compiling
  - torch.compiler.is_dynamo_compiling
  - torch.compiler.is_exporting
  - torch.compiler.skip_guard_on_inbuilt_nn_modules_unsafe
  - torch.compiler.skip_guard_on_all_nn_modules_unsafe
  - torch.compiler.keep_tensor_guards_unsafe
  - torch.compiler.skip_guard_on_globals_unsafe
  - torch.compiler.nested_compile_region
  - torch.compiler.config
    - accumulated_recompile_limit
    - allow_unspec_int_on_nn_module
    - assume_static_by_default
    - automatic_dynamic_shapes
    - capture_dynamic_output_shape_ops
    - capture_scalar_outputs
    - dynamic_shapes
    - enable_cpp_symbolic_shape_guards
    - fail_on_recompile_limit_hit
    - job_id
    - log_file_name
    - recompile_limit
    - reorderable_logging_functions
    - skip_tensor_guards_with_matching_dict_tags
    - verbose
    - wrap_top_frame
- Dynamic Shapes
- TorchDynamo APIs for fine-grained tracing
- torch.compile has different autograd semantics
- AOTInductor: Ahead-Of-Time Compilation for Torch.Export-ed Models
- TorchInductor GPU Profiling
- Profiling to understand torch.compile performance
- Finding graph breaks: “Torch-Compiled Region” and “CompiledFunction”
- Frequently Asked Questions
  - Does torch.compile support training?
  - Do you support Distributed code?
  - Do I still need to export whole graphs?
  - Why is my code crashing?
  - Why is compilation slow?
  - Why are you recompiling in production?
  - How are you speeding up my code?
  - Why am I not seeing speedups?
  - Why am I getting incorrect results?
  - Why am I getting OOMs?
  - Does torch.func work with torch.compile (for grad and vmap transforms)?
  - Does NumPy work with torch.compile?
  - Which API to use for fine grain tracing?
- torch.compile Troubleshooting
- PyTorch 2.0 Performance Dashboard
- TorchInductor and AOTInductor Provenance Tracking
torch.compile Programming Model
Deep Dive for PyTorch Developers
HowTo for PyTorch Backend Vendors