Cross-Compiling for Windows#
torch_tensorrt.dynamo.cross_compile_for_windows() compiles TRT engines on a
Linux x86-64 host and produces an ExportedProgram whose embedded engines can be
loaded and executed on Windows x86-64, with no compilation step required on the
target machine.
This is the standard path for teams that build models on Linux (where TRT tooling is more mature) and deploy on Windows (game engines, desktop applications, enterprise software).
Requirements#
Build machine: Linux x86-64 with CUDA and TensorRT installed.
Target machine: Windows x86-64 with a compatible NVIDIA GPU (same or newer CUDA compute capability).
enable_cross_compile_for_windows=True is automatically set by this API; do not set it manually on compile().
The following features are disabled during cross-compilation (they are not available in the Windows TRT runtime or require OS-specific binaries):
- Python runtime (use_python_runtime is forced to False)
- Lazy engine initialization (lazy_engine_init is forced to False)
- Engine caching (cache_built_engines / reuse_cached_engines are disabled)
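Because compilation is gated to Linux x86-64, a build script can fail fast with a clear message before invoking the API. This is a minimal sketch using only the standard library; the helper name is an assumption, not part of the Torch-TensorRT API, and it does not replace the library's own gate:

```python
import platform

def check_cross_compile_host() -> None:
    """Fail fast if this host cannot run cross_compile_for_windows().

    Hypothetical helper mirroring the Linux x86-64 requirement stated above;
    the library enforces the same rule via its @needs_cross_compile decorator.
    """
    system, machine = platform.system(), platform.machine()
    if system != "Linux" or machine != "x86_64":
        raise RuntimeError(
            f"cross_compile_for_windows requires Linux x86-64, got {system}/{machine}"
        )
```

Running this at the top of the build script turns the library's AssertionError (see Troubleshooting below) into an earlier, friendlier failure.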
Workflow#
Step 1 — Export on the Linux build machine
import torch
import torch_tensorrt
model = MyModel().eval().cuda()
inputs = [torch.randn(1, 3, 224, 224).cuda()]
# Export to ExportedProgram
exp_program = torch.export.export(model, tuple(inputs))
Step 2 — Cross-compile for Windows
trt_gm = torch_tensorrt.dynamo.cross_compile_for_windows(
    exp_program,
    arg_inputs=inputs,
    use_explicit_typing=True,  # enabled_precisions deprecated; cast model/inputs to target dtype
)
Step 3 — Save the compiled module
torch_tensorrt.save(trt_gm, "model_windows.ep", arg_inputs=inputs)
Step 4 — Load and run on Windows
Copy model_windows.ep to the Windows machine. Ensure the Torch-TensorRT runtime
library (torchtrt_runtime.dll) is on the DLL search path.
# On Windows:
import torch_tensorrt
trt_gm = torch_tensorrt.load("model_windows.ep").module()
output = trt_gm(*inputs)
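Making the runtime library discoverable before the import can be scripted. The helper below is a sketch, not part of the Torch-TensorRT API, and the directory argument is an assumption about your install layout:

```python
import os

def add_trt_runtime_dir(runtime_dir: str) -> None:
    """Hypothetical helper: make the Torch-TensorRT runtime library
    discoverable before `import torch_tensorrt`. `runtime_dir` is assumed
    to contain torchtrt_runtime.dll (Windows) or libtorchtrt_runtime.so (Linux).
    """
    if hasattr(os, "add_dll_directory"):  # Windows, Python 3.8+
        os.add_dll_directory(runtime_dir)
    else:
        # Linux: export for child processes (the dynamic loader reads
        # LD_LIBRARY_PATH at process startup, so set this before launching).
        current = os.environ.get("LD_LIBRARY_PATH", "")
        os.environ["LD_LIBRARY_PATH"] = (
            runtime_dir + (os.pathsep + current if current else "")
        )
```

Call it with the directory containing the runtime library, then import torch_tensorrt.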
Dynamic Shapes#
Dynamic shapes work the same as in normal compile():
from torch_tensorrt import Input
trt_gm = torch_tensorrt.dynamo.cross_compile_for_windows(
    exp_program,
    arg_inputs=[
        Input(
            min_shape=(1, 3, 224, 224),
            opt_shape=(4, 3, 224, 224),
            max_shape=(16, 3, 224, 224),
        )
    ],
)
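A dynamic-shape profile is only valid when every dimension satisfies min ≤ opt ≤ max across shapes of equal rank. A small sketch of that invariant, useful for validating profiles before a long compile (the helper is illustrative, not a Torch-TensorRT API):

```python
def validate_shape_profile(min_shape, opt_shape, max_shape):
    """Check the invariant behind a dynamic-shape Input: all three shapes
    share a rank, and every dimension satisfies min <= opt <= max."""
    if not (len(min_shape) == len(opt_shape) == len(max_shape)):
        raise ValueError("min/opt/max shapes must have the same rank")
    for i, (lo, mid, hi) in enumerate(zip(min_shape, opt_shape, max_shape)):
        if not (lo <= mid <= hi):
            raise ValueError(
                f"dim {i}: expected min <= opt <= max, got {lo}, {mid}, {hi}"
            )
```

For the profile above, validate_shape_profile((1, 3, 224, 224), (4, 3, 224, 224), (16, 3, 224, 224)) passes silently.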
Engine Compatibility#
The produced engines are compatible with the same or newer CUDA compute capability
as the GPU used during compilation. Use hardware_compatible=True if the Windows
deployment GPU may have a different architecture within the Ampere+ generation:
trt_gm = torch_tensorrt.dynamo.cross_compile_for_windows(
    exp_program,
    arg_inputs=inputs,
    hardware_compatible=True,  # engine runs on Ampere and newer
)
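The compatibility rule can be made concrete with compute capabilities as (major, minor) tuples. In this sketch, treating Ampere (8, 0) as the floor for hardware-compatible engines is an assumption drawn from the comment above, and the function is illustrative rather than a library API:

```python
AMPERE = (8, 0)  # assumed floor for hardware_compatible engines (see note above)

def engine_runs_on(build_cc, target_cc, hardware_compatible=False):
    """Return True if an engine built on a GPU with compute capability
    `build_cc` should load on a GPU with `target_cc`.

    Default rule: the target needs the same or newer compute capability.
    With hardware_compatible=True: any Ampere-or-newer target qualifies,
    regardless of the build GPU's architecture.
    """
    if hardware_compatible:
        return target_cc >= AMPERE
    return target_cc >= build_cc
```

For example, an engine built on an RTX 3060 (8, 6) loads on an RTX 4090 (8, 9) either way, but the reverse direction only works with hardware_compatible=True.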
Saving and Loading Cross-Compiled Programs#
The output of cross_compile_for_windows is a standard torch.fx.GraphModule
containing TorchTensorRTModule submodules with Windows-compatible engine bytes.
Save and load via the standard Torch-TensorRT save/load API:
# Save (Linux)
torch_tensorrt.save(trt_gm, "model_windows.ep", arg_inputs=inputs)
# Load (Windows)
trt_gm = torch_tensorrt.load("model_windows.ep").module()
trt_gm(*inputs)
Alternatively, produce a raw serialized engine for direct TRT deployment:
engine_bytes = torch_tensorrt.dynamo.convert_exported_program_to_serialized_trt_engine(
    exp_program,
    arg_inputs=inputs,
)
# Note: convert_exported_program_to_serialized_trt_engine() does not support
# cross-compilation, so the resulting engine targets the Linux build platform;
# use cross_compile_for_windows() for the Windows workflow.
Troubleshooting#
- AssertionError: cross_compile_for_windows is only supported on Linux x86-64
You must run the compilation step on a Linux x86-64 machine; the @needs_cross_compile decorator gates this function.
- Engine fails to load on Windows
Ensure the TRT version on Windows is ≥ the version used on Linux. Use version_compatible=True for forward compatibility within a TRT major version.
- Output mismatch between Linux and Windows
Floating-point results may differ slightly due to different driver/hardware implementations. Use optimization_level=0 on Linux to minimize kernel specialization and improve cross-platform reproducibility.
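The version rule in the second item is easy to get wrong with string comparison, since "10.11.0" sorts before "10.3.0" lexically. A small sketch that compares numerically (the helper name is an assumption):

```python
def windows_trt_is_compatible(linux_version: str, windows_version: str) -> bool:
    """True if the Windows TensorRT version is >= the Linux build version.

    Versions are dotted strings like "10.3.0"; compare numeric components,
    never the raw strings.
    """
    parse = lambda v: tuple(int(part) for part in v.split("."))
    return parse(windows_version) >= parse(linux_version)
```

Here windows_trt_is_compatible("10.3.0", "10.11.0") correctly returns True, whereas a plain string comparison of the two versions would conclude the opposite.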