
Torch-TensorRT



Compilation

How Torch-TensorRT compiles models: the JIT torch.compile path, the AOT torch.export path, compilation settings, and dynamic shape configuration.

  • TensorRT Backend for torch.compile
  • Example: Torch Compile Advanced Usage
  • Compiling Exported Programs with Torch-TensorRT
  • CompilationSettings Reference
  • Dynamic shapes with Torch-TensorRT
  • Example: Compiling Models with Dynamic Input Shapes
  • Handling Unsupported Operators


© Copyright 2024, NVIDIA Corporation.