
Torch-TensorRT



Weight Refitting & LoRA

Update compiled TensorRT engine weights without recompilation — for LoRA adapters, fine-tuned checkpoints, and EMA weight updates.

  • Refitting TensorRT Engines with Updated Weights
  • Example: Refitting Programs with New Weights


© Copyright 2024, NVIDIA Corporation.