Distributed Inference
Examples of multi-GPU distributed inference with Torch-TensorRT, covering data parallelism (running copies of the same model on multiple GPUs) and tensor parallelism (splitting a single large model across multiple GPUs).
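The difference between the two strategies can be illustrated with a small, device-free sketch. It is not Torch-TensorRT code: plain Python lists stand in for GPUs and tensors, and the helper names `matmul`, `data_parallel`, and `tensor_parallel` are hypothetical, introduced only for illustration. Under those assumptions, data parallelism splits the batch while every device keeps the full weights, and tensor parallelism splits the weights while every device sees the full batch.

```python
# Conceptual sketch only: plain Python lists stand in for GPUs and tensors.
# None of these helper names come from Torch-TensorRT.

def matmul(x, w):
    """Multiply a (batch x in) matrix by an (in x out) matrix."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


def data_parallel(x, w, num_devices):
    """Each simulated device holds the full weights and handles a slice of the batch."""
    shard = (len(x) + num_devices - 1) // num_devices  # ceiling division
    outputs = []
    for d in range(num_devices):
        batch_slice = x[d * shard:(d + 1) * shard]
        # Results are gathered along the batch dimension.
        outputs.extend(matmul(batch_slice, w))
    return outputs


def tensor_parallel(x, w, num_devices):
    """Each simulated device sees the full batch but holds only some weight columns."""
    n_out = len(w[0])
    shard = (n_out + num_devices - 1) // num_devices  # ceiling division
    partials = []
    for d in range(num_devices):
        w_slice = [row[d * shard:(d + 1) * shard] for row in w]
        partials.append(matmul(x, w_slice))
    # Partial outputs are concatenated along the feature dimension.
    return [sum((p[i] for p in partials), []) for i in range(len(x))]


if __name__ == "__main__":
    x = [[1, 2], [3, 4], [5, 6], [7, 8]]   # batch of 4 inputs, 2 features each
    w = [[1, 0, 2, 1], [0, 1, 1, 3]]       # 2 x 4 weight matrix
    reference = matmul(x, w)
    print(data_parallel(x, w, 2) == reference)
    print(tensor_parallel(x, w, 2) == reference)
```

Both functions reproduce the single-device result; what changes is which operand is partitioned, and therefore whether the limiting factor is throughput (data parallelism) or fitting one large model in memory (tensor parallelism).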
Torch-TensorRT Distributed Inference (data_parallel_stable_diffusion.py)
Torch-TensorRT Distributed Inference (data_parallel_gpt2.py)
Tensor Parallel Distributed Inference with Torch-TensorRT (tensor_parallel_simple_example.py)