Distributed Inference
Examples of multi-GPU distributed inference with Torch-TensorRT, covering data parallelism (running copies of the same model on multiple GPUs) and tensor parallelism (splitting a single large model across multiple GPUs).
data_parallel_stable_diffusion.py
data_parallel_gpt2.py
test_multinode_nccl.py
tensor_parallel_simple_example.py
tensor_parallel_simple_example_mn.py
test_multinode_export_save_load.py
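
The difference between the two strategies named above can be illustrated with a small conceptual sketch. It is not taken from the examples in this gallery and uses no Torch-TensorRT or PyTorch API: plain Python lists stand in for tensors, loop iterations stand in for GPUs, and the helper names (`matmul`, `data_parallel`, `tensor_parallel`) are hypothetical. Under those assumptions, data parallelism shards the input batch while every device keeps the full weights, and tensor parallelism shards the weights while every device sees the full batch.

```python
# Conceptual sketch only: lists stand in for tensors and each loop
# iteration stands in for one GPU. All function names are hypothetical.

def matmul(x, w):
    """Multiply a (batch x in) matrix by an (in x out) matrix."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)]
            for row in x]

def data_parallel(x, w, num_devices):
    """Every 'device' holds all of w and processes one slice of the batch."""
    shard = (len(x) + num_devices - 1) // num_devices  # rows per device
    outputs = []
    for d in range(num_devices):
        x_shard = x[d * shard:(d + 1) * shard]
        outputs.extend(matmul(x_shard, w))  # results are stacked by row
    return outputs

def tensor_parallel(x, w, num_devices):
    """Every 'device' holds one column slice of w and sees the whole batch."""
    n_out = len(w[0])
    shard = (n_out + num_devices - 1) // num_devices  # columns per device
    partials = []
    for d in range(num_devices):
        w_shard = [row[d * shard:(d + 1) * shard] for row in w]
        partials.append(matmul(x, w_shard))
    # Concatenate the column blocks; in a real multi-GPU system this step
    # would be a collective such as an all-gather.
    return [sum((p[i] for p in partials), []) for i in range(len(x))]

x = [[1, 2], [3, 4], [5, 6], [7, 8]]   # batch of 4 inputs, 2 features
w = [[1, 0, 2, 1], [0, 1, 1, 3]]       # 2 input features, 4 output features
reference = matmul(x, w)
assert data_parallel(x, w, num_devices=2) == reference
assert tensor_parallel(x, w, num_devices=2) == reference
```

Both strategies reproduce the single-device result; they differ in what each device must store. Data parallelism needs the whole model to fit on every GPU, whereas tensor parallelism stores only a slice of the weights per GPU at the cost of communication to reassemble the output.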