Welcome to the torchao Documentation
====================================

PyTorch-Native Training-to-Serving Model Optimization
-----------------------------------------------------

- Pre-train Llama-3.1-70B **1.5x faster** with float8 training
- Recover **67% of quantized accuracy degradation** on Gemma3-4B with QAT
- Quantize Llama-3-8B to int4 for **1.89x faster** inference with **58% less memory**

`torchao `__ is a library for custom data types and optimizations. Quantize and sparsify weights, gradients, optimizers, and activations for inference and training using native PyTorch.

Please check out the torchao `README `__ for an overall introduction to the library and its recent highlights and updates.

Quick Start
-----------

First, install TorchAO. We recommend installing the latest stable version:

.. code:: bash

    pip install torchao

Quantize your model weights to int4!

.. code:: python

    import torch
    from torchao.quantization import Int4WeightOnlyConfig, quantize_

    # `model` is assumed to be an existing torch.nn.Module; quantize_ modifies it in place.
    if torch.cuda.is_available():
        # quantize on CUDA
        quantize_(
            model,
            Int4WeightOnlyConfig(
                group_size=32,
                int4_packing_format="tile_packed_to_4d",
                int4_choose_qparams_algorithm="hqq",
            ),
        )
    elif torch.xpu.is_available():
        # quantize on XPU
        quantize_(
            model,
            Int4WeightOnlyConfig(group_size=32, int4_packing_format="plain_int32"),
        )

See our `first quantization example `__ for more details.

Installation
------------

To install the latest stable version:

.. code:: bash

    pip install torchao

Other installation options:

.. code:: bash

    # Nightly
    pip install --pre torchao --index-url https://download.pytorch.org/whl/nightly/cu128

    # Different CUDA versions
    pip install torchao --index-url https://download.pytorch.org/whl/cu126  # CUDA 12.6
    pip install torchao --index-url https://download.pytorch.org/whl/cu129  # CUDA 12.9
    pip install torchao --index-url https://download.pytorch.org/whl/xpu    # XPU
    pip install torchao --index-url https://download.pytorch.org/whl/cpu    # CPU only

    # For developers
    # Note: the --no-build-isolation flag is required.
    USE_CUDA=1 pip install -e . --no-build-isolation
    USE_XPU=1 pip install -e . --no-build-isolation
    USE_CPP=0 pip install -e . --no-build-isolation

Please see the `torchao compatibility table `__ for dependency version requirements.

.. toctree::
   :glob:
   :maxdepth: 1
   :caption: Workflows

   workflows/index

.. toctree::
   :glob:
   :maxdepth: 1
   :caption: API Reference

   api_reference/index

.. toctree::
   :glob:
   :maxdepth: 1
   :caption: Tutorials

   eager_tutorials/index

.. toctree::
   :glob:
   :maxdepth: 1
   :caption: Contributing

   contributing/index

.. toctree::
   :glob:
   :maxdepth: 1
   :caption: PT2E Quantization

   pt2e_quantization/index
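
After running one of the install commands from the Installation section, it can help to confirm which torchao build ended up in the active environment. The following is a minimal sketch that uses only the Python standard library. The helper name ``installed_version`` is illustrative and is not part of torchao.

.. code:: python

    from importlib import metadata


    def installed_version(package="torchao"):
        """Return the installed version string of `package`, or None if it is absent."""
        try:
            return metadata.version(package)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            return None


    version = installed_version()
    if version is None:
        print("torchao is not installed in this environment")
    else:
        print(f"torchao {version} is installed")

The reported version can then be compared against the compatibility table mentioned above to check that it matches the installed PyTorch release.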