Quantization

Created On: Oct 09, 2019 | Last Updated On: May 11, 2026

We are centralizing all quantization-related development in torchao. Please check out the new doc page: https://docs.pytorch.org/ao/stable/index.html

Plan for the existing quantization flows:

Eager mode quantization (torch.ao.quantization.quantize, torch.ao.quantization.quantize_dynamic): please migrate to the torchao eager mode quantize_ API.
FX graph mode quantization (torch.ao.quantization.quantize_fx.prepare_fx, torch.ao.quantization.quantize_fx.convert_fx): please migrate to the torchao pt2e quantization API (torchao.quantization.pt2e.quantize_pt2e.prepare_pt2e, torchao.quantization.pt2e.quantize_pt2e.convert_pt2e).
pt2e quantization has already been migrated to torchao (pytorch/ao); see pytorch/ao#2259 for more details.

We plan to delete torch.ao.quantization in 2.10 if there are no blockers, or otherwise in the earliest PyTorch version released after all blockers are cleared.

Quantization API Reference (Kept since APIs are still public)

The Quantization API Reference contains documentation of quantization APIs, such as quantization passes, quantized tensor operations, and supported quantized modules and functions.

torch.ao.ns.fx.utils.compute_sqnr(x, y)

torch.ao.ns.fx.utils.compute_normalized_l2_error(x, y)

torch.ao.ns.fx.utils.compute_cosine_similarity(x, y)
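
The three utilities listed above each compare a reference tensor x against a second tensor y. The following is a minimal sketch of how they might be used to check how closely a quantized result tracks its floating-point reference. The hand-rolled int8 rounding below is only an illustrative stand-in for a real quantized model's output, and the tensor shapes, seed, and variable names are assumptions made for this example, not part of the API.

```python
import torch
from torch.ao.ns.fx.utils import (
    compute_cosine_similarity,
    compute_normalized_l2_error,
    compute_sqnr,
)

torch.manual_seed(0)

# Reference tensor, standing in for the output of a floating-point model.
x = torch.randn(4, 16)

# Simulate symmetric int8 quantization by hand (illustrative assumption):
# scale to the int8 range, round, clamp, then scale back to float.
scale = x.abs().max() / 127.0
y = torch.clamp(torch.round(x / scale), -128, 127) * scale

# Compare the reference (x) against the quantize-dequantize result (y).
sqnr = compute_sqnr(x, y)                   # higher means closer to x
l2_err = compute_normalized_l2_error(x, y)  # lower means closer to x
cos_sim = compute_cosine_similarity(x, y)   # near 1 means same direction

print(f"SQNR: {sqnr.item():.2f}")
print(f"normalized L2 error: {l2_err.item():.5f}")
print(f"cosine similarity: {cos_sim.item():.6f}")
```

In practice, x and y would typically be matching weights or activations taken from a floating-point model and its quantized counterpart, so that a drop in SQNR or cosine similarity points at the layers most affected by quantization.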
