KernelPreference#
- class torchao.quantization.quantize_.common.KernelPreference(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]#
Enum for specifying the group of kernels used for quantization, matrix multiplication, or other compute ops on a quantized tensor.
Examples of how each option affects the selected kernels can be found in the tensor subclass implementations under torchao/quantization/quantize_/workflows
- AUTO = 'auto'#
Let torchao choose the kernels automatically, based on hardware and available libraries
- TORCH = 'torch'#
Use torch native quantize and quantized mm kernels
- FBGEMM = 'fbgemm'#
Use quantize and quantized mm kernels from the fbgemm_gpu_genai library; requires fbgemm_gpu_genai to be installed
- EMULATED = 'emulated'#
Emulates gemm_lowp(A, B) with gemm_fp32(A.dequantize(), B.dequantize()). Intended use cases are:
1. Running CI for product logic on hardware which does not support the actual lowp gemm.
2. Debugging kernel numerics issues.
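For illustration, a minimal standalone sketch of how such a preference enum is typically consumed. The enum below mirrors the documented string values but is defined locally rather than imported from torchao, and `select_kernel_group` is a hypothetical dispatch helper, not part of the torchao API:

```python
from enum import Enum

# Local mirror of the documented values (not imported from torchao,
# so the sketch runs without the library installed).
class KernelPreference(Enum):
    AUTO = "auto"      # let the library choose the kernels
    TORCH = "torch"    # torch native quantize / quantized mm kernels
    FBGEMM = "fbgemm"  # kernels from the fbgemm_gpu_genai library

def select_kernel_group(pref: KernelPreference) -> str:
    """Hypothetical dispatch: map a preference to a kernel group name."""
    if pref is KernelPreference.AUTO:
        # A real implementation would inspect the hardware and
        # installed libraries here before picking a kernel group.
        return "best-available"
    return pref.value

# Enum members can also be recovered from their string value,
# e.g. when a preference comes from a config file.
print(select_kernel_group(KernelPreference("torch")))  # -> torch
print(select_kernel_group(KernelPreference.AUTO))      # -> best-available
```

A workflow would typically carry one `KernelPreference` value in its quantization config and consult it once at dispatch time, so all compute ops for a given tensor use a consistent kernel group.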