KernelPreference#
- class torchao.quantization.quantize_.common.KernelPreference(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]#
Enum for specifying the group of kernels used for quantization, matrix multiplication, or other compute ops on a quantized tensor.
Examples of how each option affects the selected kernels can be found in the tensor subclass implementations under torchao/quantization/quantize_/workflows
- AUTO = 'auto'#
Let torchao choose the kernels automatically, based on hardware and available libraries
- TORCH = 'torch'#
Use torch native quantize and quantized mm kernels
- FBGEMM = 'fbgemm'#
Use quantize and quantized mm kernels from the fbgemm_gpu_genai library; requires fbgemm_gpu_genai to be installed
- EMULATED = 'emulated'#
Emulates gemm_lowp(A, B) with gemm_fp32(A.dequantize(), B.dequantize()). Intended use cases are:
1. Running CI for product logic on hardware which does not support the actual lowp gemm.
2. Debugging kernel numerics issues.
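For illustration, a minimal standalone sketch of how such a preference enum is typically consumed. The enum below mirrors the documented string values but is defined locally rather than imported from torchao, and `select_kernel_group` is a hypothetical dispatch helper, not part of the torchao API:

```python
from enum import Enum

# Local mirror of the documented values (not imported from torchao,
# so the sketch runs without the library installed).
class KernelPreference(Enum):
    AUTO = "auto"      # let the library choose the kernels
    TORCH = "torch"    # torch native quantize / quantized mm kernels
    FBGEMM = "fbgemm"  # kernels from the fbgemm_gpu_genai library

def select_kernel_group(pref: KernelPreference) -> str:
    """Hypothetical dispatch: map a preference to a kernel group name."""
    if pref is KernelPreference.AUTO:
        # A real implementation would inspect the hardware and
        # installed libraries here before picking a kernel group.
        return "best-available"
    return pref.value

# Enum members can also be recovered from their string value,
# e.g. when a preference comes from a config file.
print(select_kernel_group(KernelPreference("torch")))  # -> torch
print(select_kernel_group(KernelPreference.AUTO))      # -> best-available
```

A workflow would typically carry one `KernelPreference` value in its quantization config and consult it once at dispatch time, so all compute ops for a given tensor use a consistent kernel group.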