
fake_quantize_affine_cachemask

torchao.quantization.fake_quantize_affine_cachemask(input: Tensor, block_size: Tuple[int, ...], scale: Tensor, zero_point: Optional[Tensor], quant_dtype: dtype, quant_min: Optional[Union[int, float]] = None, quant_max: Optional[Union[int, float]] = None, zero_point_domain: ZeroPointDomain = ZeroPointDomain.INT) → Tuple[Tensor, Tensor]

General fake quantize op for quantization-aware training (QAT). This is equivalent to calling quantize_affine() followed by dequantize_affine(), but without the dtype casts, so the values stay in the input's dtype throughout.

Note: Compared to fake_quantize_affine(), this consumes more memory and returns an additional outlier mask for intermediate quantized values.

Parameters:

Same as fake_quantize_affine().

Returns:

A 2-tuple of (final fake quantized values, outlier mask for intermediate quantized values).
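Example:

The following is a minimal sketch of the idea in plain PyTorch, not torchao's implementation. It covers only the per-tensor case with an integer zero point and ignores block_size, quant_dtype, and zero_point_domain. The helper name fake_quantize_per_tensor_sketch is hypothetical, and the mask polarity (True where the intermediate quantized value already lies inside [quant_min, quant_max]) is an assumption made for illustration.

```python
import torch


def fake_quantize_per_tensor_sketch(x, scale, zero_point, quant_min, quant_max):
    """Hypothetical helper: per-tensor affine fake quantization without dtype casts."""
    # Quantize step: scale, round, and shift by the zero point, staying in x's dtype.
    q_unclamped = torch.round(x / scale) + zero_point
    # Assumed mask convention: True where the intermediate value is within range.
    mask = (q_unclamped >= quant_min) & (q_unclamped <= quant_max)
    # Clamp to the representable range, still without casting to an integer dtype.
    q = torch.clamp(q_unclamped, quant_min, quant_max)
    # Dequantize step: undo the shift and the scaling.
    dq = (q - zero_point) * scale
    return dq, mask


x = torch.tensor([-40.0, -1.0, 0.1, 0.6, 10.0, 40.0])
dq, mask = fake_quantize_per_tensor_sketch(
    x, scale=0.25, zero_point=0, quant_min=-128, quant_max=127
)
print(dq)    # fake quantized values, same dtype as x
print(mask)  # False for the two inputs that fall outside the int8 range
```

In this sketch the first and last inputs quantize to values outside [-128, 127], so they are clamped and flagged in the mask. Returning the mask alongside the values is what costs the extra memory mentioned in the note above.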
