choose_qparams_affine

torchao.quantization.choose_qparams_affine(input: Tensor, mapping_type: MappingType, block_size: Tuple[int, ...], target_dtype: dtype, quant_min: Optional[Union[int, float]] = None, quant_max: Optional[Union[int, float]] = None, eps: Optional[float] = None, scale_dtype: Optional[dtype] = None, zero_point_dtype: Optional[dtype] = None, preserve_zero: bool = True, zero_point_domain: ZeroPointDomain = ZeroPointDomain.INT) → Tuple[Tensor, Tensor]
Parameters:
  • input (torch.Tensor) – input Tensor in fp32, bf16, or fp16

  • mapping_type (MappingType) – determines how the qparams are calculated, symmetric or asymmetric

  • block_size (Tuple[int, ...]) – granularity of quantization: the size of the block of tensor elements that share the same qparams. For example, when block_size is the same as the input tensor's shape, we are using per-tensor quantization

  • target_dtype (torch.dtype) – dtype for target quantized Tensor

  • quant_min (Optional[Union[int, float]]) – minimum quantized value for target quantized Tensor

  • quant_max (Optional[Union[int, float]]) – maximum quantized value for target quantized Tensor

  • eps (Optional[float]) – minimum scale; if not provided, defaults to the eps of input.dtype

  • scale_dtype (torch.dtype) – dtype for scale Tensor

  • zero_point_dtype (torch.dtype) – dtype for zero_point Tensor

  • preserve_zero (bool) –

    a flag indicating whether zero must be exactly representable. This is typically required for ops that need zero padding, such as convolution; it is less important for ops that have no zero padding in the op itself, such as linear.

    For example, given a floating point Tensor [1.2, 0.1, 3.0, 4.0, 0.4, 0], if preserve_zero is True, we make sure there is an integer value corresponding to the floating point 0, e.g. [-3, -8, 3, 7, -7, -8], where 0 is mapped to -8 without loss. If preserve_zero is not True, there is no such guarantee.

    If we don’t need zero to be exactly representable, we won’t do rounding and clamping for zero_point

  • zero_point_domain (ZeroPointDomain) – the domain that zero_point is in, either integer or float. If zero_point is in the integer domain, the zero point is added to the quantized integer value during quantization. If zero_point is in the floating point domain, the zero point is subtracted from the floating point (unquantized) value during quantization. The default is ZeroPointDomain.INT
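
The role of preserve_zero and an integer-domain zero point can be illustrated with a small pure-Python sketch for a single block of values. This is not torchao's implementation: the helper names (asymmetric_qparams, quantize, dequantize), the default eps value, and the exact rounding behaviour are assumptions made for illustration, so individual codes may differ from the example values listed above. It uses a 4-bit signed range of [-8, 7] and shows that the floating point 0 gets its own integer code and round-trips without error.

```python
from typing import List, Tuple


def asymmetric_qparams(
    values: List[float], quant_min: int, quant_max: int, eps: float = 1e-9
) -> Tuple[float, int]:
    """Hypothetical helper: one (scale, zero_point) pair for a single block."""
    # Mimic preserve_zero=True: widen the observed range so it contains 0.0.
    min_neg = min(min(values), 0.0)
    max_pos = max(max(values), 0.0)
    # eps is a lower bound on the scale (the default here is an assumption).
    scale = max((max_pos - min_neg) / (quant_max - quant_min), eps)
    # Integer-domain zero point: the integer code that represents real 0.0.
    zero_point = quant_min - round(min_neg / scale)
    zero_point = max(quant_min, min(quant_max, zero_point))
    return scale, zero_point


def quantize(x: float, scale: float, zero_point: int, quant_min: int, quant_max: int) -> int:
    # Integer domain: the zero point is added after scaling and rounding.
    q = round(x / scale) + zero_point
    return max(quant_min, min(quant_max, q))


def dequantize(q: int, scale: float, zero_point: int) -> float:
    # Inverse affine map back to the floating point domain.
    return (q - zero_point) * scale


values = [1.2, 0.1, 3.0, 4.0, 0.4, 0.0]
scale, zero_point = asymmetric_qparams(values, quant_min=-8, quant_max=7)
codes = [quantize(v, scale, zero_point, -8, 7) for v in values]
zero_roundtrip = dequantize(codes[-1], scale, zero_point)
```

Because the range is widened to include 0 before the scale is computed, the zero point lands on an integer inside [quant_min, quant_max], which is what makes the round trip of 0 exact in this sketch.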

Output:

Tuple of scale and zero_point Tensors with the requested dtypes
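
How many scale and zero_point entries come back depends on block_size, since each block shares one pair of qparams. The sketch below counts blocks along each dimension for a few common granularities. The helper blocks_per_dim is hypothetical, it assumes every block size divides its dimension evenly, and the shape of the tensors actually returned by torchao may be laid out differently (for example with size-1 dimensions removed).

```python
from math import prod
from typing import Tuple


def blocks_per_dim(input_shape: Tuple[int, ...], block_size: Tuple[int, ...]) -> Tuple[int, ...]:
    """Hypothetical helper: how many blocks tile the input along each dimension."""
    if len(input_shape) != len(block_size):
        raise ValueError("block_size needs one entry per input dimension")
    for size, block in zip(input_shape, block_size):
        # Assumption of this sketch: blocks tile each dimension exactly.
        if block <= 0 or size % block != 0:
            raise ValueError("each block size must divide its dimension evenly")
    return tuple(size // block for size, block in zip(input_shape, block_size))


shape = (4, 128)
per_tensor = blocks_per_dim(shape, (4, 128))  # one block covers the whole tensor
per_row = blocks_per_dim(shape, (1, 128))     # one block per row
per_group = blocks_per_dim(shape, (1, 32))    # four groups of 32 in each row

# Each block shares one (scale, zero_point) pair.
num_qparams = {
    "per_tensor": prod(per_tensor),
    "per_row": prod(per_row),
    "per_group": prod(per_group),
}
```

Smaller blocks give more qparams and a finer fit to the local value range, at the cost of storing more scales and zero points.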
