choose_qparams_affine

torchao.quantization.choose_qparams_affine(input: Tensor, mapping_type: MappingType, block_size: Tuple[int], target_dtype: dtype, quant_min: Optional[Union[int, float]] = None, quant_max: Optional[Union[int, float]] = None, eps: Optional[float] = None, scale_dtype: Optional[dtype] = None, zero_point_dtype: Optional[dtype] = torch.int32) → Tuple[Tensor, Tensor]
Parameters:
  • input (torch.Tensor) – input Tensor in fp32, bf16, or fp16

  • mapping_type (MappingType) – determines how the qparams are calculated: symmetric or asymmetric

  • block_size (Tuple[int]) – granularity of quantization: the size of the block of tensor elements that share the same qparams. For example, when block_size equals the shape of the input tensor, quantization is per tensor

  • target_dtype (torch.dtype) – dtype for target quantized Tensor

  • quant_min (Optional[Union[int, float]]) – minimum quantized value for target quantized Tensor

  • quant_max (Optional[Union[int, float]]) – maximum quantized value for target quantized Tensor

  • eps (Optional[float]) – minimum scale; if not provided, defaults to the eps of input.dtype

  • scale_dtype (torch.dtype) – dtype for scale Tensor

  • zero_point_dtype (torch.dtype) – dtype for zero_point Tensor, defaults to torch.int32

  • Now removed params:

      • zero_point_domain (ZeroPointDomain) – the domain that zero_point is in, defaults to Integer or None

      • preserve_zero (bool) – whether to preserve zero in the quantized Tensor, defaults to True
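
The block_size description above can be made concrete by counting how many blocks an input is split into, since each block shares one scale and one zero_point. The following is a minimal pure-Python sketch, not torchao code: the helper name qparam_grid_shape is hypothetical, and the requirement that each block size evenly divides its dimension is an assumption of this sketch. The exact shape of the tensors that choose_qparams_affine returns may differ (for example, size-1 dimensions may be dropped).

```python
from typing import Tuple


def qparam_grid_shape(
    input_shape: Tuple[int, ...], block_size: Tuple[int, ...]
) -> Tuple[int, ...]:
    """Hypothetical helper: number of blocks along each dimension.

    Each block shares one (scale, zero_point) pair, so the product of the
    returned tuple is the number of qparam pairs needed.
    """
    if len(input_shape) != len(block_size):
        raise ValueError("block_size needs one entry per input dimension")
    for dim, blk in zip(input_shape, block_size):
        # Assumption of this sketch: blocks tile the tensor exactly.
        if blk <= 0 or dim % blk != 0:
            raise ValueError(f"block size {blk} must evenly divide dimension {dim}")
    return tuple(dim // blk for dim, blk in zip(input_shape, block_size))


shape = (4, 64)
per_tensor = qparam_grid_shape(shape, (4, 64))  # one block covers everything
per_row = qparam_grid_shape(shape, (1, 64))     # one block per row
per_group = qparam_grid_shape(shape, (1, 32))   # two groups of 32 in each row
print(per_tensor, per_row, per_group)
```

Read this way, block_size equal to the input shape gives a single qparam pair (per tensor), (1, 64) gives one pair per row, and (1, 32) gives two pairs per row.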

Output:

Tuple of scale and zero_point Tensors with the requested dtypes
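
To illustrate what a scale and zero_point pair represents, here is a minimal pure-Python sketch of a common affine qparam formulation for a single block of values. It is an illustration under stated assumptions, not the torchao implementation: the function name affine_qparams, the fixed eps default, and the rounding and clamping details are all choices of this sketch, and the library may differ in any of them.

```python
from typing import Sequence, Tuple


def affine_qparams(
    block: Sequence[float],
    quant_min: int,
    quant_max: int,
    symmetric: bool = False,
    eps: float = 1e-9,  # assumed floor for the scale; not the library default
) -> Tuple[float, int]:
    """Hypothetical helper: (scale, zero_point) for one block of values."""
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(block), 0.0)
    hi = max(max(block), 0.0)
    if symmetric:
        # Symmetric: the range is centred on zero, so zero_point is the
        # midpoint of the quantized range.
        max_abs = max(-lo, hi)
        scale = max(2 * max_abs / (quant_max - quant_min), eps)
        zero_point = (quant_max + quant_min + 1) // 2
    else:
        # Asymmetric: map [lo, hi] onto [quant_min, quant_max].
        scale = max((hi - lo) / (quant_max - quant_min), eps)
        zero_point = quant_min - round(lo / scale)
        zero_point = max(quant_min, min(quant_max, zero_point))
    return scale, zero_point


values = [-1.0, 0.0, 0.5, 3.0]
asym_scale, asym_zp = affine_qparams(values, 0, 255)                    # uint8 range
sym_scale, sym_zp = affine_qparams(values, -128, 127, symmetric=True)  # int8 range
print(asym_scale, asym_zp)
print(sym_scale, sym_zp)
```

In this sketch eps plays the role described for the eps parameter: it keeps the scale from collapsing to zero when every value in a block is identical.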
