choose_qparams_affine
- torchao.quantization.choose_qparams_affine(input: Tensor, mapping_type: MappingType, block_size: Tuple[int], target_dtype: dtype, quant_min: Optional[Union[int, float]] = None, quant_max: Optional[Union[int, float]] = None, eps: Optional[float] = None, scale_dtype: Optional[dtype] = None, zero_point_dtype: Optional[dtype] = torch.int32) → Tuple[Tensor, Tensor]
- Parameters:
input (torch.Tensor) – input Tensor in fp32, bf16, or fp16
mapping_type (MappingType) – determines how the qparams are calculated: symmetric or asymmetric
block_size (Tuple[int]) – granularity of quantization: the size of the block of tensor elements that share the same qparams. For example, when block_size equals the shape of the input tensor, this is per-tensor quantization
target_dtype (torch.dtype) – dtype for target quantized Tensor
quant_min (Optional[Union[int, float]]) – minimum quantized value for target quantized Tensor
quant_max (Optional[Union[int, float]]) – maximum quantized value for target quantized Tensor
eps (Optional[float]) – minimum scale; if not provided, defaults to the eps of input.dtype
scale_dtype (torch.dtype) – dtype for scale Tensor
zero_point_dtype (torch.dtype) – dtype for zero_point Tensor, defaults to torch.int32
Now removed params:
zero_point_domain (ZeroPointDomain) – the domain that zero_point is in, defaults to Integer or None
preserve_zero (bool) – whether to preserve zero in the quantized Tensor, defaults to True
- Output:
Tuple of scale and zero_point Tensors with the requested dtypes
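- Example (illustrative sketch):
The following is a minimal pure-Python sketch of the idea behind the parameters above. It is not torchao's implementation and does not call the library. The helper names `sketch_split_blocks` and `sketch_qparams` are hypothetical, the input is a plain 2-D list rather than a Tensor, each dimension is assumed to be divisible by its block size, and the value range is assumed to always include zero so that zero stays exactly representable. It shows how block_size groups elements that share one (scale, zero_point) pair, and one common way to derive that pair for symmetric and asymmetric mapping.

```python
from typing import List, Tuple


def sketch_split_blocks(
    matrix: List[List[float]], block_size: Tuple[int, int]
) -> List[List[float]]:
    """Flatten each (block_rows x block_cols) tile of a 2-D list into one group.

    Hypothetical helper; assumes every dimension is divisible by its block size.
    """
    block_rows, block_cols = block_size
    groups = []
    for r0 in range(0, len(matrix), block_rows):
        for c0 in range(0, len(matrix[0]), block_cols):
            groups.append([
                matrix[r][c]
                for r in range(r0, r0 + block_rows)
                for c in range(c0, c0 + block_cols)
            ])
    return groups


def sketch_qparams(
    block: List[float],
    quant_min: int,
    quant_max: int,
    symmetric: bool,
    eps: float = 1e-12,
) -> Tuple[float, int]:
    """Illustrative (scale, zero_point) for ONE block of values."""
    # Assumption: extend the range to include 0.0 so zero maps to an integer.
    lo = min(min(block), 0.0)
    hi = max(max(block), 0.0)
    if symmetric:
        # One scale covers [-max_abs, max_abs]; zero_point sits mid-range.
        max_abs = max(-lo, hi)
        scale = max(max_abs / ((quant_max - quant_min) / 2), eps)
        zero_point = (quant_max + quant_min + 1) // 2
    else:
        # Map [lo, hi] onto [quant_min, quant_max].
        scale = max((hi - lo) / (quant_max - quant_min), eps)
        zero_point = quant_min - round(lo / scale)
        zero_point = max(quant_min, min(quant_max, zero_point))
    return scale, zero_point


weights = [[-1.0, 0.0, 1.0, 3.0],
           [0.5, 1.0, 1.5, 2.0]]

# block_size == (1, 4): one (scale, zero_point) pair per row.
per_row = [
    sketch_qparams(g, 0, 255, symmetric=False)
    for g in sketch_split_blocks(weights, (1, 4))
]

# block_size == input shape: a single pair for the whole tensor.
per_tensor = [
    sketch_qparams(g, 0, 255, symmetric=False)
    for g in sketch_split_blocks(weights, (2, 4))
]
```

In this sketch, eps plays the role of the minimum scale described above: it keeps the scale from collapsing to zero when a block contains only zeros.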