dequantize_affine#

torchao.quantization.dequantize_affine(input: Tensor, block_size: Tuple[int, ...], scale: Tensor, zero_point: Optional[Tensor], input_dtype: dtype, quant_min: Optional[Union[int, float]] = None, quant_max: Optional[Union[int, float]] = None, *, output_dtype: dtype = torch.float32) → Tensor[source]#
Parameters
  • input (torch.Tensor) – quantized Tensor; its dtype should match the input_dtype argument

  • block_size (Tuple[int, ...]) – granularity of quantization, i.e. the shape of the block of tensor elements that share the same quantization parameters; e.g. when block_size equals the input tensor’s shape, this is per-tensor quantization

  • scale (Tensor) – quantization parameter for affine quantization

  • zero_point (Tensor) – quantization parameter for affine quantization

  • input_dtype (torch.dtype) – dtype of the quantized input Tensor (e.g. torch.uint8)

  • quant_min (Optional[int]) – minimum quantized value for input Tensor

  • quant_max (Optional[int]) – maximum quantized value for input Tensor

  • output_dtype (torch.dtype) – dtype for output Tensor, default is fp32

  • Note – by default zero_point is in the integer domain, i.e. the zero point is added to the quantized integer value during quantization

Output:

dequantized Tensor, with the requested output_dtype (fp32 by default)
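To illustrate the semantics above, here is a minimal pure-Python sketch of affine dequantization (not torchao’s actual implementation, which operates on torch.Tensors): each quantized value has its zero point subtracted in the integer domain and is then scaled, with one (scale, zero_point) pair per block of block_size elements. The function names `dequantize_affine_ref` and `dequantize_blockwise` are illustrative, not part of the torchao API.

```python
def dequantize_affine_ref(q, scale, zero_point):
    # Per-tensor case: dequant = (quantized - zero_point) * scale,
    # with zero_point in the integer domain.
    return [(x - zero_point) * scale for x in q]

def dequantize_blockwise(q, block_size, scales, zero_points):
    # Per-block case: one (scale, zero_point) pair per contiguous
    # block of block_size elements (1-D analogue of torchao's block_size).
    out = []
    for i, x in enumerate(q):
        b = i // block_size
        out.append((x - zero_points[b]) * scales[b])
    return out

# uint8 values quantized with scale=0.5, zero_point=128
print(dequantize_affine_ref([128, 130, 126], 0.5, 128))
# [0.0, 1.0, -1.0]

# two blocks of 2 elements, each with its own qparams
print(dequantize_blockwise([128, 130, 0, 2], 2, [0.5, 0.25], [128, 0]))
# [0.0, 1.0, 0.0, 0.5]
```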