dequantize_affine

torchao.quantization.dequantize_affine(input: Tensor, block_size: Tuple[int, ...], scale: Tensor, zero_point: Optional[Tensor], input_dtype: dtype, quant_min: Optional[Union[int, float]] = None, quant_max: Optional[Union[int, float]] = None, *, output_dtype: dtype = torch.float32) → Tensor

Parameters:

input (torch.Tensor) – quantized tensor; its dtype should match the input_dtype argument
block_size (Tuple[int, ...]) – granularity of quantization: the size of the block of tensor elements that share the same quantization parameters (qparams). For example, when block_size equals the shape of the input tensor, quantization is per tensor
scale (Tensor) – quantization parameter for affine quantization
zero_point (Optional[Tensor]) – quantization parameter for affine quantization
input_dtype (torch.dtype) – dtype of the quantized input Tensor (e.g. torch.uint8)
quant_min (Optional[int]) – minimum quantized value for input Tensor
quant_max (Optional[int]) – maximum quantized value for input Tensor
output_dtype (torch.dtype) – dtype for output Tensor, default is fp32
The default zero_point is in the integer domain: the zero point is added to the quantized integer value during quantization.
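
To make the roles of block_size, scale, and zero_point concrete, here is a minimal plain-Python sketch of the arithmetic for a 1-D input. It is not torchao's implementation: the helper name `dequantize_affine_1d` is hypothetical, it works on lists rather than tensors, and it assumes that a missing zero_point behaves like zero. Because an integer-domain zero point is added during quantization, the sketch subtracts it before multiplying by the scale.

```python
from typing import List, Optional, Sequence


def dequantize_affine_1d(
    q: Sequence[int],
    block_size: int,
    scale: Sequence[float],
    zero_point: Optional[Sequence[int]] = None,
) -> List[float]:
    """Hypothetical 1-D reference: every run of `block_size` elements
    shares one scale and one integer-domain zero point."""
    if block_size <= 0 or len(q) % block_size != 0:
        raise ValueError("len(q) must be a positive multiple of block_size")
    num_blocks = len(q) // block_size
    if len(scale) != num_blocks:
        raise ValueError("expected one scale per block")
    if zero_point is not None and len(zero_point) != num_blocks:
        raise ValueError("expected one zero_point per block")

    out: List[float] = []
    for i, value in enumerate(q):
        block = i // block_size  # index of the block this element belongs to
        # Assumption: zero_point=None is treated as a zero point of 0.
        zp = zero_point[block] if zero_point is not None else 0
        # Undo quantization: remove the zero point, then rescale.
        out.append((value - zp) * scale[block])
    return out


q = [0, 128, 255, 10, 20, 30]

# Two blocks of three elements, each with its own qparams.
per_block = dequantize_affine_1d(q, block_size=3, scale=[0.5, 2.0], zero_point=[128, 10])

# One block covering the whole input: per-tensor quantization.
per_tensor = dequantize_affine_1d(q, block_size=6, scale=[0.5], zero_point=[128])

print(per_block)
print(per_tensor)
```

With a block size equal to the full input length there is a single scale and zero point, which corresponds to the per-tensor case described for block_size above; smaller blocks give each group of elements its own parameters.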
