dequantize_affine

torchao.quantization.dequantize_affine(input: Tensor, block_size: Tuple[int, ...], scale: Tensor, zero_point: Optional[Tensor], input_dtype: dtype, quant_min: Optional[Union[int, float]] = None, quant_max: Optional[Union[int, float]] = None, *, output_dtype: dtype = torch.float32) → Tensor
Parameters:
  • input (torch.Tensor) – quantized tensor; its dtype should match the input_dtype argument

  • block_size (Tuple[int, ...]) – granularity of quantization: the size of the block of tensor elements that share the same quantization parameters, e.g. when block_size is the same as the input tensor shape, we are using per-tensor quantization

  • scale (Tensor) – quantization parameter for affine quantization

  • zero_point (Tensor) – quantization parameter for affine quantization

  • input_dtype (torch.dtype) – dtype of the quantized input Tensor (e.g. torch.uint8)

  • quant_min (Optional[int]) – minimum quantized value for input Tensor

  • quant_max (Optional[int]) – maximum quantized value for input Tensor

  • output_dtype (torch.dtype) – dtype for output Tensor, default is fp32

Note: by default, zero_point is in the integer domain, i.e. the zero point is added to the quantized integer value during quantization.

Output:

dequantized Tensor with the requested output_dtype (default torch.float32)
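The per-tensor case of the affine scheme can be sketched in plain Python: dequantization subtracts the integer-domain zero point and multiplies by the scale. This is a minimal illustration of the semantics, not the torchao implementation; the helper name and values are hypothetical.

```python
# Per-tensor affine dequantization sketch (hypothetical helper, not the
# torchao implementation). zero_point is in the integer domain: it was
# added during quantization, so it is subtracted here before scaling.
def dequantize_per_tensor(q_values, scale, zero_point):
    return [(q - zero_point) * scale for q in q_values]

# uint8-style quantized values with scale=0.1, zero_point=128
quantized = [128, 138, 118]
print(dequantize_per_tensor(quantized, 0.1, 128))  # values near 0.0, 1.0, -1.0
```

With a block_size equal to the full input shape, a single (scale, zero_point) pair covers every element, as in the per-tensor case above; smaller block sizes assign one such pair to each block.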
