MarlinQQQTensor

class torchao.dtypes.MarlinQQQTensor(tensor_impl: AQTTensorImpl, block_size: Tuple[int, ...], shape: Size, quant_min: Optional[Union[int, float]] = None, quant_max: Optional[Union[int, float]] = None, zero_point_domain: ZeroPointDomain = ZeroPointDomain.INT, dtype=None, strides=None)

Marlin QQQ quantized tensor subclass that inherits from the AffineQuantizedTensor class.

For details on how quantization parameters are chosen and how quantization and dequantization work for Marlin QQQ, see https://github.com/pytorch/ao/blob/main/torchao/quantization/quant_primitives.py, in particular the two quant primitive ops choose_qparams_and_quantize_affine_qqq and dequantize_affine_qqq.

dequantize() -> Tensor

Given a quantized Tensor, dequantize it and return the dequantized float Tensor.
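As a rough illustration of what dequantization means for an affine scheme with an integer zero point, the sketch below maps integer codes back to floats with one scale and zero point per group. It is plain Python rather than torchao code: the helper name dequantize_affine_sketch, the flat-list layout, and the group_size parameter are assumptions made for this example, and it does not reproduce the packed Marlin QQQ layout or the actual dequantize_affine_qqq implementation.

```python
from typing import List


def dequantize_affine_sketch(
    codes: List[int],
    scales: List[float],
    zero_points: List[int],
    group_size: int,
) -> List[float]:
    """Hypothetical helper: float = (code - zero_point) * scale, per group."""
    restored = []
    for index, code in enumerate(codes):
        group = index // group_size  # which scale/zero point applies to this element
        restored.append((code - zero_points[group]) * scales[group])
    return restored


# Six 4-bit style codes split into two groups of three elements each.
codes = [-8, 0, 7, 3, -2, 1]
restored = dequantize_affine_sketch(
    codes, scales=[0.5, 0.25], zero_points=[0, 0], group_size=3
)
print(restored)  # [-4.0, 0.0, 3.5, 0.75, -0.5, 0.25]
```

With a zero point of 0 in every group the mapping is symmetric, so each value is just its code multiplied by the group's scale.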

classmethod from_hp_to_intx(input_float: Tensor, block_size: Tuple[int, ...], quant_min: Optional[int] = None, quant_max: Optional[int] = None, zero_point_domain: ZeroPointDomain = ZeroPointDomain.INT, _layout: Optional[Layout] = None)

Converts a floating point tensor to a Marlin QQQ quantized tensor.
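To give a feel for how block_size, quant_min, and quant_max interact, here is a minimal sketch of block-wise symmetric quantization on a single row of values. It assumes that each block of block_len consecutive elements shares one scale, which is how a block size is commonly interpreted in affine quantization. The function quantize_symmetric_sketch and its parameters are hypothetical names for this example, not torchao APIs, and the real choose_qparams_and_quantize_affine_qqq op may choose its parameters differently.

```python
from typing import List, Tuple


def quantize_symmetric_sketch(
    values: List[float],
    block_len: int,
    quant_min: int = -8,
    quant_max: int = 7,
) -> Tuple[List[int], List[float]]:
    """Hypothetical helper: one scale per block, zero point fixed at 0."""
    codes: List[int] = []
    scales: List[float] = []
    for start in range(0, len(values), block_len):
        block = values[start:start + block_len]
        max_abs = max(abs(v) for v in block)
        # Map the largest magnitude in the block onto quant_max.
        scale = max_abs / quant_max if max_abs > 0 else 1.0
        scales.append(scale)
        for v in block:
            code = round(v / scale)
            # Clamp to the representable integer range.
            codes.append(max(quant_min, min(quant_max, code)))
    return codes, scales


row = [3.5, -1.0, 0.5, 2.0, -1.75, 0.25, 1.0, -0.5]
codes, scales = quantize_symmetric_sketch(row, block_len=4)
print(codes)   # [7, -2, 1, 4, -7, 1, 4, -2]
print(scales)  # [0.5, 0.25]
```

A smaller block gives each scale fewer values to cover, which usually lowers rounding error at the cost of storing more scales.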
