PackingFormat

class torchao.quantization.quantize_.common.PackingFormat(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Packing format for quantized data in Tensor subclasses in torchao. It represents how the values are packed and laid out in the quantized data.

MARLIN_SPARSE = 'marlin_sparse'

Refers to the format used by Marlin kernels. Only supports symmetric quantization.

PLAIN = 'plain'

PRESHUFFLED = 'preshuffled'

Refers to the preshuffled format used by fbgemm kernels.

Unpacked means the sub-byte quantized data is stored as int8.
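A minimal sketch of how such an enum can be used to look up a format by its string value and branch on it. The class below is a local stand-in that mirrors only the three members listed on this page and assumes the real class behaves like a str-valued Enum; it is not the torchao implementation. The describe helper is hypothetical and exists only for illustration.

```python
from enum import Enum


# Local stand-in mirroring only the three members listed on this page.
# Assumption: the real class is a str-valued Enum; with torchao installed,
# it would be imported from torchao.quantization.quantize_.common instead.
class PackingFormat(str, Enum):
    MARLIN_SPARSE = "marlin_sparse"
    PLAIN = "plain"
    PRESHUFFLED = "preshuffled"


def describe(fmt: PackingFormat) -> str:
    """Hypothetical helper: return a short description for a packing format."""
    if fmt is PackingFormat.MARLIN_SPARSE:
        return "Marlin kernel layout (symmetric quantization only)"
    if fmt is PackingFormat.PRESHUFFLED:
        return "preshuffled layout for fbgemm kernels"
    return "plain layout"


# Enum members can be looked up by their string value.
fmt = PackingFormat("preshuffled")
print(fmt.name, "->", describe(fmt))
```

Comparing members with "is" rather than against raw strings keeps a typo in a format name from silently falling through to the default branch, since an unknown value raises ValueError at lookup time.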
