UIntXWeightOnlyConfig¶
- class torchao.quantization.UIntXWeightOnlyConfig(dtype: dtype, group_size: int = 64, pack_dim: int = -1, use_hqq: bool = False, set_inductor_config: bool = True)[source]¶
Configuration for applying uintx weight-only asymmetric per-group quantization to linear layers, using uintx quantization where x is the number of bits specified by dtype.
- Parameters:
dtype – torch.uint1 to torch.uint7 sub-byte dtypes
group_size – controls the granularity of quantization; a smaller group size gives more fine-grained quantization. Defaults to 64
pack_dim – the dimension we use for packing, defaults to -1
use_hqq – whether to use the HQQ algorithm or the default algorithm to quantize the weight
set_inductor_config – if True, adjusts torchinductor settings to recommended values.
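A minimal usage sketch, assuming torchao's `quantize_` API and a CUDA device with bfloat16 support are available; the model and tensor shapes below are illustrative only:

```python
import torch
import torch.nn as nn
from torchao.quantization import quantize_, UIntXWeightOnlyConfig

# Toy model with linear layers to quantize (shapes are illustrative).
model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 1024))
model = model.to(torch.bfloat16).cuda()

# Quantize linear-layer weights to 4-bit unsigned integers (torch.uint4)
# with asymmetric per-group quantization using groups of 64 elements;
# activations remain in bfloat16.
quantize_(model, UIntXWeightOnlyConfig(dtype=torch.uint4, group_size=64))

# Run inference with the quantized weights.
x = torch.randn(16, 1024, dtype=torch.bfloat16, device="cuda")
out = model(x)
```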