UIntXWeightOnlyConfig

class torchao.quantization.UIntXWeightOnlyConfig(dtype: dtype, group_size: int = 64, pack_dim: int = -1, use_hqq: bool = False, set_inductor_config: bool = True)

Configuration for applying uintx weight-only asymmetric per-group quantization to linear layers, where x is the number of bits specified by dtype.

Parameters:
  • dtype – the sub-byte unsigned integer dtype to quantize to, from torch.uint1 to torch.uint7.

  • group_size – the quantization group size, which controls the granularity of quantization; a smaller size is more fine-grained. Defaults to 64.

  • pack_dim – the dimension along which the quantized values are packed. Defaults to -1.

  • use_hqq – whether to quantize the weight with the HQQ algorithm instead of the default algorithm. Defaults to False.

  • set_inductor_config – if True, adjusts torchinductor settings to recommended values. Defaults to True.
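
To make "asymmetric per-group quantization" and the effect of group_size concrete, here is a minimal pure-Python sketch of the idea. It is not torchao's implementation: the helper names (quantize_group, quantize_row, dequantize_row) are hypothetical, the per-group minimum is stored as a floating-point offset for simplicity, and packing (pack_dim) and HQQ (use_hqq) are not modeled.

```python
def quantize_group(values, bits):
    """Asymmetrically quantize one group of floats to unsigned `bits`-bit codes.

    Illustrative helper, not a torchao function.
    """
    if not 1 <= bits <= 7:
        raise ValueError("bits must be between 1 and 7 (uint1 to uint7)")
    qmax = 2 ** bits - 1          # largest code, e.g. 15 for 4 bits
    minimum, maximum = min(values), max(values)
    scale = (maximum - minimum) / qmax
    if scale == 0.0:
        scale = 1.0               # constant group: any scale reproduces it
    # Map [minimum, maximum] onto [0, qmax], then clamp to the valid range.
    codes = [min(qmax, max(0, round((v - minimum) / scale))) for v in values]
    return codes, scale, minimum


def quantize_row(row, bits, group_size):
    """Quantize a row of weights, one (scale, minimum) pair per group."""
    if len(row) % group_size != 0:
        raise ValueError("row length must be divisible by group_size")
    return [
        quantize_group(row[start:start + group_size], bits)
        for start in range(0, len(row), group_size)
    ]


def dequantize_row(groups):
    """Reconstruct approximate floats from the per-group codes."""
    out = []
    for codes, scale, minimum in groups:
        out.extend(minimum + code * scale for code in codes)
    return out


def max_abs_error(row, bits, group_size):
    """Largest reconstruction error for a given group size."""
    restored = dequantize_row(quantize_row(row, bits, group_size))
    return max(abs(a - b) for a, b in zip(row, restored))


# Small weights next to large ones: a single group must cover both ranges.
row = [0.1, 0.2, 0.3, 0.4, 10.0, 20.0, 30.0, 40.0]
print(max_abs_error(row, bits=2, group_size=8))  # one coarse group
print(max_abs_error(row, bits=2, group_size=4))  # two finer groups
```

With the smaller group size, each group gets its own scale and offset, so the small weights are no longer forced onto the coarse grid needed to cover the large ones. The cost is that more scales and offsets have to be stored.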
