uintx_weight_only

torchao.quantization.uintx_weight_only(dtype, group_size=64, pack_dim=-1, use_hqq=False)

Applies uintx weight-only asymmetric per-group quantization to linear layers, where x is the number of bits specified by dtype.

Parameters:
  • dtype – the sub-byte unsigned integer dtype to quantize to, from torch.uint1 to torch.uint7

  • group_size – the quantization group size, which controls the granularity of quantization; a smaller size is more fine-grained. Defaults to 64

  • pack_dim – the dimension along which the quantized values are packed. Defaults to -1

  • use_hqq – whether to quantize the weight with the HQQ algorithm instead of the default algorithm. Defaults to False
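
To make the effect of the bit width and group_size more concrete, the following is a minimal pure-Python sketch of asymmetric per-group quantization in general. It is not torchao's implementation: the helper names (quantize_group, quantize_per_group, reconstruct, max_abs_error) are hypothetical, the min/max scale choice is an assumption, and bit packing (pack_dim) and HQQ are not modeled.

```python
def quantize_group(values, bits):
    """Asymmetrically quantize one group of floats to unsigned `bits`-bit codes.

    Returns (codes, scale, low), where low is the group minimum used as the offset.
    """
    qmax = (1 << bits) - 1                  # largest code, e.g. 3 for 2 bits
    low, high = min(values), max(values)
    scale = (high - low) / qmax if high > low else 1.0
    codes = [min(qmax, max(0, round((v - low) / scale))) for v in values]
    return codes, scale, low


def quantize_per_group(row, bits, group_size):
    """Split a row into consecutive groups and quantize each one independently."""
    if len(row) % group_size != 0:
        raise ValueError("row length must be a multiple of group_size")
    return [
        quantize_group(row[i:i + group_size], bits)
        for i in range(0, len(row), group_size)
    ]


def reconstruct(groups):
    """Dequantize every group and concatenate the results back into one row."""
    return [c * scale + low for codes, scale, low in groups for c in codes]


def max_abs_error(row, bits, group_size):
    """Largest absolute difference between the row and its dequantized version."""
    approx = reconstruct(quantize_per_group(row, bits, group_size))
    return max(abs(a - b) for a, b in zip(row, approx))


# A row whose two halves have very different ranges.
row = [0.0, 1.0, 2.0, 3.0, 0.0, 100.0, 200.0, 300.0]

fine = max_abs_error(row, bits=2, group_size=4)    # each half gets its own scale
coarse = max_abs_error(row, bits=2, group_size=8)  # one scale for the whole row
print(fine, coarse)
```

In this toy row, a group size of 4 lets each half use its own scale and offset, so the 2-bit codes reconstruct the values exactly, while a single group of 8 must cover the full 0 to 300 range and collapses the small values to 0. That is the sense in which a smaller group_size is more fine-grained, at the cost of storing more scale and offset values.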
