Int8DynActInt4WeightQATQuantizer

class torchao.quantization.qat.Int8DynActInt4WeightQATQuantizer(groupsize: int = 256, padding_allowed: bool = False, precision: dtype = torch.float32, scales_precision: dtype = torch.float32)

Quantizer for performing quantization-aware training (QAT) on a model, where linear layers have int8 dynamic per-token fake-quantized activations and int4 fake-quantized grouped per-channel weights.
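To make the terms in the description concrete, here is a minimal plain-Python sketch of the two fake-quantization steps. It is not torchao's implementation. The function names are hypothetical, and the choices of symmetric scaling for weights, asymmetric scaling for activations, and the exact scale formulas are assumptions made for illustration. "Fake quantized" here means each value is rounded onto the integer grid and immediately mapped back to floating point, so the rounding error is visible while the values stay in floating point.

```python
def _clamp(q, qmin, qmax):
    """Restrict an integer code to the representable range."""
    return max(qmin, min(qmax, q))


def fake_quantize_int4_grouped(row, groupsize):
    """Fake-quantize one weight row (one output channel) to int4.

    Assumption for illustration: symmetric quantization with one scale
    per group of `groupsize` consecutive values.
    """
    qmin, qmax = -8, 7
    out = []
    for start in range(0, len(row), groupsize):
        group = row[start:start + groupsize]
        max_abs = max(abs(v) for v in group)
        scale = max_abs / qmax if max_abs > 0 else 1.0
        for v in group:
            q = _clamp(round(v / scale), qmin, qmax)  # integer code
            out.append(q * scale)                     # back to float
    return out


def fake_quantize_int8_per_token(token):
    """Fake-quantize one activation vector (one token) to int8.

    Assumption for illustration: asymmetric quantization whose range is
    computed from the token itself at call time ("dynamic").
    """
    qmin, qmax = -128, 127
    lo = min(min(token), 0.0)  # keep 0.0 inside the range
    hi = max(max(token), 0.0)
    scale = (hi - lo) / (qmax - qmin) if hi > lo else 1.0
    zero_point = _clamp(round(qmin - lo / scale), qmin, qmax)
    out = []
    for v in token:
        q = _clamp(round(v / scale) + zero_point, qmin, qmax)
        out.append((q - zero_point) * scale)
    return out


weights = fake_quantize_int4_grouped([7.0, -3.0, 1.2, 0.0], groupsize=4)
activations = fake_quantize_int8_per_token([-1.0, 0.0, 0.5, 3.0])
```

In this sketch `groupsize` plays the same role as the constructor argument of the same name: a smaller group gives each scale fewer values to cover, which reduces rounding error at the cost of storing more scales.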
