
Int8DynamicActivationInt4WeightConfig

class torchao.quantization.Int8DynamicActivationInt4WeightConfig(group_size: int = 32, layout: Layout = PlainLayout(), mapping_type: MappingType = MappingType.SYMMETRIC, act_mapping_type: MappingType = MappingType.ASYMMETRIC, set_inductor_config: bool = True)[source]

Configuration for applying int8 dynamic per-token asymmetric activation quantization and int4 per-group symmetric weight quantization to linear layers. This flow is intended to produce models for the ExecuTorch backend, but ExecuTorch does not yet support lowering the quantized models it produces.

Parameters:
  • group_size – controls the granularity of weight quantization; a smaller group size gives more fine-grained quantization

  • layout – layout type for the quantized weight tensor; only MarlinQQQLayout() and CutlassInt4PackedLayout() are supported for now

  • mapping_type – quantization type for the weight, controls whether weight quantization is symmetric or asymmetric

  • act_mapping_type – quantization type for the activation, controls whether activation quantization is symmetric or asymmetric

  • set_inductor_config – if True, adjusts torchinductor settings to recommended values.
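The weight side of the scheme can likewise be sketched in a few lines. The snippet below is an illustrative stand-in, not torchao's packed-tensor implementation: it shows what "per-group symmetric" means for the default group_size=32 (here shrunk to 4 for readability). Each group of group_size consecutive weights shares one scale, and the symmetric mapping uses no zero point, so a real value of 0.0 always maps to integer 0.

```python
# Illustrative sketch (NOT the torchao implementation) of int4 per-group
# symmetric weight quantization: each group of `group_size` consecutive
# weights shares one scale; symmetric mapping means no zero point.
def quantize_int4_symmetric(weights, group_size=32):
    qmin, qmax = -8, 7  # signed int4 range
    quantized, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        # choose the scale so the largest magnitude in the group maps to qmax
        scale = max(abs(w) for w in group) / qmax or 1.0
        scales.append(scale)
        quantized.extend(
            max(qmin, min(qmax, round(w / scale))) for w in group
        )
    return quantized, scales
```

A smaller group_size means more scales are stored, so each scale has to cover fewer weights and quantization error drops, at the cost of extra metadata; this is the granularity trade-off the group_size parameter controls.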
