Int8WeightOnlyConfig

class torchao.quantization.Int8WeightOnlyConfig(group_size: Optional[int] = None, granularity: Optional[Granularity] = PerRow(dim=-1), set_inductor_config: bool = True, version: int = 1)

Configuration for applying int8 weight-only symmetric per-channel quantization to linear layers.

Parameters
  • group_size (version 1) – If None, applies per-channel quantization. Otherwise, applies per-group quantization with the specified group size.

  • granularity (version 2) – PerRow() for per-channel quantization, PerTensor() for per-tensor quantization.

  • set_inductor_config (bool, default True) – If True, adjusts torchinductor settings to recommended values for better performance with this quantization scheme.
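
The difference between per-channel quantization (group_size=None) and per-group quantization can be illustrated with a small pure-Python sketch. This is not torchao's implementation: the helper names (quantize_symmetric_int8, quantize_weight, dequantize_weight) are hypothetical, and the scale convention (largest magnitude maps to 127) is an assumption chosen for illustration.

```python
# Illustrative sketch only; not torchao's implementation.
# All function names here are hypothetical.

def quantize_symmetric_int8(values):
    """Quantize a list of floats to int8 using one shared symmetric scale."""
    max_abs = max(abs(v) for v in values)
    # Assumed convention: the largest magnitude maps to 127.
    # An all-zero group falls back to scale 1.0 to avoid division by zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale


def quantize_weight(weight, group_size=None):
    """Quantize a 2-D weight given as a list of rows (output channels).

    group_size=None -> one scale per row (per-channel).
    group_size=k    -> one scale per k consecutive elements of each row.
    """
    q_rows, scale_rows = [], []
    for row in weight:
        size = len(row) if group_size is None else group_size
        if len(row) % size != 0:
            raise ValueError("row length must be divisible by group_size")
        q_row, scales = [], []
        for start in range(0, len(row), size):
            q, s = quantize_symmetric_int8(row[start:start + size])
            q_row.extend(q)
            scales.append(s)
        q_rows.append(q_row)
        scale_rows.append(scales)
    return q_rows, scale_rows


def dequantize_weight(q_rows, scale_rows, group_size=None):
    """Reconstruct approximate float weights from int8 values and scales."""
    out = []
    for q_row, scales in zip(q_rows, scale_rows):
        size = len(q_row) if group_size is None else group_size
        out.append([q * scales[i // size] for i, q in enumerate(q_row)])
    return out


weight = [[0.5, -1.0, 0.25, 2.0], [0.01, 0.02, -0.03, 0.04]]
q_pc, s_pc = quantize_weight(weight)                # per-channel: 1 scale per row
q_pg, s_pg = quantize_weight(weight, group_size=2)  # per-group: 2 scales per row
```

In this sketch, a smaller group gives each scale fewer values to cover, which generally lowers rounding error at the cost of storing more scales. With torchao itself, a config object such as this one is normally passed to torchao.quantization.quantize_ together with the model rather than applied by hand.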