Int8WeightOnlyConfig
- class torchao.quantization.Int8WeightOnlyConfig(group_size: Optional[int] = None, granularity: Optional[Granularity] = PerRow(dim=-1), set_inductor_config: bool = True, version: int = 1)
Configuration for applying int8 weight-only symmetric per-channel quantization to linear layers.
- Parameters
group_size (version 1) – If None, applies per-channel quantization. Otherwise, applies per-group quantization with the specified group size.
granularity (version 2) – PerRow() for per-channel quantization, PerTensor() for per-tensor quantization.
set_inductor_config (bool, default True) – If True, adjusts torchinductor settings to recommended values for better performance with this quantization scheme.
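The difference between per-channel and per-group quantization described under group_size can be illustrated with a minimal sketch in plain Python. This is not torchao's implementation: the helper names quantize_symmetric_int8 and dequantize are hypothetical, and the choice of scale = max(|w|) / 127 is an assumption about one common symmetric scheme. It only shows how group_size changes the number of scales computed for one weight row (one output channel).

```python
def quantize_symmetric_int8(row, group_size=None):
    """Quantize one weight row to int8 with symmetric scales (illustrative only).

    group_size=None: one scale for the whole row (per-channel).
    group_size=k:    one scale per k consecutive elements (per-group).
    """
    n = len(row)
    size = n if group_size is None else group_size
    if n % size != 0:
        raise ValueError("row length must be divisible by group_size")

    q, scales = [], []
    for start in range(0, n, size):
        group = row[start:start + size]
        max_abs = max(abs(x) for x in group)
        # Assumed scheme: map the largest magnitude in the group to 127.
        scale = max_abs / 127 if max_abs > 0 else 1.0
        scales.append(scale)
        # Round to the nearest integer and clamp to the int8 range.
        q.extend(max(-128, min(127, round(x / scale))) for x in group)
    return q, scales


def dequantize(q, scales, group_size=None):
    """Recover approximate float values from int8 codes and their scales."""
    size = len(q) if group_size is None else group_size
    return [v * scales[i // size] for i, v in enumerate(q)]


row = [0.3, -1.0, 0.25, 8.0]

# Per-channel: a single scale, dominated by the outlier 8.0.
q_channel, s_channel = quantize_symmetric_int8(row)

# Per-group with group_size=2: the first two values get their own scale.
q_group, s_group = quantize_symmetric_int8(row, group_size=2)

print(len(s_channel), len(s_group))
print(dequantize(q_channel, s_channel)[0], dequantize(q_group, s_group, 2)[0])
```

In this sketch, smaller groups isolate the outlier, so the small values in the other group are reconstructed more accurately, at the cost of storing more scales.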