Int8WeightOnlyConfig
- class torchao.quantization.Int8WeightOnlyConfig(group_size: Optional[int] = None, set_inductor_config: bool = True)
Configuration for applying int8 weight-only symmetric quantization to linear layers. Quantization is per-channel by default, or per-group when group_size is set.
- Parameters:
group_size (Optional[int], default None) – Controls the granularity of quantization. If None, quantization is per-channel; otherwise it is per-group with the specified group size.
set_inductor_config (bool, default True) – If True, adjusts TorchInductor settings to the values recommended for this quantization scheme, for better performance.
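To make the effect of group_size concrete, the following is a minimal pure-Python sketch of symmetric int8 quantization applied to a single weight row (one output channel). It is an illustration only, not torchao's implementation: the helper names quantize_row_symmetric and dequantize_row are hypothetical, and the choices of mapping the largest magnitude to 127 and using a scale of 1.0 for all-zero groups are assumptions made for this example.

```python
from typing import List, Optional, Tuple


def quantize_row_symmetric(
    row: List[float], group_size: Optional[int] = None
) -> Tuple[List[int], List[float]]:
    """Quantize one weight row to int8 values plus floating-point scales.

    group_size=None -> a single scale for the whole row (per-channel).
    group_size=g    -> one scale per block of g consecutive values (per-group).
    """
    size = len(row) if group_size is None else group_size
    if size <= 0 or len(row) % size != 0:
        raise ValueError("group_size must be positive and divide the row length")

    q_values: List[int] = []
    scales: List[float] = []
    for start in range(0, len(row), size):
        group = row[start:start + size]
        max_abs = max(abs(v) for v in group)
        # Symmetric scheme: the zero-point is 0, so only a scale is stored.
        # Assumption: the largest magnitude maps to 127; all-zero groups get scale 1.0.
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        scales.append(scale)
        # Round to the nearest integer and clamp to the symmetric int8 range.
        q_values.extend(max(-127, min(127, round(v / scale))) for v in group)
    return q_values, scales


def dequantize_row(
    q_values: List[int], scales: List[float], group_size: Optional[int] = None
) -> List[float]:
    """Recover approximate weights by multiplying each value by its group's scale."""
    size = len(q_values) if group_size is None else group_size
    return [q * scales[i // size] for i, q in enumerate(q_values)]


row = [0.5, -1.0, 0.02, 0.005]
q_pc, s_pc = quantize_row_symmetric(row)                # per-channel: one scale
q_pg, s_pg = quantize_row_symmetric(row, group_size=2)  # per-group: two scales
print(len(s_pc), len(s_pg))
print(dequantize_row(q_pc, s_pc))
print(dequantize_row(q_pg, s_pg, group_size=2))
```

In this sketch, the small values in the second half of the row are reconstructed more accurately with group_size=2, because they no longer share a scale with the much larger values in the first half. The cost is one extra scale per group. In torchao itself, a config such as Int8WeightOnlyConfig() is typically passed together with a model to torchao.quantization.quantize_, which applies the quantization to the model's linear layers in place.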