IntXQuantizationAwareTrainingConfig
- class torchao.quantization.qat.IntXQuantizationAwareTrainingConfig(activation_config: Optional[FakeQuantizeConfig] = None, weight_config: Optional[FakeQuantizeConfig] = None)
Config for applying fake quantization to a torch.nn.Module, to be used with quantize_().

Example usage:
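    import torch
    from torchao.quantization import quantize_
    from torchao.quantization.qat import (
        FakeQuantizeConfig,
        IntXQuantizationAwareTrainingConfig,
    )

    # Asymmetric int8 fake quantization of activations, per token
    activation_config = FakeQuantizeConfig(
        torch.int8, "per_token", is_symmetric=False,
    )
    # Symmetric int4 fake quantization of weights, in groups of 32
    weight_config = FakeQuantizeConfig(
        torch.int4, group_size=32, is_symmetric=True,
    )
    # `model` is an existing torch.nn.Module; quantize_ modifies it in place
    quantize_(
        model,
        IntXQuantizationAwareTrainingConfig(activation_config, weight_config),
    )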
Note: Only torch.nn.Linear and torch.nn.Embedding modules are supported. Applying the config to any other module type, or to a torch.nn.Embedding together with an activation config, raises a ValueError.
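
For intuition about what "fake quantization" means here, the following is a minimal pure-Python sketch of a symmetric, per-group quantize-then-dequantize step. It is not torchao's implementation: the function name `fake_quantize_symmetric`, its parameters, and the choice of scale (largest absolute value in the group divided by the largest representable integer) are assumptions made for illustration only. The point is that values are snapped to an integer grid but returned as floats, so training continues in floating point while seeing the rounding error.

```python
def fake_quantize_symmetric(values, num_bits=4, group_size=32):
    """Quantize then dequantize a flat list of floats, one group at a time.

    Hypothetical helper for illustration; not part of torchao.
    """
    qmin = -(2 ** (num_bits - 1))      # -8 for 4 bits
    qmax = 2 ** (num_bits - 1) - 1     # 7 for 4 bits
    out = []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        max_abs = max(abs(v) for v in group)
        # Assumed convention: map the largest magnitude in the group to qmax.
        scale = max_abs / qmax if max_abs > 0 else 1.0
        for v in group:
            # Round to the integer grid and clamp to the representable range.
            q = max(qmin, min(qmax, round(v / scale)))
            # Dequantize: the result is a float again.
            out.append(q * scale)
    return out


print(fake_quantize_symmetric([7.0, 3.4, -2.6, 0.0], group_size=4))
# [7.0, 3.0, -3.0, 0.0]
```

Each group gets its own scale, which is why a smaller group size generally tracks the original values more closely at the cost of storing more scales.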