IntXQuantizationAwareTrainingConfig

class torchao.quantization.qat.IntXQuantizationAwareTrainingConfig(activation_config: Optional[FakeQuantizeConfigBase] = None, weight_config: Optional[FakeQuantizeConfigBase] = None)

(Deprecated) Please use QATConfig instead.

Config for applying fake quantization to a torch.nn.Module, to be used with quantize_().

Example usage:

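import torch
from torchao.quantization import quantize_
from torchao.quantization.qat import (
    IntXQuantizationAwareTrainingConfig,
    IntxFakeQuantizeConfig,
)

# Fake-quantize activations to int8, per token, asymmetric
activation_config = IntxFakeQuantizeConfig(
    torch.int8, "per_token", is_symmetric=False,
)
# Fake-quantize weights to int4, in groups of 32, symmetric
weight_config = IntxFakeQuantizeConfig(
    torch.int4, group_size=32, is_symmetric=True,
)
# `model` is an existing torch.nn.Module; it is modified in place
quantize_(
    model,
    IntXQuantizationAwareTrainingConfig(activation_config, weight_config),
)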

Note: Applying this config to a module that is not torch.nn.Linear or torch.nn.Embedding, or to a torch.nn.Embedding together with an activation config, raises ValueError because these cases are not supported.
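
The support rule in the note can be sketched as a small check. This is only an illustration of the rule, not the library's code: `check_supported`, the stand-in classes `Linear`, `Embedding`, and `Conv2d`, and the error messages are all hypothetical, chosen so the sketch runs without torch installed.

```python
# Hypothetical stand-ins so the sketch runs without torch installed.
# In real code these would be torch.nn.Linear, torch.nn.Embedding, etc.
class Linear:
    pass


class Embedding:
    pass


class Conv2d:
    pass


def check_supported(module, activation_config=None):
    """Raise ValueError for the unsupported cases described in the note.

    Hypothetical helper; the message wording is an assumption.
    """
    if isinstance(module, Linear):
        # Linear accepts both activation and weight configs.
        return
    if isinstance(module, Embedding):
        # Embedding supports weight-only fake quantization.
        if activation_config is not None:
            raise ValueError(
                "activation config is not supported for Embedding"
            )
        return
    # Any other module type is rejected.
    raise ValueError(
        f"module type {type(module).__name__} is not supported"
    )


def is_supported(module, activation_config=None):
    """Boolean wrapper around check_supported, for convenience."""
    try:
        check_supported(module, activation_config)
    except ValueError:
        return False
    return True
```

Under these assumptions, `is_supported(Linear(), activation_config=object())` and `is_supported(Embedding())` both return `True`, while an `Embedding` with an activation config or a `Conv2d` returns `False`.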
