IntXQuantizationAwareTrainingConfig

class torchao.quantization.qat.IntXQuantizationAwareTrainingConfig(activation_config: Optional[FakeQuantizeConfig] = None, weight_config: Optional[FakeQuantizeConfig] = None)

Config for applying fake quantization to a torch.nn.Module, to be used with quantize_().
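
For intuition, fake quantization rounds a value onto an integer grid and immediately converts it back to floating point, so training sees the rounding error while tensors stay in float. The following is a minimal pure-Python sketch of symmetric per-group fake quantization, not torchao's implementation. The function name `fake_quantize_symmetric` and the scale choice (largest absolute value in the group divided by the largest representable integer) are assumptions made for illustration.

```python
def fake_quantize_symmetric(values, num_bits=4, group_size=32):
    """Quantize then dequantize a flat list of floats, one scale per group.

    Illustrative only: the name and the scale rule are assumptions,
    not torchao's actual implementation.
    """
    qmin = -(2 ** (num_bits - 1))   # -8 for 4 bits
    qmax = 2 ** (num_bits - 1) - 1  # 7 for 4 bits
    out = []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        max_abs = max(abs(v) for v in group)
        # Avoid dividing by zero when the whole group is zero
        scale = max_abs / qmax if max_abs > 0 else 1.0
        for v in group:
            q = round(v / scale)          # snap to the integer grid
            q = max(qmin, min(qmax, q))   # clamp to the representable range
            out.append(q * scale)         # map back to floating point
    return out


weights = [0.7, -0.3, 0.12, 0.0]
fake_quantized = fake_quantize_symmetric(weights, num_bits=4, group_size=4)
print(fake_quantized)
```

In this sketch the third value, 0.12, moves to roughly 0.1 because the grid spacing for the group is about 0.1. That rounding error is what the model learns to tolerate during quantization-aware training.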

Example usage:

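import torch
from torchao.quantization import quantize_
from torchao.quantization.qat import (
    FakeQuantizeConfig,
    IntXQuantizationAwareTrainingConfig,
)

# Asymmetric int8 fake quantization of activations, per token
activation_config = FakeQuantizeConfig(
    torch.int8, "per_token", is_symmetric=False,
)
# Symmetric int4 fake quantization of weights, in groups of 32
weight_config = FakeQuantizeConfig(
    torch.int4, group_size=32, is_symmetric=True,
)
# `model` is an existing torch.nn.Module; quantize_ modifies it in place
quantize_(
    model,
    IntXQuantizationAwareTrainingConfig(activation_config, weight_config),
)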

Note: Only torch.nn.Linear and torch.nn.Embedding modules are supported. Applying this config to any other module type, or to a torch.nn.Embedding together with an activation config, raises ValueError.
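
The rule in the note can be restated as a small decision function. This is a hedged sketch in plain Python, not torchao code: `check_qat_support`, `SUPPORTED_KINDS`, and the string labels standing in for module types are all hypothetical names used only to make the two failure cases explicit.

```python
# Hypothetical labels standing in for torch.nn module types
SUPPORTED_KINDS = ("Linear", "Embedding")


def check_qat_support(module_kind, has_activation_config):
    """Raise ValueError for the unsupported combinations described in the note."""
    if module_kind not in SUPPORTED_KINDS:
        raise ValueError(f"{module_kind} is not supported")
    if module_kind == "Embedding" and has_activation_config:
        raise ValueError(
            "activation fake quantization is not supported for Embedding"
        )


# Allowed: Linear with both activation and weight configs
check_qat_support("Linear", has_activation_config=True)
# Allowed: Embedding with a weight config only
check_qat_support("Embedding", has_activation_config=False)
```

Read this way, a model that mixes linear and embedding layers needs a weight-only config for the embeddings, since the activation config is what triggers the error there.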
