FakeQuantizedLinear¶
- class torchao.quantization.qat.FakeQuantizedLinear(in_features: int, out_features: int, bias: bool = False, activation_config: Optional[FakeQuantizeConfigBase] = None, weight_config: Optional[FakeQuantizeConfigBase] = None, *args, **kwargs)[source]¶
General linear layer with fake quantized weights and activations.
The target dtypes, granularity, quantization schemes, etc. are specified through separate configs for the weights and activations.
Example usage:
    import torch
    from torchao.quantization.qat import FakeQuantizedLinear, IntxFakeQuantizeConfig

    # Fake-quantize activations to int8, asymmetric, per token
    activation_config = IntxFakeQuantizeConfig(
        dtype=torch.int8,
        granularity="per_token",
        is_symmetric=False,
    )
    # Fake-quantize weights to int4, symmetric, one scale per group of 8 values
    weight_config = IntxFakeQuantizeConfig(
        dtype=torch.int4,
        group_size=8,
        is_symmetric=True,
    )
    fq_linear = FakeQuantizedLinear(
        16, 32, False, activation_config, weight_config,
    )
    fq_linear(torch.randn(16))
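To see what fake quantization does at the tensor level, a minimal sketch (reusing the configs above and only standard torch.nn.Linear for comparison) is to run the same input through a plain float linear layer that shares the fake-quantized layer's weights: the fake-quantized output stays a regular float tensor but differs slightly, because the weights and activations are rounded to the configured integer grids during the forward pass. The weight-copying step below is illustrative and not part of the documented API.

    import torch
    from torchao.quantization.qat import FakeQuantizedLinear, IntxFakeQuantizeConfig

    activation_config = IntxFakeQuantizeConfig(
        dtype=torch.int8,
        granularity="per_token",
        is_symmetric=False,
    )
    weight_config = IntxFakeQuantizeConfig(
        dtype=torch.int4,
        group_size=8,
        is_symmetric=True,
    )
    fq_linear = FakeQuantizedLinear(16, 32, False, activation_config, weight_config)

    # Plain float linear sharing the same weights, used as a reference
    ref_linear = torch.nn.Linear(16, 32, bias=False)
    with torch.no_grad():
        ref_linear.weight.copy_(fq_linear.weight)

    x = torch.randn(16)
    fq_out = fq_linear(x)    # forward pass with fake-quantized weights/activations
    ref_out = ref_linear(x)  # ordinary float forward pass

    # Both outputs are float tensors; the fake-quantized one differs slightly
    # because values were rounded to the configured integer grids
    print(fq_out.dtype, ref_out.dtype)
    print((fq_out - ref_out).abs().max())

Because the forward pass stays in floating point, the module can be trained with ordinary optimizers during quantization-aware training while simulating the numerics of the eventual quantized model.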