Float8ActInt4WeightQATQuantizer

class torchao.quantization.qat.Float8ActInt4WeightQATQuantizer(group_size: Optional[int] = 64, scale_precision: dtype = torch.bfloat16)[source]

QAT quantizer that applies dynamic rowwise float8 activation fake quantization and int4 per-group/per-channel symmetric weight fake quantization to the linear layers in the model. Rowwise is currently the only supported granularity for float8 activations.

Parameters
  • group_size (Optional[int]) – the number of elements in each quantized weight group, defaults to 64. Pass None for per-channel quantization.

  • scale_precision (torch.dtype) – precision of the weight scales, defaults to torch.bfloat16.
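As a minimal sketch of the two weight-granularity options (the argument values here are only examples):

import torch
from torchao.quantization.qat import Float8ActInt4WeightQATQuantizer

# Per-group symmetric int4 weight fake quantization, 64 elements per group.
quantizer = Float8ActInt4WeightQATQuantizer(
    group_size=64,
    scale_precision=torch.bfloat16,
)

# Per-channel weight fake quantization: one scale per output channel.
per_channel_quantizer = Float8ActInt4WeightQATQuantizer(group_size=None)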

prepare(model: Module, *args: Any, **kwargs: Any) → Module[source]

Swap all nn.Linear layers in the model with FakeQuantizedLinear, using a float8 fake quantizer for activations and an int4 fake quantizer for weights, and return the resulting model.
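A minimal end-to-end sketch of the prepare flow; the toy model and training loop below are illustrative assumptions, not part of the API:

import torch
import torch.nn as nn
from torchao.quantization.qat import Float8ActInt4WeightQATQuantizer

# Toy model; any model containing nn.Linear layers works.
# in_features of each linear is divisible by group_size.
model = nn.Sequential(nn.Linear(256, 512), nn.ReLU(), nn.Linear(512, 256))

quantizer = Float8ActInt4WeightQATQuantizer(group_size=64)

# Swap nn.Linear modules for FakeQuantizedLinear, then fine-tune as usual:
# gradients flow through the fake-quantized forward pass.
model = quantizer.prepare(model)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
for _ in range(10):
    x = torch.randn(8, 256)
    loss = model(x).sum()  # placeholder loss for illustration
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()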