Float8ActInt4WeightQATQuantizer
- class torchao.quantization.qat.Float8ActInt4WeightQATQuantizer(group_size: Optional[int] = 64, scale_precision: dtype = torch.bfloat16)
QAT quantizer that applies fake quantization to the linear layers in a model: dynamic rowwise float8 quantization for activations and symmetric int4 per-group or per-channel quantization for weights. Currently only rowwise granularity is supported for float8 activations.
- Parameters:
group_size (Optional[int]) – the number of elements in each quantized group for weights, defaults to 64. Use None for per-channel quantization.
scale_precision (torch.dtype) – precision of the weight scales, defaults to torch.bfloat16.
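
A minimal usage sketch (an assumption based on the common torchao QAT two-step flow, not taken from this docstring): prepare() swaps the model's linear layers for fake-quantized equivalents before fine-tuning.

```python
import torch
from torchao.quantization.qat import Float8ActInt4WeightQATQuantizer

# Toy model; any model containing nn.Linear layers is handled the same way.
model = torch.nn.Sequential(
    torch.nn.Linear(256, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 64),
).to(torch.bfloat16)

# group_size=64 is the default; passing None switches weights to per-channel.
quantizer = Float8ActInt4WeightQATQuantizer(group_size=64)

# Swap linear layers for fake-quantized versions: dynamic rowwise float8
# fake quantization on activations, symmetric int4 per-group on weights.
model = quantizer.prepare(model)

# ... fine-tune `model` as usual; fake quantization runs in the forward pass ...

# Lowering to real quantized kernels afterwards typically goes through the
# quantizer's convert() step (assumed follow-up, not documented here).
# model = quantizer.convert(model)
```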