Float8DynamicActivationInt4WeightConfig¶
- class torchao.quantization.Float8DynamicActivationInt4WeightConfig(packing_format: PackingFormat = 'preshuffled')[source]¶
Configuration for applying float8 dynamic per-row activation quantization and int4 per-group weight quantization to linear layers. Only group_size 128 is supported right now, since the underlying kernel only supports group sizes of 128 and above and there is no benefit to making the group size larger.
- Parameters:
packing_format – how the weight is packed; only preshuffled is supported
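A minimal usage sketch, assuming the usual torchao flow of passing a config to quantize_; the model, tensor shapes, and device here are illustrative assumptions, not part of this class's documentation.

```python
import torch
from torchao.quantization import quantize_, Float8DynamicActivationInt4WeightConfig

# Hypothetical bfloat16 model with a linear layer whose in_features is a
# multiple of the supported group_size (128).
model = torch.nn.Sequential(torch.nn.Linear(256, 512)).to(torch.bfloat16).to("cuda")

# Apply float8 dynamic per-row activation quantization and int4 per-group
# (group_size 128) weight quantization to the linear layers in the model.
quantize_(model, Float8DynamicActivationInt4WeightConfig(packing_format="preshuffled"))

# Run inference as usual on the quantized model.
x = torch.randn(4, 256, dtype=torch.bfloat16, device="cuda")
out = model(x)
```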