Float8DynamicActivationInt4WeightConfig

class torchao.quantization.Float8DynamicActivationInt4WeightConfig(int4_packing_format: Int4PackingFormat = 'preshuffled')

Configuration for applying float8 dynamic per-row activation quantization and int4 per-group weight quantization to linear layers. Only group_size 128 is currently supported: the underlying kernel supports only group sizes of 128 and above, and there is no benefit to making the group larger.

Parameters

int4_packing_format – how the weight is packed; only preshuffled is currently supported
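
To make the two granularities in the description concrete, here is a minimal pure-Python sketch of per-group int4 weight quantization (one scale per 128 values) and a dynamic per-row activation scale. It is illustrative only and is not torchao's kernel: the symmetric int4 range of -8 to 7, the float8 e4m3 maximum of 448, and all function names are assumptions made for this example, and the preshuffled packing layout is not modeled.

```python
GROUP_SIZE = 128           # the only group size the config supports, per the description
INT4_MIN, INT4_MAX = -8, 7  # assumption: signed, symmetric int4 codes
FP8_E4M3_MAX = 448.0        # assumption: activations use the float8 e4m3 format


def quantize_weight_row(row):
    """Quantize one weight row to int4 codes with one scale per group of GROUP_SIZE."""
    if len(row) % GROUP_SIZE != 0:
        raise ValueError("row length must be a multiple of GROUP_SIZE")
    codes, scales = [], []
    for start in range(0, len(row), GROUP_SIZE):
        group = row[start:start + GROUP_SIZE]
        max_abs = max(abs(v) for v in group)
        # Map the largest magnitude in the group onto the int4 maximum.
        scale = max_abs / INT4_MAX if max_abs > 0 else 1.0
        scales.append(scale)
        codes.extend(max(INT4_MIN, min(INT4_MAX, round(v / scale))) for v in group)
    return codes, scales


def dequantize_weight_row(codes, scales):
    """Recover approximate weights by multiplying each code by its group's scale."""
    return [c * scales[i // GROUP_SIZE] for i, c in enumerate(codes)]


def activation_row_scale(row):
    """Dynamic per-row scale: computed from the row's own values at call time."""
    max_abs = max(abs(v) for v in row)
    return max_abs / FP8_E4M3_MAX if max_abs > 0 else 1.0
```

"Dynamic" here means the activation scale is recomputed for every input row at inference time, whereas the weight scales are computed once when the model is quantized.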