Float8DynamicActivationFloat8SemiSparseWeightConfig
- class torchao.quantization.Float8DynamicActivationFloat8SemiSparseWeightConfig(layout: Layout = CutlassSemiSparseLayout(), activation_dtype: dtype = torch.float8_e5m2, weight_dtype: dtype = torch.float8_e4m3fn)
Applies float8 dynamic quantization to the activations of linear layers, and float8 quantization followed by compression into a sparse semi-structured tensor to their weights.
- Parameters
layout – layout type for the quantized weight tensor; only CutlassSemiSparseLayout is supported at the moment.
activation_dtype – data type for the quantized activation tensor (default: torch.float8_e5m2).
weight_dtype – data type for the quantized weight tensor (default: torch.float8_e4m3fn).
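
To make "float8 quantization followed by compression to sparse semi-structured tensor" more concrete, here is a conceptual sketch in plain Python. It does not call torchao. It assumes the common 2:4 pattern (two values kept in every group of four) and a single per-row scale derived from the float8_e4m3fn maximum of 448. Both choices are assumptions made for illustration, and the helper names `rowwise_scale`, `compress_2_of_4`, and `decompress_2_of_4` are hypothetical. The scaling granularity and compressed format actually used by CutlassSemiSparseLayout may differ.

```python
E4M3_MAX = 448.0  # largest finite value representable in torch.float8_e4m3fn


def rowwise_scale(row, fmax=E4M3_MAX):
    """Scale that maps the row's largest magnitude onto the float8 range."""
    amax = max(abs(x) for x in row)
    return amax / fmax if amax > 0 else 1.0


def compress_2_of_4(row):
    """Keep the two largest-magnitude entries of each group of four.

    Returns the kept values and their positions (0..3) inside each group.
    """
    assert len(row) % 4 == 0, "row length must be a multiple of 4"
    values, indices = [], []
    for start in range(0, len(row), 4):
        group = row[start:start + 4]
        # Rank positions by magnitude, keep the top two, restore position order.
        top2 = sorted(range(4), key=lambda j: abs(group[j]), reverse=True)[:2]
        for j in sorted(top2):
            values.append(group[j])
            indices.append(j)
    return values, indices


def decompress_2_of_4(values, indices):
    """Rebuild the dense row, with zeros at the dropped positions."""
    dense = []
    for k in range(0, len(values), 2):
        group = [0.0] * 4
        group[indices[k]] = values[k]
        group[indices[k + 1]] = values[k + 1]
        dense.extend(group)
    return dense


row = [0.5, -2.0, 0.1, 1.5, 3.0, 0.0, -0.2, 0.4]
scale = rowwise_scale(row)
values, indices = compress_2_of_4(row)
# A real implementation would store v / scale cast to float8; here the
# scaled values stay as Python floats to keep the sketch dependency-free.
scaled = [v / scale for v in values]
dense = decompress_2_of_4(values, indices)
```

The compressed form holds half as many weight values plus a small index per kept value, which is where the storage saving of semi-structured sparsity comes from. Dividing by the scale puts every kept value inside the representable float8 range before the cast.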