Float8DynamicActivationFloat8SemiSparseWeightConfig

class torchao.quantization.Float8DynamicActivationFloat8SemiSparseWeightConfig(layout: Layout = CutlassSemiSparseLayout(), activation_dtype: dtype = torch.float8_e5m2, weight_dtype: dtype = torch.float8_e4m3fn)

Applies float8 dynamic quantization to the activations of linear layers, and float8 quantization followed by compression into a semi-structured sparse tensor to their weights.
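
Semi-structured sparsity in PyTorch refers to the 2:4 pattern: each contiguous group of four weights holds at most two non-zero values, so a compressed form only needs the kept values plus their positions inside the group. The plain-Python sketch below illustrates that idea on a single row. The helpers `prune_2_4` and `compress_2_4` are hypothetical names for illustration, not torchao functions, and the output does not reproduce the storage format used by `CutlassSemiSparseLayout`. Magnitude pruning appears here only as one common way to obtain the pattern; this page does not say whether the config prunes weights itself.

```python
def prune_2_4(row):
    """Zero the two smallest-magnitude entries in every group of four."""
    if len(row) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    pruned = []
    for start in range(0, len(row), 4):
        group = row[start:start + 4]
        # Indices of the two largest magnitudes; ties go to the earlier index.
        keep = sorted(range(4), key=lambda i: -abs(group[i]))[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


def compress_2_4(row):
    """Split a 2:4-sparse row into kept values and their in-group positions."""
    values, positions = [], []
    for start in range(0, len(row), 4):
        group = row[start:start + 4]
        nonzero = [i for i, v in enumerate(group) if v != 0.0]
        if len(nonzero) > 2:
            raise ValueError("group has more than two non-zero entries")
        # Pad with zero-valued slots so every group stores exactly two entries.
        kept = sorted((nonzero + [i for i in range(4) if i not in nonzero])[:2])
        values.extend(group[i] for i in kept)
        positions.extend(kept)
    return values, positions


weights = [0.9, -0.1, 0.05, -1.2, 0.3, 0.0, -0.7, 0.2]
sparse = prune_2_4(weights)
values, positions = compress_2_4(sparse)
print(sparse)     # dense row that now follows the 2:4 pattern
print(values)     # half as many stored values as the dense row
print(positions)  # in-group index (0..3) of each stored value
```

The stored values take half the space of the dense row, which is where the memory and bandwidth savings of the sparse format come from; in the real config those values would additionally be held in a float8 dtype.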

Parameters
  • layout – layout type for the quantized weight tensor; only CutlassSemiSparseLayout is supported at the moment.

  • activation_dtype – data type for quantized activation tensor.

  • weight_dtype – data type for quantized weight tensor.
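
A minimal usage sketch, assuming torchao is installed and that the config is passed to torchao's `quantize_` function like other quantization configs. The wrapper name `apply_float8_semi_sparse` is hypothetical. The two dtypes are spelled out only to mirror the documented defaults, and `layout` is left at its default. Hardware requirements are not stated on this page; the sketch assumes a CUDA GPU with float8 support and a torchao build that includes the CUTLASS kernels, and it assumes the weights already follow a 2:4 sparsity pattern.

```python
def apply_float8_semi_sparse(model):
    """Quantize the linear layers of `model` in place and return it.

    Imports are kept inside the function so that merely defining it
    does not require torch or torchao to be installed.
    """
    import torch
    from torchao.quantization import (
        Float8DynamicActivationFloat8SemiSparseWeightConfig,
        quantize_,
    )

    # These values repeat the defaults from the class signature;
    # `layout` keeps its default, CutlassSemiSparseLayout().
    config = Float8DynamicActivationFloat8SemiSparseWeightConfig(
        activation_dtype=torch.float8_e5m2,
        weight_dtype=torch.float8_e4m3fn,
    )
    # quantize_ modifies the model in place.
    quantize_(model, config)
    return model
```

A call would then look like `apply_float8_semi_sparse(model)`, where `model` is an `nn.Module` containing `nn.Linear` layers that has already been moved to the GPU. Activations are quantized dynamically at inference time, so no calibration data is passed in.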