float8_static_activation_float8_weight

torchao.quantization.float8_static_activation_float8_weight(scale: Tensor, activation_dtype: dtype = torch.float8_e4m3fn, weight_dtype: dtype = torch.float8_e4m3fn, granularity: Optional[Union[PerTensor, PerRow, Tuple[Union[PerTensor, PerRow], Union[PerTensor, PerRow]]]] = None, mm_config: Optional[Float8MMConfig] = None)

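Applies float8 symmetric quantization to both activations and weights, with static quantization for the activations: the activation scale is supplied up front through scale rather than computed from each input at runtime.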

Parameters:

scale (torch.Tensor) – The scale tensor for activation quantization.
activation_dtype (torch.dtype) – The target data type for activation quantization. Default is torch.float8_e4m3fn.
weight_dtype (torch.dtype) – The target data type for weight quantization. Default is torch.float8_e4m3fn.
granularity (optional) – The quantization granularity: PerTensor, PerRow, or a tuple of two such values. Default is None.
mm_config (Float8MMConfig) – Configuration for the matrix multiplication. Default uses fast accumulation.
