Float8WeightOnlyConfig

class torchao.quantization.Float8WeightOnlyConfig(weight_dtype: dtype = torch.float8_e4m3fn, set_inductor_config: bool = True)

Configuration for applying float8 weight-only symmetric per-channel quantization to linear layers.

Parameters:
  • weight_dtype (torch.dtype) – The target data type for weight quantization. Default is torch.float8_e4m3fn.

  • set_inductor_config (bool) – If True, adjusts torchinductor settings to recommended values. Default is True.

Note

The matmul itself is computed in the original precision of the weight tensor; only the stored weights are quantized to float8.
