float8_weight_only

torchao.quantization.float8_weight_only(weight_dtype: dtype = torch.float8_e4m3fn)[source]

Applies float8 weight-only symmetric per-channel quantization to linear layers.

Parameters:

weight_dtype (torch.dtype) – The target data type for weight quantization. Default is torch.float8_e4m3fn.
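To illustrate what "symmetric per-channel quantization" means here, the following is a minimal pure-Python sketch of the scheme, not torchao's implementation: each output channel (row) of the weight gets one scale, chosen so the row's largest magnitude maps to the largest finite ``float8_e4m3fn`` value (448). Actual float8 rounding is approximated by clamping; all names below are illustrative.

```python
# Conceptual sketch of symmetric per-channel quantization (illustrative,
# not torchao's implementation). 448.0 is the largest finite value
# representable by torch.float8_e4m3fn.
FP8_E4M3_MAX = 448.0

def quantize_per_channel(weight):
    """One scale per output row; symmetric (no zero point).

    Returns (quantized_rows, scales). Dequantization is q * scale.
    Real float8 rounding is approximated here by clamping to the range.
    """
    q_rows, scales = [], []
    for row in weight:
        amax = max(abs(v) for v in row) or 1.0  # guard all-zero rows
        scale = amax / FP8_E4M3_MAX
        q_rows.append(
            [max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, v / scale)) for v in row]
        )
        scales.append(scale)
    return q_rows, scales

def dequantize(q_rows, scales):
    return [[q * s for q in row] for row, s in zip(q_rows, scales)]

w = [[0.5, -1.0, 2.0], [3.0, 0.25, -0.125]]
q, s = quantize_per_channel(w)
w_hat = dequantize(q, s)
```

Because the scale is per-row, a channel with small weights is not crushed by a large outlier in a different channel, which is the main accuracy benefit over per-tensor scaling.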

Note

The actual matmul is computed in the original precision of the weight tensor; only the stored weights are quantized.
