MarlinSparseLayout

class torchao.dtypes.MarlinSparseLayout

MarlinSparseLayout is a layout class for the sparse tensor format used by the Marlin sparse kernel. It optimizes the storage and computation of affine quantized tensors that follow a 2:4 sparsity pattern.

The layout ensures that the tensor data is pre-processed and stored in a format that is compatible with the Marlin sparse kernel operations. It provides methods for preprocessing input tensors and managing the layout of quantized tensors.

pre_process(input: Tensor) → Tensor
Preprocess the input tensor into the format expected by the Marlin sparse kernel. The method applies three steps in order:
  • First, the input tensor is transposed, since the linear layer keeps its weights in a transposed format.

  • Second, 2:4 sparsity is injected into the transposed tensor.

  • Third, the tensor is transposed back, because the quantization process computes the scales along dim=-1.

Parameters:

input (torch.Tensor) – the input tensor to preprocess

Returns:

the preprocessed tensor

Return type:

torch.Tensor
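
The three steps of pre_process can be illustrated with a minimal sketch in plain PyTorch. This is not the torchao implementation: the names prune_2_of_4 and pre_process_sketch are hypothetical, and the sketch assumes magnitude-based pruning (the two largest-magnitude values in each group of four are kept) with the groups running along the input-feature dimension. The page itself only states that 2:4 sparsity is injected, so both of those choices are assumptions.

```python
import torch


def prune_2_of_4(w_kn: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: in every group of four consecutive entries along
    dim 0 (K), keep the two largest-magnitude values and zero the other two."""
    k, n = w_kn.shape
    assert k % 4 == 0, "K must be a multiple of 4 for 2:4 sparsity"
    # View the matrix as (n, k) so that groups of four run along K.
    groups = w_kn.t().reshape(-1, 4)
    # Column indices of the two largest-magnitude entries in each group.
    keep = groups.abs().topk(2, dim=1).indices
    mask = torch.zeros_like(groups, dtype=torch.bool)
    mask.scatter_(1, keep, True)
    pruned = torch.where(mask, groups, torch.zeros_like(groups))
    # Restore the original (k, n) orientation.
    return pruned.reshape(n, k).t()


def pre_process_sketch(weight: torch.Tensor) -> torch.Tensor:
    """Sketch of the documented steps for an (out_features, in_features) weight."""
    w_t = weight.t()           # step 1: transpose to (in_features, out_features)
    w_24 = prune_2_of_4(w_t)   # step 2: inject 2:4 sparsity (assumed magnitude-based)
    return w_24.t()            # step 3: transpose back so scales use dim=-1


if __name__ == "__main__":
    weight = torch.randn(8, 16)
    sparse = pre_process_sketch(weight)
    # Number of non-zero entries in each group of four along the last dim.
    print((sparse.reshape(-1, 4) != 0).sum(dim=1))
```

The result has the same shape as the input, so the later quantization step can still compute its scales along the last dimension.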
