MarlinSparseLayout
- class torchao.dtypes.MarlinSparseLayout
- MarlinSparseLayout is a layout class for handling sparse tensor formats specifically designed for the Marlin sparse kernel. This layout is used to optimize the storage and computation of affine quantized tensors with 2:4 sparsity patterns.
- The layout ensures that the tensor data is pre-processed and stored in a format that is compatible with the Marlin sparse kernel operations. It provides methods for preprocessing input tensors and managing the layout of quantized tensors.
- pre_process(input: Tensor) → Tensor
- Preprocess the input tensor to be in the correct format for the Marlin sparse kernel.
- 1. The input tensor is transposed, because the linear layer stores its weights in a transposed format.
- 2. 2:4 sparsity is injected into the transposed tensor.
- 3. The tensor is transposed again, because the quantization process computes the scales along dim=-1.
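
The three steps can be illustrated with a minimal sketch in plain PyTorch. This is not the torchao implementation: `inject_2_4_magnitude` and `pre_process_sketch` are hypothetical names, and both the magnitude-based pruning rule and the choice of grouping axis (groups of 4 along the first dimension of the transposed tensor) are assumptions made for illustration.

```python
import torch


def inject_2_4_magnitude(t: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: in every group of 4 consecutive rows, keep the
    2 largest-magnitude values per column and zero the other 2."""
    k, n = t.shape
    if k % 4 != 0:
        raise ValueError("first dimension must be divisible by 4")
    groups = t.reshape(k // 4, 4, n)
    # Positions (within each group of 4) of the 2 largest-magnitude entries.
    keep = groups.abs().topk(2, dim=1).indices
    mask = torch.zeros_like(groups)
    mask.scatter_(1, keep, 1.0)
    return (groups * mask).reshape(k, n)


def pre_process_sketch(weight: torch.Tensor) -> torch.Tensor:
    """Assumed stand-in for the transpose / sparsify / transpose sequence."""
    w_t = weight.t()                   # step 1: undo the linear layer's transposed storage
    w_24 = inject_2_4_magnitude(w_t)   # step 2: inject 2:4 sparsity
    return w_24.t()                    # step 3: transpose back so scales use dim=-1


if __name__ == "__main__":
    w = torch.randn(8, 16)
    out = pre_process_sketch(w)
    # Every group of 4 along the last dimension now holds at most 2 nonzeros.
    print((out != 0).reshape(8, 4, 4).sum(-1).max().item())
```

Under these assumptions the result has the same shape as the input, and each group of 4 consecutive values along the last dimension contains at most 2 nonzero entries.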
 
 - Parameters:
- input (torch.Tensor) – the input tensor to preprocess 
- Returns:
- the preprocessed tensor 
- Return type: