sparsify_

torchao.sparsity.sparsify_(model: Module, config: AOBaseConfig, filter_fn: Optional[Callable[[Module, str], bool]] = None) → Module

Converts the weights of linear modules in the model according to the given config. This function is essentially the same as quantize_, but for sparsity subclasses.

Currently, we support three options for sparsity:
  • semi-structured (2:4) sparsity with semi_sparse_weight

  • int8 dynamic quantization + 2:4 sparsity with layout=SemiSparseLayout

  • int4 weight-only quantization + 2:4 sparsity with layout=SparseMarlinLayout
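As background for these options, semi-structured (2:4) sparsity requires that at most two of every four consecutive weight values are nonzero, a pattern that sparse hardware kernels can exploit by skipping the zeroed entries. A minimal sketch of pruning to this pattern, in plain Python and independent of torchao, keeping the two largest-magnitude values in each group of four:

```python
def prune_2_4(weights):
    """Prune a flat list of weights to the 2:4 semi-structured pattern:
    in every group of 4 consecutive values, keep the 2 with the largest
    magnitude and zero out the other 2. Assumes len(weights) % 4 == 0."""
    pruned = []
    for i in range(0, len(weights), 4):
        group = weights[i:i + 4]
        # indices of the two largest-magnitude entries in this group
        keep = sorted(range(4), key=lambda j: abs(group[j]), reverse=True)[:2]
        pruned.extend(v if j in keep else 0.0 for j, v in enumerate(group))
    return pruned

print(prune_2_4([0.1, -0.9, 0.5, 0.05, 2.0, -0.3, 0.2, 1.5]))
# → [0.0, -0.9, 0.5, 0.0, 2.0, 0.0, 0.0, 1.5]
```

In torchao itself this pruning and the packed sparse representation are handled for you by the workflows listed above; the sketch only illustrates the constraint the 2:4 format imposes.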

Parameters:
  • model (torch.nn.Module) – input model

  • config (AOBaseConfig) – a workflow configuration object

  • filter_fn (Optional[Callable[[torch.nn.Module, str], bool]]) – function that takes an nn.Module instance and the fully qualified name of the module, and returns True if the specified workflow should be applied to that module.

Example:

import torch
import torch.nn as nn
from torchao.sparsity import sparsify_

def filter_fn(module: nn.Module, fqn: str) -> bool:
    return isinstance(module, nn.Linear)

m = nn.Sequential(nn.Linear(32, 1024), nn.Linear(1024, 32))

# for 2:4 sparsity
from torchao.sparse_api import semi_sparse_weight
m = sparsify_(m, semi_sparse_weight(), filter_fn)

# for int8 dynamic quantization + 2:4 sparsity
from torchao.dtypes import SemiSparseLayout
from torchao.quantization import quantize_, int8_dynamic_activation_int8_weight
m = quantize_(m, int8_dynamic_activation_int8_weight(layout=SemiSparseLayout()), filter_fn)
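The third option, int4 weight-only quantization combined with 2:4 sparsity, is not covered by the example above. A sketch in the same style follows; the layout class name is taken from the option list above, but it may differ across torchao versions (some releases export it from torchao.dtypes as MarlinSparseLayout), so verify it against your installed version:

```
# for int4 weight-only quantization + 2:4 sparsity (sketch; check the
# exact layout class name in your torchao version)
from torchao.dtypes import SparseMarlinLayout
from torchao.quantization import quantize_, int4_weight_only
m = quantize_(m, int4_weight_only(layout=SparseMarlinLayout()), filter_fn)
```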
