Operator Support

This page lists the operators currently supported by the Vulkan backend. The source of truth for this information is op_registry.py, which the Vulkan Partitioner uses to determine which operators should be lowered to the Vulkan backend, and which also describes the capabilities of each operator implementation.
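
For reference, the partitioner is typically passed to ExecuTorch's lowering entry point. The following is a minimal sketch; the toy module and input shapes are illustrative placeholders, not part of this page:

```python
# Minimal sketch of lowering a model to the Vulkan backend. ToyModel and
# its input shapes are illustrative placeholders.
import torch
from executorch.backends.vulkan.partitioner.vulkan_partitioner import VulkanPartitioner
from executorch.exir import to_edge_transform_and_lower

class ToyModel(torch.nn.Module):
    def forward(self, x, y):
        return torch.nn.functional.relu(x + y)

model = ToyModel().eval()
sample_inputs = (torch.randn(1, 16), torch.randn(1, 16))

# The partitioner consults op_registry.py: supported subgraphs are
# delegated to Vulkan, while unsupported operators fall back to the
# portable CPU kernels.
et_program = to_edge_transform_and_lower(
    torch.export.export(model, sample_inputs),
    partitioner=[VulkanPartitioner()],
).to_executorch()
```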

If an operator used in your model is not in this list, feel free to create a feature request on GitHub and we will do our best to add an implementation for the operator.

The namespace of an operator indicates where it originates:

  • aten - operators in this namespace correspond 1:1 to operators in PyTorch’s ATen library. They all support fp16 and fp32 dtypes at a minimum.

  • dim_order_ops - these operators are inserted when lowering to ExecuTorch in order to manage optimal tensor memory layouts. They are typically removed, since the Vulkan backend manages optimal tensor representations internally.

  • llama - custom ops targeted for LLM inference. These are typically inserted by model source transformations applied to an nn.Module and are not invoked directly by a PyTorch model.

  • operator - these operators work with symbolic integers, which are also supported by the Vulkan backend.

  • quantized_decomposed / torchao - these ops are introduced by quantization workflows (either torchao’s quantize_ API or the PT2E quantization flow); a sketch of one such workflow follows this list. They typically represent quantizing/dequantizing a tensor, or choosing the quantization parameters for a tensor. In practice, most instances of these operators will be fused into a custom op in the et_vk namespace.

  • et_vk - these are custom operators implemented only in the Vulkan backend. They typically represent quantized variants of aten operators, or fusions of common operator patterns. They are inserted by operator fusion graph passes when lowering to the Vulkan backend.
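
As a sketch of how quantized_decomposed / torchao ops enter a graph, the following applies torchao's quantize_ API to a toy module before export. The module is illustrative, and the exact config helper (int8_weight_only here) varies across torchao versions:

```python
# Sketch, assuming a torchao version that exports quantize_ and
# int8_weight_only from torchao.quantization. ToyLinear is illustrative.
import torch
from torchao.quantization import int8_weight_only, quantize_

class ToyLinear(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = torch.nn.Linear(64, 64)

    def forward(self, x):
        return self.proj(x)

model = ToyLinear().eval()
# After this source transformation, exporting the model produces
# quantize/dequantize ops in the graph; when lowering to Vulkan, most of
# them are fused into et_vk custom ops by the backend's graph passes.
quantize_(model, int8_weight_only())
```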

All operators support dynamic input shapes unless otherwise noted (i.e. marked “no resize support” in the table below). The expectation is that, over time, all operators will support dynamic shapes.
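
For example, a model exported with a dynamic dimension, as sketched below, can be resized at runtime as long as none of its partitioned operators carry a “no resize support” note. The module and dimension names here are illustrative:

```python
# Sketch: exporting with a dynamic batch dimension. ToyModel and the
# "batch" dimension name are illustrative placeholders.
import torch
from torch.export import Dim, export

class ToyModel(torch.nn.Module):
    def forward(self, x):
        return torch.nn.functional.gelu(x)

batch = Dim("batch", max=32)
exported = export(
    ToyModel().eval(),
    (torch.randn(4, 16),),
    dynamic_shapes={"x": {0: batch}},
)
# The exported program can then be lowered to Vulkan as shown earlier;
# inputs with any batch size up to 32 are accepted at runtime.
```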

Vulkan Backend Operator Support

| Namespace | Operator | Notes |
|-----------|----------|-------|
| aten | _log_softmax | |
| aten | _native_batch_norm_legit_no_training | |
| aten | _softmax | |
| aten | _to_copy | dtype conversion between float types only |
| aten | _weight_int8pack_mm | |
| aten | abs | |
| aten | add | |
| aten | addmm | |
| aten | amax | keepdim=True required; max 2D reductions |
| aten | amin | keepdim=True required; max 2D reductions |
| aten | arange | |
| aten | avg_pool2d | |
| aten | bmm | |
| aten | cat | |
| aten | clamp | |
| aten | clone | |
| aten | constant_pad_nd | |
| aten | convolution | batch=1 for 2D conv; no transposed 1D conv; no 3D conv |
| aten | cos | |
| aten | div | |
| aten | div.Tensor_mode | |
| aten | embedding | |
| aten | eq | |
| aten | exp | |
| aten | expand_copy | no resize support |
| aten | flip | |
| aten | full | |
| aten | full_like | |
| aten | ge | |
| aten | gelu | |
| aten | gt | |
| aten | hardshrink | |
| aten | hardtanh | |
| aten | index_select | |
| aten | le | |
| aten | leaky_relu | |
| aten | linear | |
| aten | lt | |
| aten | max_pool2d | |
| aten | max_pool2d_with_indices | |
| aten | mean | keepdim=True required; max 2D reductions |
| aten | minimum | |
| aten | mm | |
| aten | native_group_norm | |
| aten | native_layer_norm | resize supported |
| aten | neg | |
| aten | ones | |
| aten | ones_like | |
| aten | permute | |
| aten | permute_copy | |
| aten | pow | |
| aten | relu | |
| aten | repeat | |
| aten | round | |
| aten | rsqrt | |
| aten | scalar_tensor | |
| aten | select_copy | |
| aten | sigmoid | |
| aten | sin | |
| aten | slice_copy | |
| aten | split | |
| aten | split_with_sizes_copy | |
| aten | sqrt | |
| aten | squeeze_copy | |
| aten | sub | |
| aten | sum | keepdim=True required; max 2D reductions |
| aten | t_copy | |
| aten | tanh | |
| aten | unsqueeze_copy | |
| aten | upsample_bilinear2d | |
| aten | upsample_nearest2d | |
| aten | view_copy | |
| aten | zeros | |
| aten | zeros_like | |
| aten | _assert_scalar | removed via graph pass |
| aten | sym_constrain_range_for_size | removed via graph pass |
| aten | sym_size | |
| dim_order_ops | _clone_dim_order | no dtype conversion; removable if no dtype change |
| dim_order_ops | _to_dim_order_copy | no dtype conversion; removable if no dtype change |
| llama | custom_sdpa | |
| llama | sdpa_with_kv_cache | |
| llama | update_cache | |
| operator | add | |
| operator | eq | |
| operator | ge | |
| operator | getitem | |
| operator | gt | |
| operator | le | |
| operator | lt | |
| quantized_decomposed | choose_qparams | |
| quantized_decomposed | choose_qparams_per_token_asymmetric | |
| quantized_decomposed | dequantize_per_channel | |
| quantized_decomposed | dequantize_per_tensor | |
| quantized_decomposed | dequantize_per_token | |
| quantized_decomposed | quantize_per_channel | |
| quantized_decomposed | quantize_per_tensor | |
| quantized_decomposed | quantize_per_token | |
| torchao | choose_qparams_affine | |
| torchao | dequantize_affine | |
| torchao | quantize_affine | |
| et_vk | add_q8ta_q8ta_q8to | no resize support |
| et_vk | apply_rotary_emb | |
| et_vk | conv2d_q8ta_q8csw_q8to | no resize support |
| et_vk | conv2d_q8ta_q8csw_q8to_dw | no resize support |
| et_vk | conv_with_clamp | batch=1 for 2D conv; no transposed 1D conv |
| et_vk | dequantize_q8to_from_conv2d | no resize support |
| et_vk | grid_priors | |
| et_vk | linear_dq8ca_q4gsw | |
| et_vk | linear_q4gsw | |
| et_vk | linear_q8ta_q8csw | |
| et_vk | linear_qcs4w | |
| et_vk | quantize_q8ta_for_conv2d | no resize support |