# Operator Support
This page lists the operators currently supported by the Vulkan backend. The source of truth for this information is op_registry.py, which the Vulkan Partitioner uses to determine which operators should be lowered to the Vulkan backend; it also describes the capabilities of each operator implementation.
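For reference, the sketch below shows the standard ExecuTorch lowering flow in which the Vulkan Partitioner is invoked. It is a minimal, illustrative example: the model, tensor shapes, and variable names are arbitrary, and the exact import paths may differ between ExecuTorch versions.

```python
import torch

from executorch.backends.vulkan.partitioner.vulkan_partitioner import VulkanPartitioner
from executorch.exir import to_edge_transform_and_lower


class SmallModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(64, 32)

    def forward(self, x):
        return torch.relu(self.linear(x))


model = SmallModel().eval()
example_inputs = (torch.randn(1, 64),)

# Export the model, then lower it. The Vulkan Partitioner consults
# op_registry.py to decide which nodes are delegated to the Vulkan backend.
exported_program = torch.export.export(model, example_inputs)
edge_program = to_edge_transform_and_lower(
    exported_program,
    partitioner=[VulkanPartitioner()],
)
executorch_program = edge_program.to_executorch()
```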
If an operator used in your model is not in this list, feel free to create a feature request on GitHub and we will do our best to add an implementation for the operator.
The namespace of an operator describes where it originates from:
- aten - operators in this namespace correspond 1:1 to operators in PyTorch’s ATen library. They all support fp16 and fp32 dtypes at a minimum.
- dim_order_ops - these operators are inserted when lowering to ExecuTorch to manage optimal tensor memory layouts. They are typically removed, since the Vulkan backend manages optimal tensor representations internally.
- llama - custom ops targeted for LLM inference. These are typically inserted by model source transformations applied to an nn.Module and are not invoked directly by a PyTorch model.
- operator - these operators work with symbolic integers, which are also supported by the Vulkan backend.
- quantized_decomposed / torchao - these ops are introduced by quantization workflows (either torchao’s quantize_ API or the PT2E quantization flow; see the sketch after this list). They typically represent quantizing/dequantizing a tensor, or choosing the quantization parameters for a tensor. In practice, most instances of these operators will be fused into a custom op in the et_vk namespace.
- et_vk - these are custom operators implemented only in the Vulkan backend. They typically represent quantized variants of aten operators, or fusions of common operator patterns. They are inserted by operator fusion graph passes when lowering to the Vulkan backend.
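As a hedged illustration of how quantization ops enter the graph, the sketch below quantizes a module with torchao’s quantize_ API before export. The int8_weight_only config and the toy model are only examples; the available configs vary by torchao version, and some versions may require an additional tensor-subclass unwrapping step before export.

```python
import torch

from torchao.quantization import int8_weight_only, quantize_

# Quantize the linear weights in-place. int8_weight_only is only one example
# config; the set of available configs depends on the installed torchao version.
model = torch.nn.Sequential(torch.nn.Linear(128, 64)).eval()
quantize_(model, int8_weight_only())

# Exporting the quantized module introduces torchao quantize/dequantize ops
# into the graph. When lowering to Vulkan, fusion passes typically replace
# them with et_vk custom ops (e.g. quantized linear variants).
exported_program = torch.export.export(model, (torch.randn(1, 128),))
```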
All operators support dynamic input shapes unless otherwise noted (i.e. “no resize support”). The expectation is that over time, all operators will be able to support dynamic shapes.
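Dynamic input shapes are declared at export time using the standard torch.export API. The sketch below is illustrative (the AddConstant module and dimension bounds are arbitrary); operators marked “no resize support” will not handle changes to the dynamic dimension at runtime.

```python
import torch
from torch.export import Dim


class AddConstant(torch.nn.Module):
    def forward(self, x):
        return x + 1.0


# Mark the batch dimension of "x" as dynamic; for operators with resize
# support, the Vulkan backend can then handle varying batch sizes at runtime.
batch = Dim("batch", min=1, max=8)
exported_program = torch.export.export(
    AddConstant(),
    (torch.randn(2, 16),),
    dynamic_shapes={"x": {0: batch}},
)
```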
| Namespace | Operator | Notes |
|---|---|---|
| aten | _log_softmax | |
| aten | _native_batch_norm_legit_no_training | |
| aten | _softmax | |
| aten | _to_copy | dtype conversion between float types only |
| aten | _weight_int8pack_mm | |
| aten | abs | |
| aten | add | |
| aten | addmm | |
| aten | amax | keepdim=True required; max 2D reductions |
| aten | amin | keepdim=True required; max 2D reductions |
| aten | arange | |
| aten | avg_pool2d | |
| aten | bmm | |
| aten | cat | |
| aten | clamp | |
| aten | clone | |
| aten | constant_pad_nd | |
| aten | convolution | batch=1 for 2D conv; no transposed 1D conv; no 3D conv |
| aten | cos | |
| aten | div | |
| aten | div.Tensor_mode | |
| aten | embedding | |
| aten | eq | |
| aten | exp | |
| aten | expand_copy | no resize support |
| aten | flip | |
| aten | full | |
| aten | full_like | |
| aten | ge | |
| aten | gelu | |
| aten | gt | |
| aten | hardshrink | |
| aten | hardtanh | |
| aten | index_select | |
| aten | le | |
| aten | leaky_relu | |
| aten | linear | |
| aten | lt | |
| aten | max_pool2d | |
| aten | max_pool2d_with_indices | |
| aten | mean | keepdim=True required; max 2D reductions |
| aten | minimum | |
| aten | mm | |
| aten | native_group_norm | |
| aten | native_layer_norm | resize supported |
| aten | neg | |
| aten | ones | |
| aten | ones_like | |
| aten | permute | |
| aten | permute_copy | |
| aten | pow | |
| aten | relu | |
| aten | repeat | |
| aten | round | |
| aten | rsqrt | |
| aten | scalar_tensor | |
| aten | select_copy | |
| aten | sigmoid | |
| aten | sin | |
| aten | slice_copy | |
| aten | split | |
| aten | split_with_sizes_copy | |
| aten | sqrt | |
| aten | squeeze_copy | |
| aten | sub | |
| aten | sum | keepdim=True required; max 2D reductions |
| aten | t_copy | |
| aten | tanh | |
| aten | unsqueeze_copy | |
| aten | upsample_bilinear2d | |
| aten | upsample_nearest2d | |
| aten | view_copy | |
| aten | zeros | |
| aten | zeros_like | |
| aten | _assert_scalar | removed via graph pass |
| aten | sym_constrain_range_for_size | removed via graph pass |
| aten | sym_size | |
| dim_order_ops | _clone_dim_order | no dtype conversion; removable if no dtype change |
| dim_order_ops | _to_dim_order_copy | no dtype conversion; removable if no dtype change |
| llama | custom_sdpa | |
| llama | sdpa_with_kv_cache | |
| llama | update_cache | |
| operator | add | |
| operator | eq | |
| operator | ge | |
| operator | getitem | |
| operator | gt | |
| operator | le | |
| operator | lt | |
| quantized_decomposed | choose_qparams | |
| quantized_decomposed | choose_qparams_per_token_asymmetric | |
| quantized_decomposed | dequantize_per_channel | |
| quantized_decomposed | dequantize_per_tensor | |
| quantized_decomposed | dequantize_per_token | |
| quantized_decomposed | quantize_per_channel | |
| quantized_decomposed | quantize_per_tensor | |
| quantized_decomposed | quantize_per_token | |
| torchao | choose_qparams_affine | |
| torchao | dequantize_affine | |
| torchao | quantize_affine | |
| et_vk | add_q8ta_q8ta_q8to | no resize support |
| et_vk | apply_rotary_emb | |
| et_vk | conv2d_q8ta_q8csw_q8to | no resize support |
| et_vk | conv2d_q8ta_q8csw_q8to_dw | no resize support |
| et_vk | conv_with_clamp | batch=1 for 2D conv; no transposed 1D conv |
| et_vk | dequantize_q8to_from_conv2d | no resize support |
| et_vk | grid_priors | |
| et_vk | linear_dq8ca_q4gsw | |
| et_vk | linear_q4gsw | |
| et_vk | linear_q8ta_q8csw | |
| et_vk | linear_qcs4w | |
| et_vk | quantize_q8ta_for_conv2d | no resize support |