torch.nn.functional
Convolution functions
conv1d | 
Applies a 1D convolution over an input signal composed of several input planes.  | 
conv2d | 
Applies a 2D convolution over an input image composed of several input planes.  | 
conv3d | 
Applies a 3D convolution over an input image composed of several input planes.  | 
conv_transpose1d | 
Applies a 1D transposed convolution operator over an input signal composed of several input planes, sometimes also called "deconvolution".  | 
conv_transpose2d | 
Applies a 2D transposed convolution operator over an input image composed of several input planes, sometimes also called "deconvolution".  | 
conv_transpose3d | 
Applies a 3D transposed convolution operator over an input image composed of several input planes, sometimes also called "deconvolution".  | 
unfold | 
Extract sliding local blocks from a batched input tensor.  | 
fold | 
Combine an array of sliding local blocks into a large containing tensor.  | 
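As a rough illustration of how conv2d and unfold relate, the sketch below computes the same convolution two ways: directly, and by extracting sliding blocks with unfold and multiplying by the flattened kernel. The tensor sizes and variable names are arbitrary choices for the example, not part of the reference.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(1, 3, 8, 8)   # (batch, in_channels, height, width)
w = torch.randn(4, 3, 3, 3)   # (out_channels, in_channels, kH, kW)

# Direct convolution: no padding, stride 1, so each spatial dim shrinks by kH - 1 = 2.
y = F.conv2d(x, w)

# Same computation via unfold: extract every 3x3 block as a column,
# then apply the kernel as a matrix product over the flattened blocks.
patches = F.unfold(x, kernel_size=3)                   # (1, 3*3*3, 6*6)
y_unfold = (w.view(4, -1) @ patches).view(1, 4, 6, 6)

conv_shape = tuple(y.shape)
matches = torch.allclose(y, y_unfold, atol=1e-4)
print(conv_shape, matches)
```

This equivalence is mainly useful for understanding; conv2d itself is normally the better choice in practice.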
Pooling functions
avg_pool1d | 
Applies a 1D average pooling over an input signal composed of several input planes.  | 
avg_pool2d | 
Applies a 2D average-pooling operation in $kH \times kW$ regions by step size $sH \times sW$ steps.  | 
avg_pool3d | 
Applies a 3D average-pooling operation in $kT \times kH \times kW$ regions by step size $sT \times sH \times sW$ steps.  | 
max_pool1d | 
Applies a 1D max pooling over an input signal composed of several input planes.  | 
max_pool2d | 
Applies a 2D max pooling over an input signal composed of several input planes.  | 
max_pool3d | 
Applies a 3D max pooling over an input signal composed of several input planes.  | 
max_unpool1d | 
Compute a partial inverse of MaxPool1d.  | 
max_unpool2d | 
Compute a partial inverse of MaxPool2d.  | 
max_unpool3d | 
Compute a partial inverse of MaxPool3d.  | 
lp_pool1d | 
Apply a 1D power-average pooling over an input signal composed of several input planes.  | 
lp_pool2d | 
Apply a 2D power-average pooling over an input signal composed of several input planes.  | 
lp_pool3d | 
Apply a 3D power-average pooling over an input signal composed of several input planes.  | 
adaptive_max_pool1d | 
Applies a 1D adaptive max pooling over an input signal composed of several input planes.  | 
adaptive_max_pool2d | 
Applies a 2D adaptive max pooling over an input signal composed of several input planes.  | 
adaptive_max_pool3d | 
Applies a 3D adaptive max pooling over an input signal composed of several input planes.  | 
adaptive_avg_pool1d | 
Applies a 1D adaptive average pooling over an input signal composed of several input planes.  | 
adaptive_avg_pool2d | 
Apply a 2D adaptive average pooling over an input signal composed of several input planes.  | 
adaptive_avg_pool3d | 
Apply a 3D adaptive average pooling over an input signal composed of several input planes.  | 
fractional_max_pool2d | 
Applies 2D fractional max pooling over an input signal composed of several input planes.  | 
fractional_max_pool3d | 
Applies 3D fractional max pooling over an input signal composed of several input planes.  | 
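The following sketch shows why max_unpool2d is only a partial inverse of max pooling: the maxima are restored to their original positions, but every other element comes back as zero. The 4x4 input is an arbitrary example chosen so the values are easy to follow.

```python
import torch
import torch.nn.functional as F

# A 4x4 "image" holding 0..15, with batch and channel dims of size 1.
x = torch.arange(16, dtype=torch.float32).view(1, 1, 4, 4)

# 2x2 max pooling (stride defaults to the kernel size); keep the argmax indices.
pooled, idx = F.max_pool2d(x, kernel_size=2, return_indices=True)

# Partial inverse: maxima return to their original positions, the rest is zero.
restored = F.max_unpool2d(pooled, idx, kernel_size=2)

# Adaptive pooling takes the desired output size instead of a kernel size.
global_avg = F.adaptive_avg_pool2d(x, output_size=1)

print(pooled.flatten().tolist())
print(restored[0, 0])
print(global_avg.item())
```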
Attention Mechanisms
The torch.nn.attention.bias module contains attention_biases that are designed to be used with
scaled_dot_product_attention.
scaled_dot_product_attention | 
scaled_dot_product_attention(query, key, value, attn_mask=None, dropout_p=0.0,  | 
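A minimal sketch of what scaled_dot_product_attention computes when only query, key and value are passed (no mask, no dropout), compared against a hand-written reference. The (batch, heads, sequence, head_dim) layout and the sizes are assumptions made for the example.

```python
import math
import torch
import torch.nn.functional as F

torch.manual_seed(0)
# Assumed layout for this example: (batch, heads, sequence length, head dim).
q = torch.randn(2, 4, 5, 8)
k = torch.randn(2, 4, 5, 8)
v = torch.randn(2, 4, 5, 8)

# Fused implementation with default attn_mask=None and dropout_p=0.0.
out = F.scaled_dot_product_attention(q, k, v)

# Reference: softmax(Q K^T / sqrt(d)) V, written out explicitly.
scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
ref = torch.softmax(scores, dim=-1) @ v

close = torch.allclose(out, ref, atol=1e-4)
print(tuple(out.shape), close)
```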
Non-linear activation functions
threshold | 
Apply a threshold to each element of the input Tensor.  | 
threshold_ | 
In-place version of threshold().  | 
relu | 
Applies the rectified linear unit function element-wise.  | 
relu_ | 
In-place version of relu().  | 
hardtanh | 
Applies the HardTanh function element-wise.  | 
hardtanh_ | 
In-place version of hardtanh().  | 
hardswish | 
Apply hardswish function, element-wise.  | 
relu6 | 
Applies the element-wise function $\text{ReLU6}(x) = \min(\max(0, x), 6)$.  | 
elu | 
Apply the Exponential Linear Unit (ELU) function element-wise.  | 
elu_ | 
In-place version of elu().  | 
selu | 
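Applies element-wise, $\text{SELU}(x) = \text{scale} \cdot (\max(0, x) + \min(0, \alpha \cdot (\exp(x) - 1)))$, with $\alpha = 1.6732632423543772848170429916717$ and $\text{scale} = 1.0507009873554804934193349852946$.  | 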
celu | 
Applies element-wise, $\text{CELU}(x) = \max(0, x) + \min(0, \alpha \cdot (\exp(x / \alpha) - 1))$.  | 
leaky_relu | 
Applies element-wise, $\text{LeakyReLU}(x) = \max(0, x) + \text{negative\_slope} \cdot \min(0, x)$.  | 
leaky_relu_ | 
In-place version of leaky_relu().  | 
prelu | 
Applies element-wise the function $\text{PReLU}(x) = \max(0, x) + \text{weight} \cdot \min(0, x)$, where weight is a learnable parameter.  | 
rrelu | 
Randomized leaky ReLU.  | 
rrelu_ | 
In-place version of rrelu().  | 
glu | 
The gated linear unit.  | 
gelu | 
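When the approximate argument is 'none', it applies element-wise the function $\text{GELU}(x) = x \cdot \Phi(x)$, where $\Phi(x)$ is the cumulative distribution function of the standard Gaussian distribution.  | 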
logsigmoid | 
Applies element-wise $\text{LogSigmoid}(x_i) = \log\left(\frac{1}{1 + \exp(-x_i)}\right)$.  | 
hardshrink | 
Applies the hard shrinkage function element-wise.  | 
tanhshrink | 
Applies element-wise, $\text{Tanhshrink}(x) = x - \tanh(x)$.  | 
softsign | 
Applies element-wise, the function $\text{SoftSign}(x) = \frac{x}{1 + \lvert x \rvert}$.  | 
softplus | 
Applies element-wise, the function $\text{Softplus}(x) = \frac{1}{\beta} \log(1 + \exp(\beta \cdot x))$.  | 
softmin | 
Apply a softmin function.  | 
softmax | 
Apply a softmax function.  | 
softshrink | 
Applies the soft shrinkage function element-wise.  | 
gumbel_softmax | 
Sample from the Gumbel-Softmax distribution and optionally discretize.  | 
log_softmax | 
Apply a softmax followed by a logarithm.  | 
tanh | 
Applies element-wise, $\tanh(x) = \frac{\exp(x) - \exp(-x)}{\exp(x) + \exp(-x)}$.  | 
sigmoid | 
Applies the element-wise function $\text{Sigmoid}(x) = \frac{1}{1 + \exp(-x)}$.  | 
hardsigmoid | 
Apply the Hardsigmoid function element-wise.  | 
silu | 
Apply the Sigmoid Linear Unit (SiLU) function, element-wise.  | 
mish | 
Apply the Mish function, element-wise.  | 
batch_norm | 
Apply Batch Normalization for each channel across a batch of data.  | 
group_norm | 
Apply Group Normalization, normalizing over groups of channels.  | 
instance_norm | 
Apply Instance Normalization independently for each channel in every data sample within a batch.  | 
layer_norm | 
Apply Layer Normalization over the last several dimensions, as specified by normalized_shape.  | 
local_response_norm | 
Apply local response normalization over an input signal.  | 
rms_norm | 
Apply Root Mean Square Layer Normalization.  | 
normalize | 
Perform normalization of inputs over specified dimension.  | 
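To make a few of the element-wise definitions above concrete, the sketch below checks several activations against their closed-form expressions and runs a quick sanity check on layer_norm. The input values are arbitrary, and the tolerances are loose assumptions for float32 arithmetic.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-2.0, -0.5, 0.0, 0.5, 8.0])

# ReLU6(x) = min(max(0, x), 6)
relu6_ok = torch.allclose(F.relu6(x), torch.clamp(x, min=0.0, max=6.0))

# LeakyReLU(x) = max(0, x) + negative_slope * min(0, x)
leaky_ref = torch.clamp(x, min=0.0) + 0.1 * torch.clamp(x, max=0.0)
leaky_ok = torch.allclose(F.leaky_relu(x, negative_slope=0.1), leaky_ref)

# Tanhshrink(x) = x - tanh(x) and SoftSign(x) = x / (1 + |x|)
tanhshrink_ok = torch.allclose(F.tanhshrink(x), x - torch.tanh(x), atol=1e-6)
softsign_ok = torch.allclose(F.softsign(x), x / (1 + x.abs()))

# softmax yields a probability vector; log_softmax is its logarithm.
probs = F.softmax(x, dim=0)
log_softmax_ok = torch.allclose(F.log_softmax(x, dim=0), probs.log(), atol=1e-5)

# layer_norm over the last dimension gives each row (approximately) zero mean.
torch.manual_seed(0)
t = torch.randn(3, 4)
row_means = F.layer_norm(t, normalized_shape=(4,)).mean(dim=-1)

print(relu6_ok, leaky_ok, tanhshrink_ok, softsign_ok, log_softmax_ok)
```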
Linear functions
linear | 
Applies a linear transformation to the incoming data: $y = xA^T + b$.  | 
bilinear | 
Applies a bilinear transformation to the incoming data: $y = x_1^T A x_2 + b$.  | 
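The two transformations can be checked against explicit matrix expressions, as in the sketch below. The shapes are arbitrary, and the einsum reference for bilinear assumes the weight is laid out as (out_features, in1_features, in2_features).

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# linear: weight has shape (out_features, in_features), so the product uses its transpose.
x = torch.randn(5, 3)
A = torch.randn(2, 3)
b = torch.randn(2)
linear_ok = torch.allclose(F.linear(x, A, b), x @ A.T + b, atol=1e-5)

# bilinear: one (in1_features, in2_features) matrix per output feature.
x1 = torch.randn(5, 3)
x2 = torch.randn(5, 4)
W = torch.randn(2, 3, 4)
bias = torch.randn(2)
bilinear_ref = torch.einsum("bi,oij,bj->bo", x1, W, x2) + bias
bilinear_ok = torch.allclose(F.bilinear(x1, x2, W, bias), bilinear_ref, atol=1e-4)

print(linear_ok, bilinear_ok)
```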
Dropout functions
dropout | 
During training, randomly zeroes some elements of the input tensor with probability p.  | 
alpha_dropout | 
Apply alpha dropout to the input.  | 
feature_alpha_dropout | 
Randomly masks out entire channels (a channel is a feature map).  | 
dropout1d | 
Randomly zero out entire channels (a channel is a 1D feature map).  | 
dropout2d | 
Randomly zero out entire channels (a channel is a 2D feature map).  | 
dropout3d | 
Randomly zero out entire channels (a channel is a 3D feature map).  | 
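A small sketch of the training/evaluation distinction for the functional dropout: with training=True, surviving elements are rescaled by 1 / (1 - p), and with training=False the input passes through unchanged. The all-ones input and p = 0.5 are chosen only to make the scaling visible.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.ones(1000)

# Training mode: each element is zeroed with probability p;
# survivors are scaled by 1 / (1 - p) = 2 so the expected value is preserved.
y_train = F.dropout(x, p=0.5, training=True)

# Evaluation mode: dropout is a no-op.
y_eval = F.dropout(x, p=0.5, training=False)

values = set(y_train.unique().tolist())
print(values, torch.equal(y_eval, x))
```

Unlike the nn.Dropout module, the functional form does not track a module's train/eval state, so the training flag has to be passed explicitly.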
Sparse functions
embedding | 
Generate a simple lookup table that looks up embeddings in a fixed dictionary and size.  | 
embedding_bag | 
Compute sums, means or maxes of bags of embeddings.  | 
one_hot | 
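Takes LongTensor with index values of shape (*) and returns a tensor of shape (*, num_classes) that has zeros everywhere except where the index of the last dimension matches the corresponding value of the input tensor, in which case it is 1.  | 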
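The sketch below illustrates the link between one_hot and embedding: looking up rows of a table by index gives the same result as multiplying the one-hot encoding by that table. The 3x4 table is a made-up example.

```python
import torch
import torch.nn.functional as F

idx = torch.tensor([0, 2, 1])

# One-hot encoding: shape (3,) -> (3, num_classes), dtype int64.
oh = F.one_hot(idx, num_classes=3)

# A toy embedding table with 3 entries of dimension 4.
table = torch.arange(12, dtype=torch.float32).view(3, 4)

# embedding selects rows 0, 2 and 1 of the table.
emb = F.embedding(idx, table)

# The same rows, obtained as a one-hot matrix product.
same = torch.equal(emb, oh.float() @ table)
print(oh.tolist(), same)
```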
Distance functions
pairwise_distance | 
See torch.nn.PairwiseDistance for details.  | 
cosine_similarity | 
Returns cosine similarity between x1 and x2, computed along dim.  | 
pdist | 
Computes the p-norm distance between every pair of row vectors in the input.  | 
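A brief sketch of the three distance functions on hand-picked 2D vectors, so the results can be verified by inspection. The inputs are illustrative only.

```python
import torch
import torch.nn.functional as F

a = torch.tensor([[1.0, 0.0], [0.0, 2.0]])
b = torch.tensor([[1.0, 0.0], [3.0, 0.0]])

# Row-wise cosine similarity: identical directions give 1, orthogonal give 0.
cos = F.cosine_similarity(a, b, dim=1)

# Row-wise Euclidean distance (p=2 by default). A small eps is added to the
# difference for numerical stability, so identical rows give a tiny nonzero value.
dist = F.pairwise_distance(a, b)

# Distances between every pair of rows within one tensor: here just rows 0 and 1 of a.
pd = F.pdist(a)

print(cos.tolist(), dist.tolist(), pd.tolist())
```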
Loss functions
binary_cross_entropy | 
Measure Binary Cross Entropy between the target and input probabilities.  | 
binary_cross_entropy_with_logits | 
Calculate Binary Cross Entropy between target and input logits.  | 
poisson_nll_loss | 
Poisson negative log likelihood loss.  | 
cosine_embedding_loss | 
See CosineEmbeddingLoss for details.  | 
cross_entropy | 
Compute the cross entropy loss between input logits and target.  | 
ctc_loss | 
Apply the Connectionist Temporal Classification loss.  | 
gaussian_nll_loss | 
Gaussian negative log likelihood loss.  | 
hinge_embedding_loss | 
See HingeEmbeddingLoss for details.  | 
kl_div | 
Compute the KL Divergence loss.  | 
l1_loss | 
Function that takes the mean element-wise absolute value difference.  | 
mse_loss | 
Measures the element-wise mean squared error.  | 
margin_ranking_loss | 
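See MarginRankingLoss for details.  | 
multilabel_margin_loss | 
See MultiLabelMarginLoss for details.  | 
multilabel_soft_margin_loss | 
See MultiLabelSoftMarginLoss for details.  | 
multi_margin_loss | 
See MultiMarginLoss for details.  | 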
nll_loss | 
Compute the negative log likelihood loss.  | 
huber_loss | 
Compute the Huber loss.  | 
smooth_l1_loss | 
Compute the Smooth L1 loss.  | 
soft_margin_loss | 
See SoftMarginLoss for details.  | 
triplet_margin_loss | 
Compute the triplet loss between given input tensors and a margin greater than 0.  | 
triplet_margin_with_distance_loss | 
Compute the triplet margin loss for input tensors using a custom distance function.  | 
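Several of the losses listed above are related by simple identities. The sketch below checks three of them numerically, using random inputs and the default mean reduction; the tolerances are assumptions for float32.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# cross_entropy on logits equals nll_loss applied to log_softmax of the logits.
logits = torch.randn(4, 3)
target = torch.tensor([0, 2, 1, 2])
ce = F.cross_entropy(logits, target)
nll = F.nll_loss(F.log_softmax(logits, dim=1), target)
ce_ok = torch.allclose(ce, nll, atol=1e-5)

# binary_cross_entropy_with_logits fuses the sigmoid with binary_cross_entropy.
z = torch.randn(4)
t = torch.tensor([0.0, 1.0, 1.0, 0.0])
bce_ok = torch.allclose(
    F.binary_cross_entropy_with_logits(z, t),
    F.binary_cross_entropy(torch.sigmoid(z), t),
    atol=1e-5,
)

# With beta = delta = 1, smooth_l1_loss and huber_loss coincide.
pred = torch.randn(6) * 3
goal = torch.zeros(6)
huber_ok = torch.allclose(
    F.smooth_l1_loss(pred, goal, beta=1.0),
    F.huber_loss(pred, goal, delta=1.0),
    atol=1e-5,
)

print(ce_ok, bce_ok, huber_ok)
```

The fused forms (cross_entropy, binary_cross_entropy_with_logits) are generally preferred for numerical stability.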
Vision functions
pixel_shuffle | 
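Rearranges elements in a tensor of shape $(*, C \times r^2, H, W)$ to a tensor of shape $(*, C, H \times r, W \times r)$, where r is the upscale_factor.  | 
pixel_unshuffle | 
Reverses the PixelShuffle operation by rearranging elements in a tensor of shape $(*, C, H \times r, W \times r)$ to a tensor of shape $(*, C \times r^2, H, W)$, where r is the downscale_factor.  | 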
pad | 
Pads tensor.  | 
interpolate | 
Down/up samples the input.  | 
upsample | 
Upsample input.  | 
upsample_nearest | 
Upsamples the input, using nearest neighbours' pixel values.  | 
upsample_bilinear | 
Upsamples the input, using bilinear upsampling.  | 
grid_sample | 
Compute grid sample.  | 
affine_grid | 
Generate 2D or 3D flow field (sampling grid), given a batch of affine matrices theta.  | 
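A short sketch of the shape bookkeeping for a few of the vision functions above. The input uses C = 1 and r = 2, so it has C * r^2 = 4 channels; all sizes are illustrative.

```python
import torch
import torch.nn.functional as F

# Shape (1, C * r^2, H, W) with C = 1, r = 2, H = W = 2.
x = torch.arange(16, dtype=torch.float32).view(1, 4, 2, 2)

# pixel_shuffle trades channels for spatial resolution: (1, 4, 2, 2) -> (1, 1, 4, 4).
up = F.pixel_shuffle(x, upscale_factor=2)

# pixel_unshuffle reverses the rearrangement exactly.
back = F.pixel_unshuffle(up, downscale_factor=2)

# pad takes (left, right, top, bottom, ...) starting from the last dimension:
# here the width grows by one column on each side.
padded = F.pad(x, (1, 1, 0, 0))

# interpolate resizes the spatial dimensions; nearest mode repeats pixels.
resized = F.interpolate(x, scale_factor=2, mode="nearest")

print(tuple(up.shape), tuple(padded.shape), tuple(resized.shape))
```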