Aliases in torch.nn

Created On: Jul 25, 2025 | Last Updated On: Jul 25, 2025

The following are aliases to their counterparts in torch.nn in nested namespaces.
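
For example, the nested names are the very same class objects as their torch.nn counterparts, so the two spellings are interchangeable (a minimal sketch):

```python
import torch.nn as nn

# Aliases point at the same class objects, not copies, so isinstance
# checks, pickling, and subclassing behave identically.
assert nn.modules.linear.Linear is nn.Linear
assert nn.modules.conv.Conv2d is nn.Conv2d
```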

torch.nn.modules

The following are aliases to their counterparts in torch.nn in the torch.nn.modules namespace.

Containers (Aliases)

container.Sequential

A sequential container.

container.ModuleList

Holds submodules in a list.

container.ModuleDict

Holds submodules in a dictionary.

container.ParameterList

Holds parameters in a list.

container.ParameterDict

Holds parameters in a dictionary.
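
A minimal sketch of building a model through the container aliases (layer sizes are arbitrary, for illustration only):

```python
import torch
from torch.nn.modules import container

# container.Sequential is the same class as torch.nn.Sequential.
model = container.Sequential(
    torch.nn.Linear(4, 8),
    torch.nn.ReLU(),
    torch.nn.Linear(8, 2),
)
out = model(torch.randn(3, 4))  # output shape: (3, 2)
```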

Convolution Layers (Aliases)

conv.Conv1d

Applies a 1D convolution over an input signal composed of several input planes.

conv.Conv2d

Applies a 2D convolution over an input signal composed of several input planes.

conv.Conv3d

Applies a 3D convolution over an input signal composed of several input planes.

conv.ConvTranspose1d

Applies a 1D transposed convolution operator over an input image composed of several input planes.

conv.ConvTranspose2d

Applies a 2D transposed convolution operator over an input image composed of several input planes.

conv.ConvTranspose3d

Applies a 3D transposed convolution operator over an input image composed of several input planes.

conv.LazyConv1d

A torch.nn.Conv1d module with lazy initialization of the in_channels argument.

conv.LazyConv2d

A torch.nn.Conv2d module with lazy initialization of the in_channels argument.

conv.LazyConv3d

A torch.nn.Conv3d module with lazy initialization of the in_channels argument.

conv.LazyConvTranspose1d

A torch.nn.ConvTranspose1d module with lazy initialization of the in_channels argument.

conv.LazyConvTranspose2d

A torch.nn.ConvTranspose2d module with lazy initialization of the in_channels argument.

conv.LazyConvTranspose3d

A torch.nn.ConvTranspose3d module with lazy initialization of the in_channels argument.

fold.Unfold

Extracts sliding local blocks from a batched input tensor.

fold.Fold

Combines an array of sliding local blocks into a large containing tensor.
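
A short sketch using the conv and fold aliases (all sizes arbitrary):

```python
import torch
from torch.nn.modules import conv, fold

layer = conv.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
x = torch.randn(1, 3, 32, 32)
assert layer(x).shape == (1, 16, 32, 32)  # padding=1 preserves 32x32

# fold.Unfold extracts sliding 3x3 patches: (N, C * 3 * 3, L) with L = 30 * 30.
patches = fold.Unfold(kernel_size=3)(x)
assert patches.shape == (1, 27, 900)
```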

Pooling Layers (Aliases)

pooling.MaxPool1d

Applies a 1D max pooling over an input signal composed of several input planes.

pooling.MaxPool2d

Applies a 2D max pooling over an input signal composed of several input planes.

pooling.MaxPool3d

Applies a 3D max pooling over an input signal composed of several input planes.

pooling.MaxUnpool1d

Computes a partial inverse of MaxPool1d.

pooling.MaxUnpool2d

Computes a partial inverse of MaxPool2d.

pooling.MaxUnpool3d

Computes a partial inverse of MaxPool3d.

pooling.AvgPool1d

Applies a 1D average pooling over an input signal composed of several input planes.

pooling.AvgPool2d

Applies a 2D average pooling over an input signal composed of several input planes.

pooling.AvgPool3d

Applies a 3D average pooling over an input signal composed of several input planes.

pooling.FractionalMaxPool2d

Applies a 2D fractional max pooling over an input signal composed of several input planes.

pooling.FractionalMaxPool3d

Applies a 3D fractional max pooling over an input signal composed of several input planes.

pooling.LPPool1d

Applies a 1D power-average pooling over an input signal composed of several input planes.

pooling.LPPool2d

Applies a 2D power-average pooling over an input signal composed of several input planes.

pooling.LPPool3d

Applies a 3D power-average pooling over an input signal composed of several input planes.

pooling.AdaptiveMaxPool1d

Applies a 1D adaptive max pooling over an input signal composed of several input planes.

pooling.AdaptiveMaxPool2d

Applies a 2D adaptive max pooling over an input signal composed of several input planes.

pooling.AdaptiveMaxPool3d

Applies a 3D adaptive max pooling over an input signal composed of several input planes.

pooling.AdaptiveAvgPool1d

Applies a 1D adaptive average pooling over an input signal composed of several input planes.

pooling.AdaptiveAvgPool2d

Applies a 2D adaptive average pooling over an input signal composed of several input planes.

pooling.AdaptiveAvgPool3d

Applies a 3D adaptive average pooling over an input signal composed of several input planes.
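
A minimal sketch of the pooling aliases (shapes arbitrary):

```python
import torch
from torch.nn.modules import pooling

x = torch.randn(1, 8, 16, 16)
assert pooling.MaxPool2d(kernel_size=2)(x).shape == (1, 8, 8, 8)
# Adaptive pooling targets an output size instead of a kernel size.
assert pooling.AdaptiveAvgPool2d(output_size=1)(x).shape == (1, 8, 1, 1)
```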

Padding Layers (Aliases)

padding.ReflectionPad1d

Pads the input tensor using the reflection of the input boundary.

padding.ReflectionPad2d

Pads the input tensor using the reflection of the input boundary.

padding.ReflectionPad3d

Pads the input tensor using the reflection of the input boundary.

padding.ReplicationPad1d

Pads the input tensor using replication of the input boundary.

padding.ReplicationPad2d

Pads the input tensor using replication of the input boundary.

padding.ReplicationPad3d

Pads the input tensor using replication of the input boundary.

padding.ZeroPad1d

Pads the input tensor boundaries with zero.

padding.ZeroPad2d

Pads the input tensor boundaries with zero.

padding.ZeroPad3d

Pads the input tensor boundaries with zero.

padding.ConstantPad1d

Pads the input tensor boundaries with a constant value.

padding.ConstantPad2d

Pads the input tensor boundaries with a constant value.

padding.ConstantPad3d

Pads the input tensor boundaries with a constant value.

padding.CircularPad1d

Pads the input tensor using circular padding of the input boundary.

padding.CircularPad2d

Pads the input tensor using circular padding of the input boundary.

padding.CircularPad3d

Pads the input tensor using circular padding of the input boundary.
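
A minimal sketch of the padding aliases (shapes arbitrary):

```python
import torch
from torch.nn.modules import padding

x = torch.randn(1, 3, 8, 8)
# Padding of 1 adds one row/column on every side: 8x8 -> 10x10.
assert padding.ReflectionPad2d(1)(x).shape == (1, 3, 10, 10)
assert padding.ZeroPad2d(1)(x).shape == (1, 3, 10, 10)
```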

Non-linear Activations (weighted sum, nonlinearity) (Aliases)

activation.ELU

Applies the Exponential Linear Unit (ELU) function, element-wise.

activation.Hardshrink

Applies the Hard Shrinkage (Hardshrink) function element-wise.

activation.Hardsigmoid

Applies the Hardsigmoid function element-wise.

activation.Hardtanh

Applies the HardTanh function element-wise.

activation.Hardswish

Applies the Hardswish function, element-wise.

activation.LeakyReLU

Applies the LeakyReLU function element-wise.

activation.LogSigmoid

Applies the Logsigmoid function element-wise.

activation.MultiheadAttention

Allows the model to jointly attend to information from different representation subspaces.

activation.PReLU

Applies the element-wise PReLU function.

activation.ReLU

Applies the rectified linear unit function element-wise.

activation.ReLU6

Applies the ReLU6 function element-wise.

activation.RReLU

Applies the randomized leaky rectified linear unit function, element-wise.

activation.SELU

Applies the SELU function element-wise.

activation.CELU

Applies the CELU function element-wise.

activation.GELU

Applies the Gaussian Error Linear Units function.

activation.Sigmoid

Applies the Sigmoid function element-wise.

activation.SiLU

Applies the Sigmoid Linear Unit (SiLU) function, element-wise.

activation.Mish

Applies the Mish function, element-wise.

activation.Softplus

Applies the Softplus function element-wise.

activation.Softshrink

Applies the soft shrinkage function element-wise.

activation.Softsign

Applies the element-wise Softsign function.

activation.Tanh

Applies the Hyperbolic Tangent (Tanh) function element-wise.

activation.Tanhshrink

Applies the element-wise Tanhshrink function.

activation.Threshold

Thresholds each element of the input Tensor.

activation.GLU

Applies the gated linear unit function.
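
A minimal sketch of the activation aliases:

```python
import torch
from torch.nn.modules import activation

x = torch.tensor([-1.0, 0.0, 1.0])
assert torch.equal(activation.ReLU()(x), torch.tensor([0.0, 0.0, 1.0]))
# The alias resolves to the same class as the torch.nn spelling.
assert activation.MultiheadAttention is torch.nn.MultiheadAttention
```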

Non-linear Activations (other) (Aliases)

activation.Softmin

Applies the Softmin function to an n-dimensional input Tensor.

activation.Softmax

Applies the Softmax function to an n-dimensional input Tensor.

activation.Softmax2d

Applies SoftMax over features to each spatial location.

activation.LogSoftmax

Applies the log(Softmax(x)) function to an n-dimensional input Tensor.

adaptive.AdaptiveLogSoftmaxWithLoss

Efficient softmax approximation.
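
A minimal sketch (input shape arbitrary):

```python
import torch
from torch.nn.modules import activation

probs = activation.Softmax(dim=-1)(torch.randn(2, 5))
# Each row of the output sums to 1 (up to floating-point error).
assert torch.allclose(probs.sum(dim=-1), torch.ones(2))
```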

Normalization Layers (Aliases)

batchnorm.BatchNorm1d

Applies Batch Normalization over a 2D or 3D input.

batchnorm.BatchNorm2d

Applies Batch Normalization over a 4D input.

batchnorm.BatchNorm3d

Applies Batch Normalization over a 5D input.

batchnorm.LazyBatchNorm1d

A torch.nn.BatchNorm1d module with lazy initialization.

batchnorm.LazyBatchNorm2d

A torch.nn.BatchNorm2d module with lazy initialization.

batchnorm.LazyBatchNorm3d

A torch.nn.BatchNorm3d module with lazy initialization.

normalization.GroupNorm

Applies Group Normalization over a mini-batch of inputs.

batchnorm.SyncBatchNorm

Applies Batch Normalization over an N-dimensional input.

instancenorm.InstanceNorm1d

Applies Instance Normalization.

instancenorm.InstanceNorm2d

Applies Instance Normalization.

instancenorm.InstanceNorm3d

Applies Instance Normalization.

instancenorm.LazyInstanceNorm1d

A torch.nn.InstanceNorm1d module with lazy initialization of the num_features argument.

instancenorm.LazyInstanceNorm2d

A torch.nn.InstanceNorm2d module with lazy initialization of the num_features argument.

instancenorm.LazyInstanceNorm3d

A torch.nn.InstanceNorm3d module with lazy initialization of the num_features argument.

normalization.LayerNorm

Applies Layer Normalization over a mini-batch of inputs.

normalization.LocalResponseNorm

Applies local response normalization over an input signal.

normalization.RMSNorm

Applies Root Mean Square Layer Normalization over a mini-batch of inputs.
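
A minimal sketch of the normalization aliases (feature sizes arbitrary):

```python
import torch
from torch.nn.modules import batchnorm, normalization

bn = batchnorm.BatchNorm2d(num_features=16)
assert bn(torch.randn(4, 16, 8, 8)).shape == (4, 16, 8, 8)

ln = normalization.LayerNorm(normalized_shape=32)
assert ln(torch.randn(4, 10, 32)).shape == (4, 10, 32)
```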

Recurrent Layers (Aliases)

rnn.RNNBase

Base class for RNN modules (RNN, LSTM, GRU).

rnn.RNN

Apply a multi-layer Elman RNN with tanh or ReLU non-linearity to an input sequence.

rnn.LSTM

Apply a multi-layer long short-term memory (LSTM) RNN to an input sequence.

rnn.GRU

Apply a multi-layer gated recurrent unit (GRU) RNN to an input sequence.

rnn.RNNCell

An Elman RNN cell with tanh or ReLU non-linearity.

rnn.LSTMCell

A long short-term memory (LSTM) cell.

rnn.GRUCell

A gated recurrent unit (GRU) cell.
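
A minimal sketch of the recurrent aliases (sizes arbitrary):

```python
import torch
from torch.nn.modules import rnn

lstm = rnn.LSTM(input_size=10, hidden_size=20, num_layers=2, batch_first=True)
x = torch.randn(3, 5, 10)       # (batch, seq, feature)
out, (h, c) = lstm(x)
assert out.shape == (3, 5, 20)  # hidden state at every time step
assert h.shape == (2, 3, 20)    # final hidden state per layer
```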

Transformer Layers (Aliases)

transformer.Transformer

A transformer model.

transformer.TransformerEncoder

TransformerEncoder is a stack of N encoder layers.

transformer.TransformerDecoder

TransformerDecoder is a stack of N decoder layers.

transformer.TransformerEncoderLayer

TransformerEncoderLayer is made up of self-attn and feedforward network.

transformer.TransformerDecoderLayer

TransformerDecoderLayer is made up of self-attn, multi-head-attn and feedforward network.
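
A minimal sketch of the transformer aliases (d_model and head count arbitrary):

```python
import torch
from torch.nn.modules import transformer

layer = transformer.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True)
encoder = transformer.TransformerEncoder(layer, num_layers=2)
x = torch.randn(8, 16, 32)      # (batch, seq, d_model)
assert encoder(x).shape == (8, 16, 32)
```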

Linear Layers (Aliases)

linear.Identity

A placeholder identity operator that is argument-insensitive.

linear.Linear

Applies an affine linear transformation to the incoming data: y = xA^T + b.

linear.Bilinear

Applies a bilinear transformation to the incoming data: y = x_1^T A x_2 + b.

linear.LazyLinear

A torch.nn.Linear module where in_features is inferred.
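
A minimal sketch showing the lazy variant inferring in_features on first use:

```python
import torch
from torch.nn.modules import linear

lazy = linear.LazyLinear(out_features=4)  # in_features inferred on first call
out = lazy(torch.randn(2, 7))
assert out.shape == (2, 4)
assert lazy.in_features == 7              # materialized from the input
```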

Dropout Layers (Aliases)

dropout.Dropout

During training, randomly zeroes some of the elements of the input tensor with probability p.

dropout.Dropout1d

Randomly zero out entire channels (a channel is a 1D feature map).

dropout.Dropout2d

Randomly zero out entire channels (a channel is a 2D feature map).

dropout.Dropout3d

Randomly zero out entire channels (a channel is a 3D feature map).

dropout.AlphaDropout

Applies Alpha Dropout over the input.

dropout.FeatureAlphaDropout

Randomly masks out entire channels.
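
A minimal sketch; note that dropout is only active in training mode:

```python
import torch
from torch.nn.modules import dropout

drop = dropout.Dropout(p=0.5)
drop.eval()                      # dropout is a no-op in eval mode
x = torch.randn(4, 4)
assert torch.equal(drop(x), x)
```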

Sparse Layers (Aliases)

sparse.Embedding

A simple lookup table that stores embeddings of a fixed dictionary and size.

sparse.EmbeddingBag

Compute sums or means of 'bags' of embeddings, without instantiating the intermediate embeddings.
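
A minimal sketch of the embedding alias (table size arbitrary):

```python
import torch
from torch.nn.modules import sparse

emb = sparse.Embedding(num_embeddings=100, embedding_dim=8)
ids = torch.tensor([[1, 2, 5], [7, 0, 3]])
assert emb(ids).shape == (2, 3, 8)  # one 8-dim vector per index
```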

Distance Functions (Aliases)

distance.CosineSimilarity

Returns cosine similarity between x_1 and x_2, computed along dim.

distance.PairwiseDistance

Computes the pairwise distance between input vectors, or between columns of input matrices.
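
A minimal sketch (vector sizes arbitrary):

```python
import torch
from torch.nn.modules import distance

cos = distance.CosineSimilarity(dim=1)
a, b = torch.randn(5, 16), torch.randn(5, 16)
assert cos(a, b).shape == (5,)   # one similarity per row pair
```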

Loss Functions (Aliases)

loss.L1Loss

Creates a criterion that measures the mean absolute error (MAE) between each element in the input x and target y.

loss.MSELoss

Creates a criterion that measures the mean squared error (squared L2 norm) between each element in the input x and target y.

loss.CrossEntropyLoss

This criterion computes the cross entropy loss between input logits and target.

loss.CTCLoss

The Connectionist Temporal Classification loss.

loss.NLLLoss

The negative log likelihood loss.

loss.PoissonNLLLoss

Negative log likelihood loss with Poisson distribution of target.

loss.GaussianNLLLoss

Gaussian negative log likelihood loss.

loss.KLDivLoss

The Kullback-Leibler divergence loss.

loss.BCELoss

Creates a criterion that measures the Binary Cross Entropy between the target and the input probabilities.

loss.BCEWithLogitsLoss

This loss combines a Sigmoid layer and the BCELoss in one single class.

loss.MarginRankingLoss

Creates a criterion that measures the loss given inputs x1, x2, two 1D mini-batch or 0D Tensors, and a label 1D mini-batch or 0D Tensor y (containing 1 or -1).

loss.HingeEmbeddingLoss

Measures the loss given an input tensor x and a labels tensor y (containing 1 or -1).

loss.MultiLabelMarginLoss

Creates a criterion that optimizes a multi-class multi-classification hinge loss (margin-based loss) between input x (a 2D mini-batch Tensor) and output y (which is a 2D Tensor of target class indices).

loss.HuberLoss

Creates a criterion that uses a squared term if the absolute element-wise error falls below delta and a delta-scaled L1 term otherwise.

loss.SmoothL1Loss

Creates a criterion that uses a squared term if the absolute element-wise error falls below beta and an L1 term otherwise.

loss.SoftMarginLoss

Creates a criterion that optimizes a two-class classification logistic loss between input tensor x and target tensor y (containing 1 or -1).

loss.MultiLabelSoftMarginLoss

Creates a criterion that optimizes a multi-label one-versus-all loss based on max-entropy, between input x and target y of size (N, C).

loss.CosineEmbeddingLoss

Creates a criterion that measures the loss given input tensors x_1, x_2 and a Tensor label y with values 1 or -1.

loss.MultiMarginLoss

Creates a criterion that optimizes a multi-class classification hinge loss (margin-based loss) between input x (a 2D mini-batch Tensor) and output y (which is a 1D tensor of target class indices, 0 ≤ y ≤ x.size(1) - 1).

loss.TripletMarginLoss

Creates a criterion that measures the triplet loss given input tensors x1, x2, x3 and a margin with a value greater than 0.

loss.TripletMarginWithDistanceLoss

Creates a criterion that measures the triplet loss given input tensors a, p, and n (representing anchor, positive, and negative examples, respectively), and a nonnegative, real-valued function ("distance function") used to compute the relationship between the anchor and positive example ("positive distance") and the anchor and negative example ("negative distance").
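
A minimal sketch of the loss aliases:

```python
import torch
from torch.nn.modules import loss

criterion = loss.MSELoss()
pred, target = torch.randn(4, 3), torch.randn(4, 3)
assert criterion(pred, target).shape == ()  # scalar under the default reduction='mean'
assert loss.CrossEntropyLoss is torch.nn.CrossEntropyLoss
```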

Vision Layers (Aliases)

pixelshuffle.PixelShuffle

Rearrange elements in a tensor according to an upscaling factor.

pixelshuffle.PixelUnshuffle

Reverse the PixelShuffle operation.

upsampling.Upsample

Upsamples given multi-channel 1D (temporal), 2D (spatial) or 3D (volumetric) data.

upsampling.UpsamplingNearest2d

Applies a 2D nearest neighbor upsampling to an input signal composed of several input channels.

upsampling.UpsamplingBilinear2d

Applies a 2D bilinear upsampling to an input signal composed of several input channels.
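
A minimal sketch of the vision aliases (shapes arbitrary):

```python
import torch
from torch.nn.modules import pixelshuffle, upsampling

# PixelShuffle(r) trades channels for resolution: (C*r^2, H, W) -> (C, H*r, W*r).
x = torch.randn(1, 16, 4, 4)
assert pixelshuffle.PixelShuffle(2)(x).shape == (1, 4, 8, 8)

up = upsampling.Upsample(scale_factor=2, mode="nearest")
assert up(torch.randn(1, 3, 8, 8)).shape == (1, 3, 16, 16)
```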

Shuffle Layers (Aliases)

channelshuffle.ChannelShuffle

Divides and rearranges the channels in a tensor.
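
A minimal sketch (the channel count must be divisible by groups):

```python
import torch
from torch.nn.modules import channelshuffle

# Splits 6 channels into 3 groups of 2 and interleaves them; shape is unchanged.
shuffle = channelshuffle.ChannelShuffle(groups=3)
assert shuffle(torch.randn(1, 6, 4, 4)).shape == (1, 6, 4, 4)
```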

torch.nn.utils

The following are aliases to their counterparts in torch.nn.utils in nested namespaces.

Utility functions to clip parameter gradients.

clip_grad.clip_grad_norm_

Clip the gradient norm of an iterable of parameters.

clip_grad.clip_grad_norm

Clip the gradient norm of an iterable of parameters (deprecated; use clip_grad_norm_ instead).

clip_grad.clip_grad_value_

Clip the gradients of an iterable of parameters at the specified value.
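
A minimal sketch of gradient clipping through the clip_grad aliases:

```python
import torch
from torch.nn.utils import clip_grad

model = torch.nn.Linear(10, 2)
model(torch.randn(4, 10)).sum().backward()

# In-place variant; returns the total norm computed before clipping.
total_norm = clip_grad.clip_grad_norm_(model.parameters(), max_norm=1.0)
clip_grad.clip_grad_value_(model.parameters(), clip_value=0.5)
```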

Utility functions to flatten and unflatten Module parameters to and from a single vector.

convert_parameters.parameters_to_vector

Flatten an iterable of parameters into a single vector.

convert_parameters.vector_to_parameters

Copy slices of a vector into an iterable of parameters.
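
A minimal round-trip sketch:

```python
import torch
from torch.nn.utils import convert_parameters

model = torch.nn.Linear(3, 2)
vec = convert_parameters.parameters_to_vector(model.parameters())
assert vec.numel() == 3 * 2 + 2  # weight elements + bias elements
convert_parameters.vector_to_parameters(vec, model.parameters())
```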

Utility functions to fuse Modules with BatchNorm modules.

fusion.fuse_conv_bn_eval

Fuse a convolutional module and a BatchNorm module into a single, new convolutional module.

fusion.fuse_conv_bn_weights

Fuse convolutional module parameters and BatchNorm module parameters into new convolutional module parameters.

fusion.fuse_linear_bn_eval

Fuse a linear module and a BatchNorm module into a single, new linear module.

fusion.fuse_linear_bn_weights

Fuse linear module parameters and BatchNorm module parameters into new linear module parameters.
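
A minimal sketch; both modules must be in eval mode, and the fused module reproduces the two-step output:

```python
import torch
from torch.nn.utils import fusion

conv = torch.nn.Conv2d(3, 8, kernel_size=3).eval()
bn = torch.nn.BatchNorm2d(8).eval()
fused = fusion.fuse_conv_bn_eval(conv, bn)  # a single new Conv2d

x = torch.randn(1, 3, 16, 16)
assert torch.allclose(fused(x), bn(conv(x)), atol=1e-5)
```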

Utility functions to convert Module parameter memory formats.

memory_format.convert_conv2d_weight_memory_format

Convert memory_format of nn.Conv2d.weight to memory_format.

memory_format.convert_conv3d_weight_memory_format

Convert memory_format of nn.Conv3d.weight to memory_format. The conversion recursively applies to nested nn.Module instances, including module itself.
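
A minimal sketch of converting a Conv2d weight to channels_last:

```python
import torch
from torch.nn.utils import memory_format

conv = torch.nn.Conv2d(3, 8, kernel_size=3)
conv = memory_format.convert_conv2d_weight_memory_format(conv, torch.channels_last)
assert conv.weight.is_contiguous(memory_format=torch.channels_last)
```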

Utility functions to apply and remove weight normalization and spectral normalization from Module parameters.

weight_norm.weight_norm

Apply weight normalization to a parameter in the given module.

weight_norm.remove_weight_norm

Remove the weight normalization reparameterization from a module.

spectral_norm.spectral_norm

Apply spectral normalization to a parameter in the given module.

spectral_norm.remove_spectral_norm

Remove the spectral normalization reparameterization from a module.
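
A minimal sketch. The submodule is imported directly because the top-level name torch.nn.utils.weight_norm resolves to the function itself; note these hook-based utilities still work, though newer code may prefer the parametrization-based torch.nn.utils.parametrizations variants:

```python
import torch
from torch.nn.utils.weight_norm import weight_norm, remove_weight_norm

layer = weight_norm(torch.nn.Linear(20, 40), name="weight")
# The weight is reparameterized as magnitude (weight_g) times direction (weight_v).
assert hasattr(layer, "weight_g") and hasattr(layer, "weight_v")
remove_weight_norm(layer)  # restores a plain .weight parameter
```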

Utility functions for initializing Module parameters.

init.skip_init

Given a module class object and args / kwargs, instantiate the module without initializing parameters / buffers.
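
A minimal sketch; skip_init is useful when the fresh parameter values would be overwritten anyway (e.g. by a checkpoint):

```python
import torch
from torch.nn.utils import init as nn_utils_init

# Allocates the module without running its usual parameter initialization.
layer = nn_utils_init.skip_init(torch.nn.Linear, 10, 5)
layer.load_state_dict(torch.nn.Linear(10, 5).state_dict())
```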