Aliases in torch.nn#
Created On: Jul 25, 2025 | Last Updated On: Jul 25, 2025
The following are aliases to their counterparts in torch.nn
in nested namespaces.
torch.nn.modules#
The following are aliases to their counterparts in torch.nn
in the torch.nn.modules
namespace.
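For example (a minimal sketch assuming a standard PyTorch installation; Linear and Conv2d are arbitrary picks from the tables below), a name exposed under torch.nn.modules is the same class object as its torch.nn counterpart:

    import torch.nn as nn

    # The nested-namespace name and the top-level name refer to the same class object.
    assert nn.modules.Linear is nn.Linear
    assert nn.modules.Conv2d is nn.Conv2d

    # Instances built through either name are therefore interchangeable.
    layer = nn.modules.Linear(4, 2)
    print(isinstance(layer, nn.Linear))  # True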
Containers (Aliases)#
Sequential | A sequential container.
ModuleList | Holds submodules in a list.
ModuleDict | Holds submodules in a dictionary.
ParameterList | Holds parameters in a list.
ParameterDict | Holds parameters in a dictionary.
Convolution Layers (Aliases)#
Conv1d | Applies a 1D convolution over an input signal composed of several input planes.
Conv2d | Applies a 2D convolution over an input signal composed of several input planes.
Conv3d | Applies a 3D convolution over an input signal composed of several input planes.
ConvTranspose1d | Applies a 1D transposed convolution operator over an input image composed of several input planes.
ConvTranspose2d | Applies a 2D transposed convolution operator over an input image composed of several input planes.
ConvTranspose3d | Applies a 3D transposed convolution operator over an input image composed of several input planes.
LazyConv1d | A torch.nn.Conv1d module with lazy initialization of the in_channels argument.
LazyConv2d | A torch.nn.Conv2d module with lazy initialization of the in_channels argument.
LazyConv3d | A torch.nn.Conv3d module with lazy initialization of the in_channels argument.
LazyConvTranspose1d | A torch.nn.ConvTranspose1d module with lazy initialization of the in_channels argument.
LazyConvTranspose2d | A torch.nn.ConvTranspose2d module with lazy initialization of the in_channels argument.
LazyConvTranspose3d | A torch.nn.ConvTranspose3d module with lazy initialization of the in_channels argument.
Unfold | Extracts sliding local blocks from a batched input tensor.
Fold | Combines an array of sliding local blocks into a large containing tensor.
Pooling layers (Aliases)#
MaxPool1d | Applies a 1D max pooling over an input signal composed of several input planes.
MaxPool2d | Applies a 2D max pooling over an input signal composed of several input planes.
MaxPool3d | Applies a 3D max pooling over an input signal composed of several input planes.
MaxUnpool1d | Computes a partial inverse of MaxPool1d.
MaxUnpool2d | Computes a partial inverse of MaxPool2d.
MaxUnpool3d | Computes a partial inverse of MaxPool3d.
AvgPool1d | Applies a 1D average pooling over an input signal composed of several input planes.
AvgPool2d | Applies a 2D average pooling over an input signal composed of several input planes.
AvgPool3d | Applies a 3D average pooling over an input signal composed of several input planes.
FractionalMaxPool2d | Applies a 2D fractional max pooling over an input signal composed of several input planes.
FractionalMaxPool3d | Applies a 3D fractional max pooling over an input signal composed of several input planes.
LPPool1d | Applies a 1D power-average pooling over an input signal composed of several input planes.
LPPool2d | Applies a 2D power-average pooling over an input signal composed of several input planes.
LPPool3d | Applies a 3D power-average pooling over an input signal composed of several input planes.
AdaptiveMaxPool1d | Applies a 1D adaptive max pooling over an input signal composed of several input planes.
AdaptiveMaxPool2d | Applies a 2D adaptive max pooling over an input signal composed of several input planes.
AdaptiveMaxPool3d | Applies a 3D adaptive max pooling over an input signal composed of several input planes.
AdaptiveAvgPool1d | Applies a 1D adaptive average pooling over an input signal composed of several input planes.
AdaptiveAvgPool2d | Applies a 2D adaptive average pooling over an input signal composed of several input planes.
AdaptiveAvgPool3d | Applies a 3D adaptive average pooling over an input signal composed of several input planes.
Padding Layers (Aliases)#
ReflectionPad1d | Pads the input tensor using the reflection of the input boundary.
ReflectionPad2d | Pads the input tensor using the reflection of the input boundary.
ReflectionPad3d | Pads the input tensor using the reflection of the input boundary.
ReplicationPad1d | Pads the input tensor using replication of the input boundary.
ReplicationPad2d | Pads the input tensor using replication of the input boundary.
ReplicationPad3d | Pads the input tensor using replication of the input boundary.
ZeroPad1d | Pads the input tensor boundaries with zero.
ZeroPad2d | Pads the input tensor boundaries with zero.
ZeroPad3d | Pads the input tensor boundaries with zero.
ConstantPad1d | Pads the input tensor boundaries with a constant value.
ConstantPad2d | Pads the input tensor boundaries with a constant value.
ConstantPad3d | Pads the input tensor boundaries with a constant value.
CircularPad1d | Pads the input tensor using circular padding of the input boundary.
CircularPad2d | Pads the input tensor using circular padding of the input boundary.
CircularPad3d | Pads the input tensor using circular padding of the input boundary.
Non-linear Activations (weighted sum, nonlinearity) (Aliases)#
ELU | Applies the Exponential Linear Unit (ELU) function, element-wise.
Hardshrink | Applies the Hard Shrinkage (Hardshrink) function element-wise.
Hardsigmoid | Applies the Hardsigmoid function element-wise.
Hardtanh | Applies the HardTanh function element-wise.
Hardswish | Applies the Hardswish function, element-wise.
LeakyReLU | Applies the LeakyReLU function element-wise.
LogSigmoid | Applies the Logsigmoid function element-wise.
MultiheadAttention | Allows the model to jointly attend to information from different representation subspaces.
PReLU | Applies the element-wise PReLU function.
ReLU | Applies the rectified linear unit function element-wise.
ReLU6 | Applies the ReLU6 function element-wise.
RReLU | Applies the randomized leaky rectified linear unit function, element-wise.
SELU | Applies the SELU function element-wise.
CELU | Applies the CELU function element-wise.
GELU | Applies the Gaussian Error Linear Units function.
Sigmoid | Applies the Sigmoid function element-wise.
SiLU | Applies the Sigmoid Linear Unit (SiLU) function, element-wise.
Mish | Applies the Mish function, element-wise.
Softplus | Applies the Softplus function element-wise.
Softshrink | Applies the soft shrinkage function element-wise.
Softsign | Applies the element-wise Softsign function.
Tanh | Applies the Hyperbolic Tangent (Tanh) function element-wise.
Tanhshrink | Applies the element-wise Tanhshrink function.
Threshold | Thresholds each element of the input Tensor.
GLU | Applies the gated linear unit function.
Non-linear Activations (other) (Aliases)#
Softmin | Applies the Softmin function to an n-dimensional input Tensor.
Softmax | Applies the Softmax function to an n-dimensional input Tensor.
Softmax2d | Applies SoftMax over features to each spatial location.
LogSoftmax | Applies the log(Softmax(x)) function to an n-dimensional input Tensor.
AdaptiveLogSoftmaxWithLoss | Efficient softmax approximation.
Normalization Layers (Aliases)#
BatchNorm1d | Applies Batch Normalization over a 2D or 3D input.
BatchNorm2d | Applies Batch Normalization over a 4D input.
BatchNorm3d | Applies Batch Normalization over a 5D input.
LazyBatchNorm1d | A torch.nn.BatchNorm1d module with lazy initialization of the num_features argument.
LazyBatchNorm2d | A torch.nn.BatchNorm2d module with lazy initialization of the num_features argument.
LazyBatchNorm3d | A torch.nn.BatchNorm3d module with lazy initialization of the num_features argument.
GroupNorm | Applies Group Normalization over a mini-batch of inputs.
SyncBatchNorm | Applies Batch Normalization over a N-Dimensional input.
InstanceNorm1d | Applies Instance Normalization.
InstanceNorm2d | Applies Instance Normalization.
InstanceNorm3d | Applies Instance Normalization.
LazyInstanceNorm1d | A torch.nn.InstanceNorm1d module with lazy initialization of the num_features argument.
LazyInstanceNorm2d | A torch.nn.InstanceNorm2d module with lazy initialization of the num_features argument.
LazyInstanceNorm3d | A torch.nn.InstanceNorm3d module with lazy initialization of the num_features argument.
LayerNorm | Applies Layer Normalization over a mini-batch of inputs.
LocalResponseNorm | Applies local response normalization over an input signal.
RMSNorm | Applies Root Mean Square Layer Normalization over a mini-batch of inputs.
Recurrent Layers (Aliases)#
RNNBase | Base class for RNN modules (RNN, LSTM, GRU).
RNN | Apply a multi-layer Elman RNN with tanh or ReLU non-linearity to an input sequence.
LSTM | Apply a multi-layer long short-term memory (LSTM) RNN to an input sequence.
GRU | Apply a multi-layer gated recurrent unit (GRU) RNN to an input sequence.
RNNCell | An Elman RNN cell with tanh or ReLU non-linearity.
LSTMCell | A long short-term memory (LSTM) cell.
GRUCell | A gated recurrent unit (GRU) cell.
Transformer Layers (Aliases)#
Transformer | A basic transformer layer.
TransformerEncoder | TransformerEncoder is a stack of N encoder layers.
TransformerDecoder | TransformerDecoder is a stack of N decoder layers.
TransformerEncoderLayer | TransformerEncoderLayer is made up of self-attn and feedforward network.
TransformerDecoderLayer | TransformerDecoderLayer is made up of self-attn, multi-head-attn and feedforward network.
Linear Layers (Aliases)#
Identity | A placeholder identity operator that is argument-insensitive.
Linear | Applies an affine linear transformation to the incoming data: y = xA^T + b.
Bilinear | Applies a bilinear transformation to the incoming data: y = x1^T A x2 + b.
LazyLinear | A torch.nn.Linear module where in_features is inferred.
Dropout Layers (Aliases)#
Dropout | During training, randomly zeroes some of the elements of the input tensor with probability p.
Dropout1d | Randomly zero out entire channels.
Dropout2d | Randomly zero out entire channels.
Dropout3d | Randomly zero out entire channels.
AlphaDropout | Applies Alpha Dropout over the input.
FeatureAlphaDropout | Randomly masks out entire channels.
Sparse Layers (Aliases)#
Embedding | A simple lookup table that stores embeddings of a fixed dictionary and size.
EmbeddingBag | Compute sums or means of 'bags' of embeddings, without instantiating the intermediate embeddings.
Distance Functions (Aliases)#
CosineSimilarity | Returns cosine similarity between x1 and x2, computed along dim.
PairwiseDistance | Computes the pairwise distance between input vectors, or between columns of input matrices.
Loss Functions (Aliases)#
L1Loss | Creates a criterion that measures the mean absolute error (MAE) between each element in the input x and target y.
MSELoss | Creates a criterion that measures the mean squared error (squared L2 norm) between each element in the input x and target y.
CrossEntropyLoss | This criterion computes the cross entropy loss between input logits and target.
CTCLoss | The Connectionist Temporal Classification loss.
NLLLoss | The negative log likelihood loss.
PoissonNLLLoss | Negative log likelihood loss with Poisson distribution of target.
GaussianNLLLoss | Gaussian negative log likelihood loss.
KLDivLoss | The Kullback-Leibler divergence loss.
BCELoss | Creates a criterion that measures the Binary Cross Entropy between the target and the input probabilities.
BCEWithLogitsLoss | This loss combines a Sigmoid layer and the BCELoss in one single class.
MarginRankingLoss | Creates a criterion that measures the loss given inputs x1, x2, two 1D mini-batch or 0D Tensors, and a label 1D mini-batch or 0D Tensor y (containing 1 or -1).
HingeEmbeddingLoss | Measures the loss given an input tensor x and a labels tensor y (containing 1 or -1).
MultiLabelMarginLoss | Creates a criterion that optimizes a multi-class multi-classification hinge loss (margin-based loss) between input x (a 2D mini-batch Tensor) and output y (which is a 2D Tensor of target class indices).
HuberLoss | Creates a criterion that uses a squared term if the absolute element-wise error falls below delta and a delta-scaled L1 term otherwise.
SmoothL1Loss | Creates a criterion that uses a squared term if the absolute element-wise error falls below beta and an L1 term otherwise.
SoftMarginLoss | Creates a criterion that optimizes a two-class classification logistic loss between input tensor x and target tensor y (containing 1 or -1).
MultiLabelSoftMarginLoss | Creates a criterion that optimizes a multi-label one-versus-all loss based on max-entropy, between input x and target y of size (N, C).
CosineEmbeddingLoss | Creates a criterion that measures the loss given input tensors x1, x2 and a Tensor label y with values 1 or -1.
MultiMarginLoss | Creates a criterion that optimizes a multi-class classification hinge loss (margin-based loss) between input x (a 2D mini-batch Tensor) and output y (which is a 1D tensor of target class indices, 0 <= y <= x.size(1) - 1).
TripletMarginLoss | Creates a criterion that measures the triplet loss given input tensors x1, x2, x3 and a margin with a value greater than 0.
TripletMarginWithDistanceLoss | Creates a criterion that measures the triplet loss given input tensors a, p, and n (representing anchor, positive, and negative examples, respectively), and a nonnegative, real-valued function ("distance function") used to compute the relationship between the anchor and positive example ("positive distance") and the anchor and negative example ("negative distance").
Vision Layers (Aliases)#
PixelShuffle | Rearrange elements in a tensor according to an upscaling factor.
PixelUnshuffle | Reverse the PixelShuffle operation.
Upsample | Upsamples a given multi-channel 1D (temporal), 2D (spatial) or 3D (volumetric) data.
UpsamplingNearest2d | Applies a 2D nearest neighbor upsampling to an input signal composed of several input channels.
UpsamplingBilinear2d | Applies a 2D bilinear upsampling to an input signal composed of several input channels.
Shuffle Layers (Aliases)#
ChannelShuffle | Divides and rearranges the channels in a tensor.
torch.nn.utils#
The following are aliases to their counterparts in torch.nn.utils
in nested namespaces.
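For example (a small sketch; clip_grad_norm_ is just one entry from the listing below), a function in the nested torch.nn.utils.clip_grad module is the same object as the one exported directly from torch.nn.utils:

    import torch.nn.utils as utils

    # Nested-namespace alias and top-level function are the same object.
    assert utils.clip_grad.clip_grad_norm_ is utils.clip_grad_norm_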
Utility functions to clip parameter gradients.
clip_grad_norm_ | Clip the gradient norm of an iterable of parameters.
clip_grad_norm | Clip the gradient norm of an iterable of parameters.
clip_grad_value_ | Clip the gradients of an iterable of parameters at specified value.
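A short usage sketch (toy model and an arbitrary max_norm of 1.0) showing how the clipping utilities are typically called after backward():

    import torch
    import torch.nn as nn
    from torch.nn.utils import clip_grad_norm_

    model = nn.Linear(10, 1)
    loss = model(torch.randn(8, 10)).sum()
    loss.backward()

    # Rescale all gradients in place so their combined norm does not exceed 1.0.
    total_norm = clip_grad_norm_(model.parameters(), max_norm=1.0)
    print(total_norm)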
Utility functions to flatten and unflatten Module parameters to and from a single vector.
parameters_to_vector | Flatten an iterable of parameters into a single vector.
vector_to_parameters | Copy slices of a vector into an iterable of parameters.
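A minimal round-trip sketch (arbitrary 3-in, 2-out linear layer) of flattening parameters and copying them back:

    import torch.nn as nn
    from torch.nn.utils import parameters_to_vector, vector_to_parameters

    model = nn.Linear(3, 2)

    # Flatten every parameter into one 1D tensor, then copy the values back unchanged.
    flat = parameters_to_vector(model.parameters())
    vector_to_parameters(flat, model.parameters())
    print(flat.shape)  # torch.Size([8]): 6 weights + 2 biases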
Utility functions to fuse Modules with BatchNorm modules.
fuse_conv_bn_eval | Fuse a convolutional module and a BatchNorm module into a single, new convolutional module.
fuse_conv_bn_weights | Fuse convolutional module parameters and BatchNorm module parameters into new convolutional module parameters.
fuse_linear_bn_eval | Fuse a linear module and a BatchNorm module into a single, new linear module.
fuse_linear_bn_weights | Fuse linear module parameters and BatchNorm module parameters into new linear module parameters.
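An inference-time sketch (toy Conv2d/BatchNorm2d pair; both modules must be in eval mode) of folding a BatchNorm into the preceding convolution:

    import torch
    import torch.nn as nn
    from torch.nn.utils.fusion import fuse_conv_bn_eval

    conv = nn.Conv2d(3, 8, kernel_size=3, padding=1).eval()
    bn = nn.BatchNorm2d(8).eval()

    # Fold the BatchNorm statistics and affine parameters into a new Conv2d.
    fused = fuse_conv_bn_eval(conv, bn)

    x = torch.randn(1, 3, 16, 16)
    print(torch.allclose(bn(conv(x)), fused(x), atol=1e-6))  # True (up to numerics)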
Utility functions to convert Module parameter memory formats.
convert_conv2d_weight_memory_format | Convert memory_format of nn.Conv2d.weight to memory_format.
convert_conv3d_weight_memory_format | Convert memory_format of nn.Conv3d.weight to memory_format.
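A brief sketch (arbitrary Conv2d shape) converting a convolution weight to channels_last:

    import torch
    import torch.nn as nn
    from torch.nn.utils import convert_conv2d_weight_memory_format

    conv = nn.Conv2d(3, 8, kernel_size=3)

    # Re-lay out the weight tensor as channels_last (NHWC), which some backends prefer.
    conv = convert_conv2d_weight_memory_format(conv, torch.channels_last)
    print(conv.weight.is_contiguous(memory_format=torch.channels_last))  # True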
Utility functions to apply and remove weight normalization from Module parameters.
weight_norm | Apply weight normalization to a parameter in the given module.
remove_weight_norm | Remove the weight normalization reparameterization from a module.
spectral_norm | Apply spectral normalization to a parameter in the given module.
remove_spectral_norm | Remove the spectral normalization reparameterization from a module.
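An apply/remove sketch (spectral_norm on a toy Linear layer; weight_norm follows the same pattern):

    import torch.nn as nn
    from torch.nn.utils import spectral_norm, remove_spectral_norm

    layer = spectral_norm(nn.Linear(20, 40), name='weight')
    print(hasattr(layer, 'weight_orig'))  # True: weight is now recomputed from weight_orig

    remove_spectral_norm(layer)           # restore a plain 'weight' parameter
    print(hasattr(layer, 'weight_orig'))  # False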
Utility functions for initializing Module parameters.
skip_init | Given a module class object and args / kwargs, instantiate the module without initializing parameters / buffers.
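A small sketch (nn.Linear as the module class; sizes arbitrary) of constructing a module while skipping its default parameter initialization, then initializing explicitly:

    import torch.nn as nn
    from torch.nn.utils import skip_init

    # Allocate the module without running its default init; parameters come back uninitialized.
    layer = skip_init(nn.Linear, 10, 5)
    nn.init.zeros_(layer.weight)
    nn.init.zeros_(layer.bias)
    print(layer.weight.shape)  # torch.Size([5, 10])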