EmbeddingBag
- class torch.nn.modules.sparse.EmbeddingBag(num_embeddings, embedding_dim, max_norm=None, norm_type=2.0, scale_grad_by_freq=False, mode='mean', sparse=False, _weight=None, include_last_offset=False, padding_idx=None, device=None, dtype=None)
Compute sums or means of ‘bags’ of embeddings, without instantiating the intermediate embeddings.
For bags of constant length, no per_sample_weights, no indices equal to padding_idx, and with 2D inputs, this class with mode="sum" is equivalent to Embedding followed by torch.sum(dim=1), with mode="mean" is equivalent to Embedding followed by torch.mean(dim=1), and with mode="max" is equivalent to Embedding followed by torch.max(dim=1).
However, EmbeddingBag is much more time and memory efficient than using a chain of these operations.
EmbeddingBag also supports per-sample weights as an argument to the forward pass. This scales the output of the Embedding before performing a weighted reduction as specified by mode. If per_sample_weights is passed, the only supported mode is "sum", which computes a weighted sum according to per_sample_weights.
- Parameters
num_embeddings (int) – size of the dictionary of embeddings
embedding_dim (int) – the size of each embedding vector
max_norm (float, optional) – If given, each embedding vector with norm larger than max_norm is renormalized to have norm max_norm.
norm_type (float, optional) – The p of the p-norm to compute for the max_norm option. Default: 2.
scale_grad_by_freq (bool, optional) – If given, this will scale gradients by the inverse of the frequency of the words in the mini-batch. Default: False. Note: this option is not supported when mode="max".
mode (str, optional) – "sum", "mean" or "max". Specifies the way to reduce the bag. "sum" computes the weighted sum, taking per_sample_weights into consideration. "mean" computes the average of the values in the bag, "max" computes the max value over each bag. Default: "mean"
sparse (bool, optional) – If True, the gradient w.r.t. the weight matrix will be a sparse tensor. See Notes for more details regarding sparse gradients. Note: this option is not supported when mode="max".
include_last_offset (bool, optional) – If True, offsets has one additional element, where the last element is equivalent to the size of indices. This matches the CSR format.
padding_idx (int, optional) – If specified, the entries at padding_idx do not contribute to the gradient; therefore, the embedding vector at padding_idx is not updated during training, i.e. it remains as a fixed "pad". For a newly constructed EmbeddingBag, the embedding vector at padding_idx will default to all zeros, but can be updated to another value to be used as the padding vector. Note that the embedding vector at padding_idx is excluded from the reduction.
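Two of the options above, include_last_offset and padding_idx, can be illustrated with a minimal sketch. It assumes made-up random weights loaded through from_pretrained so that several modules share the same table; the variable names are illustrative only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
weight = torch.randn(10, 3)  # illustrative weights, not taken from the docs
indices = torch.tensor([1, 2, 4, 5, 4, 3, 2, 9], dtype=torch.long)

# include_last_offset=True: offsets carries a trailing entry equal to
# len(indices), so [0, 4, 8] still describes exactly two bags.
bag_plain = nn.EmbeddingBag.from_pretrained(weight, mode="sum")
bag_csr = nn.EmbeddingBag.from_pretrained(
    weight, mode="sum", include_last_offset=True
)
out_plain = bag_plain(indices, torch.tensor([0, 4]))
out_csr = bag_csr(indices, torch.tensor([0, 4, 8]))
print(torch.allclose(out_plain, out_csr))

# padding_idx=2: entries equal to 2 are left out of the reduction, so a
# bag [2, 4, 2] reduces to row 4 alone, even though row 2 is non-zero here.
bag_pad = nn.EmbeddingBag.from_pretrained(weight, mode="sum", padding_idx=2)
out_pad = bag_pad(torch.tensor([[2, 4, 2]]))
print(torch.allclose(out_pad[0], weight[4]))
```

Note that the padding row is only zero-initialized for a freshly constructed module; in this sketch the weights are supplied directly, so the exclusion from the reduction is what produces the result.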
- Variables
weight (Tensor) – the learnable weights of the module of shape (num_embeddings, embedding_dim), initialized from the standard normal distribution N(0, 1).
Examples:
>>> import torch
>>> import torch.nn as nn
>>> # an EmbeddingBag module containing 10 tensors of size 3
>>> embedding_sum = nn.EmbeddingBag(10, 3, mode='sum')
>>> # a batch of 2 samples of 4 indices each
>>> input = torch.tensor([1, 2, 4, 5, 4, 3, 2, 9], dtype=torch.long)
>>> offsets = torch.tensor([0, 4], dtype=torch.long)
>>> # output values depend on the random weight initialization
>>> embedding_sum(input, offsets)
tensor([[-0.8861, -5.4350, -0.0523],
        [ 1.1306, -2.5798, -1.0044]])

>>> # Example with padding_idx
>>> embedding_sum = nn.EmbeddingBag(10, 3, mode='sum', padding_idx=2)
>>> input = torch.tensor([2, 2, 2, 2, 4, 3, 2, 9], dtype=torch.long)
>>> offsets = torch.tensor([0, 4], dtype=torch.long)
>>> embedding_sum(input, offsets)
tensor([[ 0.0000,  0.0000,  0.0000],
        [-0.7082,  3.2145, -2.6251]])

>>> # An EmbeddingBag can be loaded from an Embedding like so
>>> embedding = nn.Embedding(10, 3, padding_idx=2)
>>> embedding_sum = nn.EmbeddingBag.from_pretrained(
...     embedding.weight,
...     padding_idx=embedding.padding_idx,
...     mode='sum')
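The equivalence stated in the class description (2D input, no per_sample_weights, no padding_idx) can be checked numerically. The following is a sketch under those assumptions, comparing each mode against an Embedding lookup followed by the matching reduction; the helper names are illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
emb = nn.Embedding(10, 3)
batch = torch.tensor([[1, 2, 4, 5], [4, 3, 2, 9]], dtype=torch.long)

# Each mode paired with the reduction it is expected to match.
reductions = {
    "sum": lambda t: t.sum(dim=1),
    "mean": lambda t: t.mean(dim=1),
    "max": lambda t: t.max(dim=1).values,
}

results = {}
with torch.no_grad():
    looked_up = emb(batch)  # shape (2, 4, 3): the intermediate embeddings
    for mode, reduce in reductions.items():
        # Share the Embedding's weights so both paths use the same table.
        bag = nn.EmbeddingBag.from_pretrained(emb.weight.detach(), mode=mode)
        results[mode] = torch.allclose(bag(batch), reduce(looked_up), atol=1e-6)

print(results)
```

The Embedding path materializes a (B, N, embedding_dim) tensor before reducing, which is the intermediate that EmbeddingBag avoids.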
- forward(input, offsets=None, per_sample_weights=None)
Forward pass of EmbeddingBag.
- Parameters
input (Tensor) – Tensor containing bags of indices into the embedding matrix.
offsets (Tensor, optional) – Only used when input is 1D. offsets determines the starting index position of each bag (sequence) in input.
per_sample_weights (Tensor, optional) – a tensor of float / double weights, or None to indicate all weights should be taken to be 1. If specified, per_sample_weights must have exactly the same shape as input and is treated as having the same offsets, if those are not None. Only supported for mode='sum'.
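The effect of per_sample_weights can be sketched by comparing the module output with a manual weighted sum. This assumes illustrative random weights and made-up per-sample weights; only mode="sum" is used, as required above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
weight = torch.randn(10, 3)  # illustrative weights
bag = nn.EmbeddingBag.from_pretrained(weight, mode="sum")

indices = torch.tensor([1, 2, 4, 5, 4, 3], dtype=torch.long)
offsets = torch.tensor([0, 2], dtype=torch.long)  # bags: [1, 2] and [4, 5, 4, 3]
# One weight per index, same shape and floating dtype as the lookup needs.
psw = torch.tensor([0.5, 2.0, 1.0, 1.0, -1.0, 3.0])

out = bag(indices, offsets, per_sample_weights=psw)

# Manual reference: scale each looked-up row, then sum within each bag.
scaled = weight[indices] * psw.unsqueeze(1)
expected = torch.stack([scaled[0:2].sum(dim=0), scaled[2:6].sum(dim=0)])
print(torch.allclose(out, expected, atol=1e-6))
```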
- Returns
Tensor output of shape (B, embedding_dim).
- Return type
Tensor
Note
A few notes about input and offsets:
- input and offsets have to be of the same type, either int or long.
- If input is 2D of shape (B, N), it will be treated as B bags (sequences) each of fixed length N, and this will return B values aggregated in a way depending on the mode. offsets is ignored and required to be None in this case.
- If input is 1D of shape (N), it will be treated as a concatenation of multiple bags (sequences). offsets is required to be a 1D tensor containing the starting index positions of each bag in input. Therefore, for offsets of shape (B), input will be viewed as having B bags. Empty bags (i.e., having 0-length) will have returned vectors filled by zeros.
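The two input layouts described in this note can be compared directly. The sketch below uses illustrative indices and shows that a 2D input matches the same data passed as 1D with evenly spaced offsets, and that an empty bag yields a zero vector.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bag = nn.EmbeddingBag(10, 3, mode="sum")
flat = torch.tensor([1, 2, 4, 5, 4, 3, 2, 9], dtype=torch.long)

with torch.no_grad():
    # 2D input: two bags of fixed length 4, offsets must be None.
    out_2d = bag(flat.view(2, 4))
    # 1D input: the same two bags, described by their start positions.
    out_1d = bag(flat, torch.tensor([0, 4]))
    # Ragged bags: flat[0:3], an empty bag flat[3:3], and flat[3:8].
    out_ragged = bag(flat, torch.tensor([0, 3, 3]))

print(torch.allclose(out_2d, out_1d))
print(out_ragged[1])  # the empty bag
```

Repeating an offset is what produces an empty bag: the second bag starts and ends at position 3.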
- classmethod from_pretrained(embeddings, freeze=True, max_norm=None, norm_type=2.0, scale_grad_by_freq=False, mode='mean', sparse=False, include_last_offset=False, padding_idx=None)
Create EmbeddingBag instance from given 2-dimensional FloatTensor.
- Parameters
embeddings (Tensor) – FloatTensor containing weights for the EmbeddingBag. The first dimension is passed to EmbeddingBag as num_embeddings, the second as embedding_dim.
freeze (bool, optional) – If True, the tensor does not get updated in the learning process. Equivalent to embeddingbag.weight.requires_grad = False. Default: True
max_norm (float, optional) – See module initialization documentation. Default: None
norm_type (float, optional) – See module initialization documentation. Default: 2.
scale_grad_by_freq (bool, optional) – See module initialization documentation. Default: False.
mode (str, optional) – See module initialization documentation. Default: "mean"
sparse (bool, optional) – See module initialization documentation. Default: False.
include_last_offset (bool, optional) – See module initialization documentation. Default: False.
padding_idx (int, optional) – See module initialization documentation. Default: None.
- Return type
EmbeddingBag
Examples:
>>> # FloatTensor containing pretrained weights
>>> weight = torch.FloatTensor([[1, 2.3, 3], [4, 5.1, 6.3]])
>>> embeddingbag = nn.EmbeddingBag.from_pretrained(weight)
>>> # One bag holding indices 1 and 0; the default mode='mean' averages the two rows
>>> input = torch.LongTensor([[1, 0]])
>>> embeddingbag(input)
tensor([[ 2.5000,  3.7000,  4.6500]])
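The freeze flag and the arithmetic behind the example output can be checked with a short sketch. It reuses the same two-row weight table; the names frozen and trainable are illustrative.

```python
import torch
import torch.nn as nn

weight = torch.tensor([[1.0, 2.3, 3.0], [4.0, 5.1, 6.3]])

frozen = nn.EmbeddingBag.from_pretrained(weight)  # freeze=True by default
trainable = nn.EmbeddingBag.from_pretrained(weight, freeze=False)

print(frozen.weight.requires_grad)     # False
print(trainable.weight.requires_grad)  # True

# A single bag containing indices 1 and 0, reduced with the default 'mean':
# ((4 + 1) / 2, (5.1 + 2.3) / 2, (6.3 + 3) / 2)
out = frozen(torch.tensor([[1, 0]]))
expected = weight.mean(dim=0, keepdim=True)
print(torch.allclose(out, expected))
```

With freeze=False the loaded table acts as an initialization that an optimizer can continue to update.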