LinearCrossEntropyLoss#
- class torch.nn.LinearCrossEntropyLoss(in_features, num_classes, *, out_features=(), device=None, dtype=None, reduction='mean', weight=None, ignore_index=None, label_smoothing=0.0)[source]#
This criterion computes the cross entropy loss between the input, linearly transformed to logits, and the target.
See CrossEntropyLoss for the definition of cross entropy loss.

- Parameters:
  - in_features (int) – Size of each input sample.
  - num_classes (int) – Number of classes, C.
  - out_features (tuple[int], optional) – Specifies dimensions d_1, ..., d_K for K-dimensional loss. Default: ().
  - device (torch.device, optional) – The desired device of the linear weight. Default: None.
  - dtype (torch.dtype, optional) – The desired dtype of the linear weight. Default: None.
  - weight (Tensor, optional) – A manual rescaling weight given to each class. If given, has to be a Tensor of size C. Default: None.
  - reduction (str, optional) – Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied, 'mean': the weighted mean of the output is taken, 'sum': the output will be summed. Default: 'mean'.
  - ignore_index (int, optional) – Specifies a target value that is ignored and does not contribute to the input gradient. Note that ignore_index is only applicable when the target contains class indices. Default: None. When the target contains class indices, the default value is mapped to -100. Note: the default ignore_index in CrossEntropyLoss is -100 for both target types.
  - label_smoothing (float, optional) – A float in [0.0, 1.0]. Specifies the amount of smoothing when computing the loss, where 0.0 means no smoothing. The targets become a mixture of the original ground truth and a uniform distribution as described in Rethinking the Inception Architecture for Computer Vision. Default: 0.0.
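The mixture that `label_smoothing` refers to can be sketched with the stable `torch.nn.functional.cross_entropy` API, where the parameter has the same meaning. This is an illustration of the smoothing formula, not of this module's internals.

```python
import torch
import torch.nn.functional as F

# With smoothing eps, the target distribution becomes
#   q = (1 - eps) * one_hot(target) + eps / C
torch.manual_seed(0)
N, C, eps = 4, 10, 0.1
logits = torch.randn(N, C)
target = torch.randint(0, C, (N,))

smoothed = F.cross_entropy(logits, target, label_smoothing=eps)

# Manual mixture: build q explicitly and take -sum(q * log_softmax).
q = torch.full((N, C), eps / C)
q.scatter_(1, target.unsqueeze(1), 1 - eps + eps / C)
manual = -(q * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

print(torch.allclose(smoothed, manual, atol=1e-6))  # True
```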
- Shape:
  - Input: (N, in_features), where N is batch size.
  - Target: If containing class indices, shape (N) or (N, d_1, d_2, ..., d_K) with K ≥ 1 in the case of K-dimensional loss, where each value should be between [0, C). The target data type is required to be long when using class indices. If containing class probabilities, the target must have shape (N, C) or (N, C, d_1, d_2, ..., d_K), and each value should be between [0, 1]. This means the target data type is required to be float when using class probabilities. Note that PyTorch does not strictly enforce probability constraints on the class probabilities, and it is the user's responsibility to ensure target contains valid probability distributions (see the examples section below for more details).
  - Output: If reduction is 'none', shape (N) or (N, d_1, d_2, ..., d_K) depending on the shape of the input. Otherwise, scalar.

  where N is batch size and C = num_classes.
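One way these shapes line up can be sketched with the stable `nn.Linear` and `F.cross_entropy` APIs: project each sample to num_classes logits per output position, then take cross entropy over the class dimension. This is a conceptual sketch of the shape contract only, not necessarily how the module is implemented internally.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumed illustration: in_features -> C logits per position (d_1, d_2),
# mirroring out_features=(4, 3) from the examples below.
N, in_features, C = 2, 5, 10
d1, d2 = 4, 3

x = torch.randn(N, in_features, requires_grad=True)
target = torch.randint(0, C, (N, d1, d2))  # class indices, dtype long

proj = nn.Linear(in_features, C * d1 * d2)
logits = proj(x).view(N, C, d1, d2)  # (N, C, d_1, d_2), as cross entropy expects

loss = F.cross_entropy(logits, target)  # scalar, since reduction defaults to 'mean'
loss.backward()
print(loss.shape, x.grad.shape)
```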
Examples
>>> torch.manual_seed(283)
>>> # Example of target with class indices
>>> loss = nn.LinearCrossEntropyLoss(5, 10, out_features=(4, 3))
>>> input = torch.randn(2, 5, requires_grad=True)
>>> target = torch.randint(0, 10, (2, 4, 3))
>>> output = loss(input, target)
>>> output.backward()
>>>
>>> # Example of target with class probabilities
>>> input = torch.randn(2, 5, requires_grad=True)
>>> target = torch.randn(2, 10, 4, 3).softmax(dim=1)
>>> output = loss(input, target)
>>> output.backward()
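Since PyTorch does not validate probability targets, the checks are the caller's responsibility. A minimal sketch of what "valid probability distributions" means for the second example above, where the class dimension is dim=1:

```python
import torch

# Probability targets as in the example: softmax over the class dim
# guarantees values in [0, 1] that sum to 1 along dim=1.
target = torch.randn(2, 10, 4, 3).softmax(dim=1)

in_range = bool(((target >= 0) & (target <= 1)).all())
sums_to_one = bool(torch.allclose(target.sum(dim=1), torch.ones(2, 4, 3)))
print(in_range, sums_to_one)  # True True
```

Targets built by other means (e.g. hand-written soft labels) should pass the same two checks before being used as class probabilities.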