torch.nn.functional.kl_div¶
torch.nn.functional.kl_div(input, target, size_average=None, reduce=None, reduction='mean', log_target=False)[source]¶

The Kullback-Leibler divergence Loss.
See KLDivLoss for details.

Parameters
- input – Tensor of arbitrary shape in log-probabilities.
- target – Tensor of the same shape as input. See log_target for the target's interpretation.
- size_average (bool, optional) – Deprecated (see reduction). By default, the losses are averaged over each loss element in the batch. Note that for some losses, there are multiple elements per sample. If the field size_average is set to False, the losses are instead summed for each minibatch. Ignored when reduce is False. Default: True
- reduce (bool, optional) – Deprecated (see reduction). By default, the losses are averaged or summed over observations for each minibatch depending on size_average. When reduce is False, returns a loss per batch element instead and ignores size_average. Default: True
- reduction (string, optional) – Specifies the reduction to apply to the output: 'none' | 'batchmean' | 'sum' | 'mean'. 'none': no reduction will be applied. 'batchmean': the sum of the output will be divided by the batch size. 'sum': the output will be summed. 'mean': the output will be divided by the number of elements in the output. Default: 'mean'
- log_target (bool) – A flag indicating whether target is passed in log space. It is recommended to pass certain distributions (like softmax) in log space to avoid numerical issues caused by an explicit log. Default: False
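For illustration, a minimal usage sketch follows. The tensor shapes and the use of log_softmax/softmax to produce valid (log-)probabilities are assumptions chosen for the example:

```python
import torch
import torch.nn.functional as F

# input must hold log-probabilities, e.g. produced by log_softmax.
input = F.log_softmax(torch.randn(3, 5), dim=1)

# With log_target=False (the default), target holds plain probabilities.
target = F.softmax(torch.randn(3, 5), dim=1)
loss = F.kl_div(input, target, reduction='batchmean')

# Passing the target in log space (log_target=True) avoids the explicit
# log inside the loss and is numerically safer.
log_target = F.log_softmax(torch.randn(3, 5), dim=1)
loss_log = F.kl_div(input, log_target, reduction='batchmean', log_target=True)
```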
Note
size_average and reduce are in the process of being deprecated, and in the meantime, specifying either of those two args will override reduction.

Note
reduction='mean' doesn't return the true KL divergence value; please use reduction='batchmean', which aligns with the mathematical definition of KL divergence. In the next major release, 'mean' will be changed to behave the same as 'batchmean'.
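To make the distinction concrete, a small sketch (the shapes are arbitrary assumptions): with a batch of 3 samples over 5 classes, 'mean' divides the summed loss by all 15 elements, while 'batchmean' divides it by the batch size of 3.

```python
import torch
import torch.nn.functional as F

input = F.log_softmax(torch.randn(3, 5), dim=1)   # batch of 3, 5 classes
target = F.softmax(torch.randn(3, 5), dim=1)

sum_loss = F.kl_div(input, target, reduction='sum')
mean_loss = F.kl_div(input, target, reduction='mean')            # sum / 15
batchmean_loss = F.kl_div(input, target, reduction='batchmean')  # sum / 3

assert torch.isclose(mean_loss, sum_loss / 15)
assert torch.isclose(batchmean_loss, sum_loss / 3)
```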