torch.cuda.comm.reduce_add

torch.cuda.comm.reduce_add(inputs, destination=None)

Sum tensors from multiple GPUs.

All inputs should have matching shape, dtype, and layout. The output tensor has the same shape, dtype, and layout.

Parameters
  • inputs (Iterable[Tensor]) – an iterable of tensors to add.

  • destination (int, optional) – a device on which the output will be placed (default: current device).

Returns

A tensor containing an elementwise sum of all inputs, placed on the destination device.
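
Example

The following is a minimal sketch of how the call might be used, not a definitive recipe. The helper name `sum_on_device` and its CPU fallback are assumptions added for illustration so the snippet also runs on machines with fewer than two GPUs; the fallback is not part of `reduce_add` itself.

```python
import torch
import torch.cuda.comm as comm


def sum_on_device(tensors, destination=None):
    """Hypothetical helper: elementwise sum of same-shaped tensors."""
    tensors = list(tensors)
    if all(t.is_cuda for t in tensors):
        # Every input lives on a GPU, so reduce_add can sum them
        # and place the result on the destination device.
        return comm.reduce_add(tensors, destination=destination)
    # Assumption for illustration only: plain sum when inputs are on the CPU.
    return torch.stack(tensors).sum(dim=0)


if torch.cuda.device_count() >= 2:
    # One tensor per GPU: all ones on cuda:0, all twos on cuda:1.
    parts = [torch.full((2, 3), float(i + 1), device=f"cuda:{i}") for i in range(2)]
    total = sum_on_device(parts, destination=0)
else:
    # Same data on the CPU, so the sketch still runs without multiple GPUs.
    parts = [torch.full((2, 3), float(i + 1)) for i in range(2)]
    total = sum_on_device(parts)

print(total.shape, total.device)
```

In this sketch the destination (device 0) already holds one of the inputs, and every element of `total` equals 1.0 + 2.0 = 3.0.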