torch.cuda.comm.broadcast_coalesced
- torch.cuda.comm.broadcast_coalesced(tensors, devices, buffer_size=10485760)
- Broadcast a sequence of tensors to the specified GPUs.
- Small tensors are first coalesced into a buffer to reduce the number of synchronizations.
- Parameters
- tensors (sequence) – tensors to broadcast. Must be on the same device, either CPU or GPU. 
- devices (Iterable[torch.device, str or int]) – an iterable of GPU devices, among which to broadcast. 
- buffer_size (int) – maximum size, in bytes, of the buffer used for coalescing (default: 10485760, i.e. 10 MiB)
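
To give a feel for how buffer_size affects the number of transfers, here is a minimal pure-Python sketch of greedy coalescing. It is an illustration only, not the library's actual implementation: group_into_buffers is a hypothetical helper, it works on byte sizes rather than real tensors, and it assumes buffer_size is measured in bytes.

```python
from typing import List, Sequence


def group_into_buffers(
    sizes_in_bytes: Sequence[int], buffer_size: int = 10485760
) -> List[List[int]]:
    """Greedily pack tensor sizes into groups of at most buffer_size bytes.

    Hypothetical helper for illustration; it is not part of torch.
    A tensor larger than buffer_size ends up in a group of its own.
    """
    groups: List[List[int]] = []
    current: List[int] = []
    current_bytes = 0
    for size in sizes_in_bytes:
        # Close the current group if adding this tensor would overflow it.
        if current and current_bytes + size > buffer_size:
            groups.append(current)
            current, current_bytes = [], 0
        current.append(size)
        current_bytes += size
    if current:
        groups.append(current)
    return groups


MIB = 1024 * 1024
sizes = [4 * MIB, 4 * MIB, 4 * MIB, 12 * MIB, 1 * MIB]
groups = group_into_buffers(sizes)
# Show each group's tensor sizes in MiB.
print([[s // MIB for s in g] for g in groups])  # [[4, 4], [4], [12], [1]]
```

In this sketch five tensors are sent as four groups with the default limit; a larger buffer_size would merge more of the small tensors into a single transfer, at the cost of a larger temporary buffer.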
 
- Returns
- A tuple containing copies of tensors, placed on devices.
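
A short usage sketch may help here. It assumes a machine with at least two CUDA devices and does nothing otherwise. The device indices 0 and 1 and the helper name broadcast_example are illustrative choices, and the source tensors are placed on the first target device as a conservative assumption, since some PyTorch versions may require that.

```python
import torch
import torch.cuda.comm as comm


def broadcast_example():
    """Broadcast two small tensors to GPUs 0 and 1, if both are present."""
    if torch.cuda.device_count() < 2:
        return None  # not enough GPUs to demonstrate a broadcast

    devices = [0, 1]  # illustrative device indices
    source = torch.device("cuda", devices[0])
    # All source tensors live on the same device, as the docs require.
    tensors = [
        torch.randn(3, 4, device=source),
        torch.arange(5, device=source),
    ]
    return comm.broadcast_coalesced(tensors, devices)


result = broadcast_example()
if result is None:
    print("This example needs at least two CUDA devices.")
else:
    # One entry per target device, each holding a copy of every tensor.
    for copies in result:
        print([(tuple(t.shape), str(t.device)) for t in copies])
```

The result is indexed first by target device and then by tensor, so result[1][0] would be the copy of the first tensor on the second device in devices.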