torch.cuda.comm.scatter

torch.cuda.comm.scatter(tensor, devices=None, chunk_sizes=None, dim=0, streams=None, *, out=None)
Scatters tensor across multiple GPUs.

Parameters
- tensor (Tensor) – tensor to scatter. Can be on CPU or GPU. 
- devices (Iterable[torch.device, str or int], optional) – an iterable of GPU devices, among which to scatter. 
- chunk_sizes (Iterable[int], optional) – sizes of the chunks to be placed on each device. It should match devices in length and sum to tensor.size(dim). If not specified, tensor will be divided into equal chunks.
- dim (int, optional) – a dimension along which to chunk tensor. Default: 0.
- streams (Iterable[torch.cuda.Stream], optional) – an iterable of Streams, among which to execute the scatter. If not specified, the default stream will be utilized. 
- out (Sequence[Tensor], optional, keyword-only) – the GPU tensors to store output results. Sizes of these tensors must match that of tensor, except for dim, where the total size must sum to tensor.size(dim).
 
Note
Exactly one of devices and out must be specified. When out is specified, chunk_sizes must not be specified and will be inferred from the sizes of out.

Returns
- If devices is specified: a tuple containing chunks of tensor, placed on devices.
- If out is specified: a tuple containing the out tensors, each containing a chunk of tensor.
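
A minimal usage sketch of both calling modes; it assumes a machine with at least two CUDA devices, and the device ids, shapes, and chunk sizes below are purely illustrative:

```python
import torch
import torch.cuda.comm as comm

# Illustrative only: requires at least two CUDA devices.
if torch.cuda.device_count() >= 2:
    tensor = torch.arange(10).reshape(10, 1)  # CPU tensor, split along dim 0

    # devices= mode: equal chunks (5 rows each) land on cuda:0 and cuda:1.
    chunks = comm.scatter(tensor, devices=[0, 1], dim=0)

    # Uneven split: chunk_sizes matches devices in length and sums to tensor.size(0) == 10.
    chunks = comm.scatter(tensor, devices=[0, 1], chunk_sizes=[7, 3], dim=0)

    # out= mode: chunk sizes are inferred from the preallocated output tensors.
    out0 = torch.empty(6, 1, dtype=tensor.dtype, device="cuda:0")
    out1 = torch.empty(4, 1, dtype=tensor.dtype, device="cuda:1")
    chunks = comm.scatter(tensor, out=[out0, out1])
```

In the out= mode, the returned tuple contains the same tensors that were passed via out, now filled with the corresponding chunks of tensor.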