VLLMWeightSender

class torchrl.weight_update.llm.VLLMWeightSender(scheme: VLLMWeightSyncScheme)[source]

Sends weights to vLLM workers using collective communication.

RPC + Collective Implementation

This class implements both layers:

  1. RPC Layer:
     • Currently uses Ray remote calls (implicit in test setup)
     • Can be extended to other RPC backends (torch.distributed.rpc, gRPC)
     • In the test, Ray actors provide the RPC mechanism

  2. Collective Layer:
     • Uses VLLMCollectiveTransport for the NCCL broadcast
     • Broadcasts weights from the trainer (rank 0) to workers (ranks 1+)
     • High-bandwidth GPU-to-GPU transfer

Extending RPC Backends

To use a different RPC backend, subclass VLLMWeightSender and override the coordination step:

import torch.distributed.rpc as rpc

class TorchRPCVLLMSender(VLLMWeightSender):
    def update_weights(self, weights=None):
        # Custom RPC layer: signal each worker to prepare for the broadcast.
        # `self.workers` and `prepare_receive` are user-defined here: the RPC
        # worker names and the function that readies a worker to join the
        # collective.
        futures = [rpc.rpc_async(worker, prepare_receive) for worker in self.workers]
        for fut in futures:
            fut.wait()

        # Collective layer (unchanged): NCCL broadcast from the trainer (rank 0).
        super().update_weights(weights)

init_all_workers_group(model_metadata: dict[str, tuple[torch.dtype, torch.Size]], vllm_engine: Any | None = None)[source]

Initialize the collective communication group.

Parameters:
  • model_metadata – Dict mapping param names to (dtype, shape) tuples.

  • vllm_engine – Optional vLLM engine for RPC coordination. Required for NCCL broadcasts.
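
A minimal sketch of building the model_metadata dict from an nn.Module and passing it in. The sender and engine handles are assumed to already exist (a VLLMWeightSender built from a VLLMWeightSyncScheme, and a running vLLM engine used for coordination):

from torch import nn

model = nn.Linear(16, 16)

# Map every parameter name to its (dtype, shape) pair, as expected by
# init_all_workers_group.
model_metadata = {
    name: (param.dtype, param.shape) for name, param in model.named_parameters()
}

# Hypothetical handles: `sender` is a VLLMWeightSender, `engine` a running
# vLLM engine used for RPC coordination of the NCCL broadcast.
sender.init_all_workers_group(model_metadata, vllm_engine=engine)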

register_model(model: Any) → None[source]

Register the model to extract weights from.

send(weights: Any = None, worker_ids: int | list[int] | None = None) → None

Send weights synchronously to workers.

This method:

  1. Prepares the weights (extracts them from the model if weights=None)
  2. Sends them to the specified workers (or all workers if worker_ids=None)
  3. Waits for acknowledgments from those workers
  4. Returns once the workers have applied the weights

Parameters:
  • weights – Weights to send. Can be:
    - None: Extract from the model via context.get_model(model_id)
    - nn.Module: Extract weights from the module
    - TensorDict: Use directly
    - dict: Convert to a TensorDict

  • worker_ids – Which workers to send to:
    - None: Send to all workers (default)
    - int: Send to a single worker
    - list[int]: Send to specific workers

Note: This is a blocking call that ensures the specified workers are updated before returning.
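
A hedged usage sketch of the blocking call, assuming sender has been set up as above and model is the registered nn.Module; the worker ids are illustrative:

# Push the registered model's current weights to all workers (blocking).
sender.send()

# Push an explicit state dict to a subset of workers only.
sender.send(weights=model.state_dict(), worker_ids=[0, 1])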

send_async(weights: Any = None, worker_ids: int | list[int] | None = None) → None

Send weights asynchronously to workers (non-blocking).

This initiates the send but returns immediately without waiting for workers to acknowledge. You must call wait_async() before the next send_async() or send() call.

Parameters:
  • weights – Same as send()

  • worker_ids – Same as send()

Raises:

RuntimeError – If a previous send_async() is still pending
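
A sketch of the intended pairing with wait_async(), assuming an initialized sender; the overlap point is illustrative:

# Kick off the broadcast without blocking the training loop.
sender.send_async()

# ... overlap other work here, e.g. the next optimizer step ...

# Must complete before any subsequent send() or send_async().
sender.wait_async()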

update_weights(weights: Any | None = None) → None[source]

Extract and broadcast weights to vLLM workers.

Parameters:

weights – Optional weights to send. If None, extracts them from the registered model.
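
A hedged end-to-end sketch tying the pieces together; the construction of the VLLMWeightSyncScheme (scheme), the policy_model, and the engine handle are assumptions not specified on this page:

from torchrl.weight_update.llm import VLLMWeightSender

sender = VLLMWeightSender(scheme)      # scheme: a pre-built VLLMWeightSyncScheme
sender.register_model(policy_model)    # nn.Module to extract weights from

metadata = {
    name: (p.dtype, p.shape) for name, p in policy_model.named_parameters()
}
sender.init_all_workers_group(metadata, vllm_engine=engine)

# After each training step, broadcast the latest weights to the vLLM workers.
sender.update_weights()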

wait_async() → None

Wait for a pending async send to complete.

Blocks until all workers have acknowledged the previous send_async(). This must be called after send_async() and before any subsequent send.

Raises:

RuntimeError – If no async send is pending
