UTDRHook

class torchrl.trainers.UTDRHook(trainer: Trainer)

Hook for logging Update-to-Data (UTD) ratio during async collection.

The UTD ratio measures how many optimization steps are performed per collected data sample, providing insight into training efficiency during asynchronous data collection. This metric is particularly useful for off-policy algorithms where data collection and training happen concurrently.

The UTD ratio is calculated as: (batch_size * update_count) / write_count

where:

- batch_size: Size of the batches sampled from the replay buffer
- update_count: Total number of optimization steps performed
- write_count: Total number of samples written to the replay buffer
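As an illustration of the formula above, here is a minimal standalone sketch in plain Python. The helper name `utd_ratio` and its zero-write guard are assumptions made for this example; they are not part of the TorchRL API and do not describe how the hook handles an empty buffer.

```python
def utd_ratio(batch_size: int, update_count: int, write_count: int) -> float:
    """Compute (batch_size * update_count) / write_count.

    Hypothetical helper for illustration only; not a TorchRL function.
    """
    if write_count <= 0:
        # Assumption: report 0.0 until at least one sample has been written,
        # so the division is always well defined.
        return 0.0
    return (batch_size * update_count) / write_count


# 100 optimization steps with batches of 256, against 12,800 written samples.
example_ratio = utd_ratio(batch_size=256, update_count=100, write_count=12_800)
print(example_ratio)
```

With these numbers the result is 2.0, meaning that twice as many samples have been drawn for optimization as have been written to the buffer.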

Parameters:

trainer (Trainer) – The trainer instance to monitor for UTD calculation. Must have async_collection=True for meaningful results.

Note

This hook is only meaningful when async_collection is enabled, as it relies on the replay buffer’s write_count to track data collection progress.

load_state_dict(state_dict: dict[str, Any]) → None

Load state from dictionary.

register(trainer: Trainer, name: str = 'utdr_hook')

Register the UTD ratio hook with the trainer.

Parameters:

trainer (Trainer) – The trainer to register with.
name (str) – Name to use when registering the hook module.

state_dict() → dict[str, Any]

Return state dictionary for checkpointing.
