RandomPolicy¶
- class torchrl.modules.tensordict_module.RandomPolicy(action_spec: TensorSpec | None = None, action_key: NestedKey = 'action')[source]¶
A random policy for data collectors.
This is a wrapper around the action_spec.rand method.
- Parameters:
action_spec – TensorSpec object describing the action specs. If
None, the spec is initialized lazily (e.g. by a collector fromenv.full_action_spec). ARandomPolicywith no spec will raise if called before the spec is set.action_key – key at which the action is written. Defaults to
"action".
Examples
>>> from tensordict import TensorDict >>> from torchrl.data.tensor_specs import Bounded >>> action_spec = Bounded(-torch.ones(3), torch.ones(3)) >>> actor = RandomPolicy(action_spec=action_spec) >>> td = actor(TensorDict()) # selects a random action in the cube [-1; 1]
Lazy initialization — let the collector fill in the spec from the env:
>>> from torchrl.collectors import SyncDataCollector >>> collector = SyncDataCollector(env, RandomPolicy(), ...)