RandomPolicy

class torchrl.modules.tensordict_module.RandomPolicy(action_spec: TensorSpec | None = None, action_key: NestedKey = 'action')[source]

A random policy for data collectors.

This is a wrapper around the action_spec.rand method.

Parameters:
  • action_spec – TensorSpec object describing the action specs. If None, the spec is initialized lazily (e.g. by a collector from env.full_action_spec). A RandomPolicy with no spec will raise if called before the spec is set.

  • action_key – key at which the action is written. Defaults to "action".

Examples

>>> from tensordict import TensorDict
>>> from torchrl.data.tensor_specs import Bounded
>>> action_spec = Bounded(-torch.ones(3), torch.ones(3))
>>> actor = RandomPolicy(action_spec=action_spec)
>>> td = actor(TensorDict()) # selects a random action in the cube [-1; 1]

Lazy initialization — let the collector fill in the spec from the env:

>>> from torchrl.collectors import SyncDataCollector
>>> collector = SyncDataCollector(env, RandomPolicy(), ...)  
set_action_spec_from_env(env: EnvBase) None[source]

Initialize action_spec from env.full_action_spec.

No-op if the spec is already set. Intended for lazy initialization by data collectors.

Docs

Lorem ipsum dolor sit amet, consectetur

View Docs

Tutorials

Lorem ipsum dolor sit amet, consectetur

View Tutorials

Resources

Lorem ipsum dolor sit amet, consectetur

View Resources