ActionScaling
- class torchrl.envs.transforms.ActionScaling(in_keys_inv: Sequence[NestedKey] | None = None, out_keys_inv: Sequence[NestedKey] | None = None, in_keys: Sequence[NestedKey] | None = None, out_keys: Sequence[NestedKey] | None = None, *, loc: Tensor | float | None = None, scale: Tensor | float | None = None, standard_normal: bool = True)
Affine-scale a continuous action using the bounds of the action spec.
Given a bounded action spec with bounds [low, high], this transform exposes a normalized action space to the policy and rescales actions back to the original env range before they are passed to the environment.
The loc and scale are derived from the spec:
\[loc = \frac{high + low}{2}, \quad scale = \frac{high - low}{2}.\]
When standard_normal=True (default), the normalized action space is [-1, 1] and the inverse mapping (policy action -> env action) is
\[a_{env} = a_{norm} \cdot scale + loc.\]
The forward mapping (env action -> normalized action, used by replay buffer transforms) is the inverse:
\[a_{norm} = (a_{env} - loc) / scale.\]
When standard_normal=False, the normalized space is [0, 1] and the mapping is rescaled accordingly so that 0 maps to low and 1 to high.
- Parameters:
in_keys_inv (sequence of NestedKey, optional) – keys read during the inv direction (policy -> env). Defaults to ["action"]. A single key per ActionScaling instance is supported; compose several instances to scale several actions.
out_keys_inv (sequence of NestedKey, optional) – keys written during the inv direction. Defaults to in_keys_inv.
in_keys (sequence of NestedKey, optional) – keys read during the forward direction (env action -> normalized action, used by replay buffers and inside Module chains). Defaults to in_keys_inv.
out_keys (sequence of NestedKey, optional) – keys written during the forward direction. Defaults to in_keys.
- Keyword Arguments:
loc (torch.Tensor or float, optional) – explicit location of the affine transform. If both loc and scale are provided, the values are used as-is and no derivation from the spec is performed (useful when no parent environment is available, e.g. inside a replay buffer). Defaults to None.
scale (torch.Tensor or float, optional) – explicit scale of the affine transform. Must be provided together with loc. Defaults to None.
standard_normal (bool, optional) – if True (default), the normalized action space is [-1, 1]. If False, the normalized action space is [0, 1].
- Raises:
RuntimeError – if the action spec is unbounded or partially unbounded (any bound is non-finite).
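The bound requirement follows from the formulas: with a non-finite low or high, loc and scale stop being usable numbers. A minimal, framework-free sketch of such a check, using plain Python floats instead of tensors; check_bounded is a hypothetical helper for illustration, not the library's internal check, and the error message is invented:

```python
import math


def check_bounded(low, high):
    """Hypothetical helper: raise if any per-dimension bound is non-finite."""
    for lo, hi in zip(low, high):
        if not (math.isfinite(lo) and math.isfinite(hi)):
            # Assumed message; the real transform words its error differently.
            raise RuntimeError(
                f"affine action scaling needs finite bounds, got low={lo}, high={hi}"
            )


# Fully bounded spec: the check passes silently.
check_bounded([-2.0, -2.0], [4.0, 4.0])

# Partially unbounded spec: one non-finite bound is enough to fail.
partially_unbounded_error = None
try:
    check_bounded([-2.0, -math.inf], [4.0, 4.0])
except RuntimeError as err:
    partially_unbounded_error = err
```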
Examples
>>> import torch
>>> from torchrl.data.tensor_specs import Bounded
>>> from torchrl.envs.transforms import ActionScaling, TransformedEnv
>>> from torchrl.testing.mocking_classes import ContinuousActionVecMockEnv
>>> base_env = ContinuousActionVecMockEnv(
...     action_spec=Bounded(low=-2.0, high=4.0, shape=(7,))
... )
>>> env = TransformedEnv(base_env, ActionScaling())
>>> env.action_spec.space.low
tensor([-1., -1., -1., -1., -1., -1., -1.])
>>> env.action_spec.space.high
tensor([1., 1., 1., 1., 1., 1., 1.])
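To make the arithmetic behind the mapping concrete, here is a small sketch using plain Python floats and the same bounds as the example (low=-2, high=4). The function names are hypothetical and this is not the library's implementation. The [-1, 1] branch follows the documented formulas; the [0, 1] branch is an assumption derived only from the stated endpoints (0 maps to low, 1 maps to high) for an affine map.

```python
def derive_loc_scale(low, high):
    """loc and scale as defined in the class description."""
    return (high + low) / 2, (high - low) / 2


def to_env(a_norm, low, high, standard_normal=True):
    """Inverse direction: normalized policy action -> env action."""
    if standard_normal:
        loc, scale = derive_loc_scale(low, high)
        return a_norm * scale + loc
    # Assumed [0, 1] parametrization: 0 -> low, 1 -> high.
    return a_norm * (high - low) + low


def to_norm(a_env, low, high, standard_normal=True):
    """Forward direction: env action -> normalized action."""
    if standard_normal:
        loc, scale = derive_loc_scale(low, high)
        return (a_env - loc) / scale
    # Assumed [0, 1] parametrization, inverted.
    return (a_env - low) / (high - low)


low, high = -2.0, 4.0
loc, scale = derive_loc_scale(low, high)  # loc = 1.0, scale = 3.0

env_low = to_env(-1.0, low, high)   # -2.0, the lower env bound
env_high = to_env(1.0, low, high)   # 4.0, the upper env bound
round_trip = to_norm(to_env(0.25, low, high), low, high)  # back to 0.25
```

With these bounds, a policy output of 0 lands on loc = 1.0, the midpoint of the env range, which is why a policy centred on zero explores symmetrically around the middle of the original action interval.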
- transform_action_spec(action_spec: TensorSpec) -> TensorSpec
Transforms the action spec so that the resulting spec matches the transform mapping.
- Parameters:
action_spec (TensorSpec) – spec before the transform
- Returns:
expected spec after the transform
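One way to picture the spec returned here is to push the original bounds through the forward mapping, dimension by dimension. The sketch below does that with plain Python lists; normalized_bounds is a hypothetical helper, and the [0, 1] branch assumes the parametrization loc = low, scale = high - low, which the source does not spell out. It is an illustration of the expected bounds, not the method's actual code.

```python
def normalized_bounds(low, high, standard_normal=True):
    """Hypothetical helper: map env bounds to normalized bounds per dimension."""
    new_low, new_high = [], []
    for lo, hi in zip(low, high):
        if standard_normal:
            loc, scale = (hi + lo) / 2, (hi - lo) / 2
        else:
            loc, scale = lo, hi - lo  # assumed [0, 1] parametrization
        # Forward map: a_norm = (a_env - loc) / scale
        new_low.append((lo - loc) / scale)
        new_high.append((hi - loc) / scale)
    return new_low, new_high


# Same bounds as the class example: seven dimensions in [-2, 4].
norm_low, norm_high = normalized_bounds([-2.0] * 7, [4.0] * 7)

# Dimensions with different ranges all collapse to the same normalized interval.
mixed_low, mixed_high = normalized_bounds([0.0, -10.0], [1.0, 10.0])

# With standard_normal=False the normalized interval is [0, 1].
unit_low, unit_high = normalized_bounds([-2.0], [4.0], standard_normal=False)
```

The first call reproduces the bounds printed in the class example: every dimension ends up in [-1, 1] regardless of its original range.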