ValueNorm
- class torchrl.modules.ValueNorm(*, shape: int | tuple[int, ...] = 1, epsilon: float = 1e-05, device: device | None = None)
Abstract base class for value normalisers.
A value normaliser keeps a running estimate of the location and scale of the value target seen during training. Critics use it to:
- normalize the regression target before computing the MSE, which keeps the critic loss on a stable scale across episodes and as reward magnitudes grow;
- denormalize the critic’s output back to the real reward scale when forming bootstrapped value estimates inside GAE / TD.
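The two uses above can be sketched in plain PyTorch. This is a minimal illustration under stated assumptions, not TorchRL's implementation: `loc` and `scale` stand in for the running estimates with made-up numbers, the local functions `normalize` and `denormalize` are stand-ins for the methods of a concrete normaliser, and the critic is assumed to predict values in normalised space.

```python
import torch
import torch.nn.functional as F

# Assumed running estimates of the value-target location and scale
# (hypothetical numbers, fixed here for illustration).
loc = torch.tensor([50.0])
scale = torch.tensor([20.0])


def normalize(value):
    # Map real-scale values to normalised space.
    return (value - loc) / scale


def denormalize(normalised_value):
    # Map normalised values back to the real reward scale.
    return normalised_value * scale + loc


# Critic outputs, assumed to live in normalised space; trailing dim is (1,).
critic_out = torch.tensor([[0.1], [-0.3]])       # V(s_t)
next_critic_out = torch.tensor([[0.2], [0.0]])   # V(s_{t+1})
reward = torch.tensor([[1.0], [2.0]])
gamma = 0.99

# Denormalise before bootstrapping so reward and value share one scale.
td_target = reward + gamma * denormalize(next_critic_out)

# Normalise the regression target before computing the MSE.
critic_loss = F.mse_loss(critic_out, normalize(td_target))
```

Mixing the two spaces (for example, adding a raw reward to a normalised value) is the mistake this split is meant to prevent.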
Subclasses must implement update(), normalize(), and denormalize(). By convention, all three operate on tensors whose trailing dims match shape (the per-element value shape, usually (1,)).
- abstract denormalize(normalised_value: Tensor) → Tensor
Inverse of normalize(): recovers real-scale values.
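To make the inverse relationship concrete, here is a minimal sketch of a normaliser that follows the three-method contract described above. The class `RunningMeanStdNorm` is hypothetical and deliberately standalone (it does not subclass `ValueNorm`), and the way `epsilon` enters the standard deviation is an assumption made for numerical safety, not a statement about how TorchRL uses it.

```python
import torch


class RunningMeanStdNorm:
    """Hypothetical normaliser exposing update / normalize / denormalize."""

    def __init__(self, shape=(1,), epsilon=1e-5):
        self.mean = torch.zeros(shape)
        self.var = torch.ones(shape)
        self.count = 0
        self.epsilon = epsilon  # assumed: added to the variance for stability

    def update(self, value):
        # Flatten leading (batch) dims; trailing dims match `shape`.
        batch = value.reshape(-1, *self.mean.shape)
        n = batch.shape[0]
        batch_mean = batch.mean(dim=0)
        batch_var = batch.var(dim=0, unbiased=False)
        total = self.count + n
        delta = batch_mean - self.mean
        # Merge running and batch moments (parallel mean/variance update).
        m2 = (self.var * self.count + batch_var * n
              + delta.pow(2) * self.count * n / total)
        self.mean = self.mean + delta * n / total
        self.var = m2 / total
        self.count = total

    def _std(self):
        return torch.sqrt(self.var + self.epsilon)

    def normalize(self, value):
        return (value - self.mean) / self._std()

    def denormalize(self, normalised_value):
        return normalised_value * self._std() + self.mean


norm = RunningMeanStdNorm(shape=(1,))
targets = torch.tensor([[10.0], [20.0], [30.0]])
norm.update(targets)

# Round trip: denormalize undoes normalize up to floating-point error.
recovered = norm.denormalize(norm.normalize(targets))
```

Because both methods read the same statistics, the round trip holds only while no `update()` call happens between them.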