UniformActionTokenizer

class torchrl.data.vla.UniformActionTokenizer(num_bins: int, *, low: float | Tensor, high: float | Tensor, action_dim: int | None = None)

Per-dimension uniform-bin action tokenizer (RT-2 / OpenVLA style).

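Each action dimension is discretized into num_bins equal-width bins over [low, high]; encode() returns the bin index and decode() returns the bin center. For actions inside [low, high], the round trip is lossy with error bounded by half a bin width, (high - low) / (2 * num_bins). Actions outside the range are clamped before binning, so their round-trip error can exceed that bound.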
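The arithmetic described above can be sketched in plain Python for a single scalar action. This is an illustrative sketch, not the library's implementation: the helper names encode_uniform and decode_uniform are hypothetical, and the floor-then-clip rule for the upper edge is an assumption chosen to match the documented example (0.0 maps to 128 and 1.0 maps to 255 with 256 bins over [-1, 1]).

```python
import math


def encode_uniform(x, low, high, num_bins):
    """Hypothetical helper: map a scalar action to a bin index."""
    x = min(max(x, low), high)          # clamp to [low, high]
    width = (high - low) / num_bins     # equal-width bins
    idx = math.floor((x - low) / width)
    return min(idx, num_bins - 1)       # x == high falls in the last bin


def decode_uniform(idx, low, high, num_bins):
    """Hypothetical helper: map a bin index to the center of its bin."""
    width = (high - low) / num_bins
    return low + (idx + 0.5) * width


low, high, num_bins = -1.0, 1.0, 256
half_bin = (high - low) / (2 * num_bins)

for x in (-1.0, -0.5, 0.3, 0.999, 1.0):
    idx = encode_uniform(x, low, high, num_bins)
    x_hat = decode_uniform(idx, low, high, num_bins)
    # In-range inputs are recovered to within half a bin width.
    assert abs(x_hat - x) <= half_bin + 1e-12
```

With these settings half a bin width is 2 / 512, roughly 0.0039, which is why the documented example compares decoded values with a loose tolerance rather than expecting the inputs back exactly.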

Parameters:

num_bins (int) – number of bins per action dimension.

Keyword Arguments:
  • low (float or torch.Tensor) – per-dimension lower bound. Actions are clamped to [low, high] before binning.

  • high (float or torch.Tensor) – per-dimension upper bound.

  • action_dim (int, optional) – action dimensionality. Required only when low/high are scalars and you want a per-dimension shape.
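To illustrate how per-dimension bounds and clamping interact, here is a minimal plain-Python sketch. It does not use the library: expand_bound and encode_per_dim are hypothetical helpers, and treating a scalar bound as repeated action_dim times is an assumption about what "a per-dimension shape" means here.

```python
import math


def expand_bound(bound, action_dim):
    """Hypothetical helper: repeat a scalar bound once per dimension."""
    if isinstance(bound, (int, float)):
        return [float(bound)] * action_dim
    return [float(b) for b in bound]


def encode_per_dim(action, low, high, num_bins):
    """Hypothetical helper: bin each dimension against its own bounds."""
    tokens = []
    for x, lo, hi in zip(action, low, high):
        x = min(max(x, lo), hi)  # clamp before binning
        idx = math.floor((x - lo) / (hi - lo) * num_bins)
        tokens.append(min(idx, num_bins - 1))
    return tokens


low = [-1.0, -1.0, 0.0]        # third dimension has its own range
high = expand_bound(1.0, 3)    # scalar upper bound shared by all dimensions

# The second component (5.0) is out of range and clamps to the last bin.
tokens = encode_per_dim([0.0, 5.0, 0.25], low, high, num_bins=256)
print(tokens)  # [128, 255, 64]
```

Because every dimension uses the same num_bins, a wider range in one dimension means coarser resolution for that dimension.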

Examples

>>> import torch
>>> from torchrl.data.vla import UniformActionTokenizer
>>> tok = UniformActionTokenizer(256, low=-1.0, high=1.0)
>>> tokens = tok.encode(torch.tensor([-1.0, 0.0, 1.0]))
>>> tokens
tensor([  0, 128, 255])
>>> torch.allclose(tok.decode(tokens), torch.tensor([-0.998, 0.002, 0.998]), atol=1e-2)
True
>>> tok.vocab_size
256

See also

RobotDatasetMetadata carries the action_low/action_high bounds used by from_metadata().

property action_dim: int | None

The per-dimension action size, or None for scalar bounds.

decode(tokens: Tensor) → Tensor

Map token ids back to continuous actions of shape [..., action_dim]; each id decodes to the center of its bin.

encode(actions: Tensor) → Tensor

Map continuous actions of shape [..., action_dim] to token ids (dtype long).

classmethod from_metadata(metadata: RobotDatasetMetadata, num_bins: int) → UniformActionTokenizer

Build from the action_low/action_high of a RobotDatasetMetadata.

property vocab_size: int

Number of distinct token ids the tokenizer can emit per position.