make_mlgym

torchrl.envs.llm.make_mlgym(*, task: Literal['prisonersDilemma'] | None = None, tasks: list[Literal['prisonersDilemma']] | None = None, tokenizer: transformers.AutoTokenizer | str | None = None, device='cpu', reward_wrong_format: float | None = None)

Wraps an MLGymEnv in a TorchRL Environment.

The appended transforms ensure that the data is formatted for the LLM (for the outputs of env.step) and for the MLGym API (for the inputs to env.step).

Keyword Arguments:
  • task (str) –

    The task to wrap. Mutually exclusive with the tasks argument.

    Note

    The correct format is simply the task name, e.g., “prisonersDilemma”.

  • tasks (List[str]) –

    The tasks available in the env. Mutually exclusive with the task argument.

    Note

    The correct format is simply the task name, e.g., “prisonersDilemma”.

  • tokenizer (transformers.AutoTokenizer or str, optional) – The tokenizer used to tokenize the data. If a string is passed, it is converted to a transformers.AutoTokenizer.

  • device (str, optional) – The device to assign to the env. Defaults to “cpu”.

  • reward_wrong_format (float, optional) – The reward (a negative penalty) applied to wrongly formatted actions. Defaults to None (no penalty).
