Value Networks and Critics

Value networks estimate the expected return of a state (state value) or of a state-action pair (action value, or Q-value).

ValueOperator(*args, **kwargs)

General class for value functions in RL.

ValueNorm(*[, shape, epsilon, device])

Abstract base class for value normalisers.

PopArtValueNorm(*[, shape, beta, epsilon, ...])

PopArt-style EMA value normaliser.
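As a rough illustration of what an EMA-based value normaliser tracks, the sketch below keeps exponential moving averages of the first and second moments of the value targets and uses them to normalise and denormalise values. It is a standalone, scalar-only sketch and not the library's implementation: the class name `EmaValueNorm` is hypothetical, `beta` is assumed to be the step size given to new data, and `epsilon` is assumed to be added to the variance for numerical stability. The full PopArt method also rescales the output layer so predictions are preserved, which is omitted here.

```python
import math


class EmaValueNorm:
    """Hypothetical scalar EMA normaliser (illustration only)."""

    def __init__(self, beta=0.1, epsilon=1e-5):
        self.beta = beta              # assumed: weight given to the newest batch
        self.epsilon = epsilon        # assumed: added to the variance for stability
        self.mean = 0.0               # EMA of the targets
        self.second_moment = 1.0      # EMA of the squared targets

    def update(self, targets):
        """Blend the batch moments into the running estimates."""
        batch_mean = sum(targets) / len(targets)
        batch_sq = sum(t * t for t in targets) / len(targets)
        self.mean = (1 - self.beta) * self.mean + self.beta * batch_mean
        self.second_moment = (
            (1 - self.beta) * self.second_moment + self.beta * batch_sq
        )

    @property
    def std(self):
        # Var[x] = E[x^2] - E[x]^2, clamped at zero against rounding error.
        variance = max(self.second_moment - self.mean ** 2, 0.0)
        return math.sqrt(variance + self.epsilon)

    def normalize(self, value):
        return (value - self.mean) / self.std

    def denormalize(self, value):
        return value * self.std + self.mean


norm = EmaValueNorm(beta=0.1)
for _ in range(200):
    norm.update([10.0, 10.0, 10.0])   # constant targets: the mean drifts to 10

round_trip = norm.denormalize(norm.normalize(3.0))
```

Because the estimates are exponential averages, old targets are gradually forgotten, which suits non-stationary return scales during training.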

RunningValueNorm(*[, shape, epsilon, device])

Exact running mean / variance (Welford's online algorithm).
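For reference, Welford's online algorithm can be sketched in a few lines of plain Python. This is a scalar illustration of the algorithm itself, not the library's code: the class name `RunningStats` is hypothetical, and adding `epsilon` to the variance before taking the square root is an assumption about how such a normaliser would typically guard against division by zero.

```python
import math


class RunningStats:
    """Hypothetical scalar running mean/variance using Welford's algorithm."""

    def __init__(self, epsilon=1e-8):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0            # running sum of squared deviations from the mean
        self.epsilon = epsilon   # assumed: stabilises the division in normalize()

    def update(self, x):
        """Fold one observation into the running statistics."""
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        # Uses the deviation from the old mean times the deviation from the new one.
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Population variance; returns 0.0 before any data has been seen.
        return self.m2 / self.count if self.count > 0 else 0.0

    def normalize(self, x):
        return (x - self.mean) / math.sqrt(self.variance + self.epsilon)


stats = RunningStats()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(value)
```

Unlike an exponential moving average, this weights every observation equally, so the statistics are exact for the whole history seen so far.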

DuelingCnnDQNet(out_features[, ...])

Dueling CNN Q-network.
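A dueling head splits the network output into a scalar state value and per-action advantages, then recombines them into Q-values. The sketch below shows the commonly used mean-subtracted aggregation on plain Python lists. The function name `dueling_q_values` is hypothetical, and whether this class uses exactly this aggregation is an assumption to verify against its documentation.

```python
def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) as Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


# One state with three actions: V(s) = 1.5 and illustrative advantages.
q_values = dueling_q_values(1.5, [0.5, -0.5, 1.5])
```

Subtracting the mean advantage makes the decomposition identifiable: the Q-values average to the state value, so the value and advantage streams cannot trade off an arbitrary constant.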

DistributionalDQNnet(*args, **kwargs)

Distributional Deep Q-Network softmax layer.
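In distributional Q-learning, each action gets a probability distribution over a fixed set of return values (atoms) instead of a single Q-value. The sketch below illustrates the idea with a plain softmax over atom logits and an expectation over the support. It is a minimal sketch under stated assumptions: the helper names, the three-atom support, and the logits are made up for illustration, and the library layer may use a different convention (for example, returning log-probabilities).

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    largest = max(logits)
    exps = [math.exp(value - largest) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_q(atom_logits, support):
    """Expected return of one action: sum_i p_i * z_i over the support atoms."""
    probs = softmax(atom_logits)
    return sum(p * z for p, z in zip(probs, support))


support = [-1.0, 0.0, 1.0]        # hypothetical return atoms z_i
logits_per_action = [
    [0.0, 0.0, 0.0],              # action 0: uniform over the atoms
    [0.0, 0.0, 2.0],              # action 1: mass shifted toward the highest atom
]
q_values = [expected_q(logits, support) for logits in logits_per_action]
greedy_action = max(range(len(q_values)), key=q_values.__getitem__)
```

Greedy action selection then reduces to comparing the expected values, while the full distributions remain available for the distributional loss.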

ConvNet(in_features, depth, num_cells, ...)

A convolutional neural network.

CrossCriticGroupSpec(obs_dim, n_agents, ...)

Specification for one agent group used by CrossGroupCritic.

CrossGroupCritic(*args, **kwargs)

Centralised critic that conditions on observations from multiple agent groups.

MLP(in_features, out_features, depth, ...)

A multi-layer perceptron.

DdpgCnnActor(action_dim[, conv_net_kwargs, ...])

DDPG Convolutional Actor class.

DdpgCnnQNet([conv_net_kwargs, ...])

DDPG Convolutional Q-value class.

DdpgMlpActor(action_dim[, mlp_net_kwargs, ...])

DDPG Actor class.

DdpgMlpQNet([mlp_net_kwargs_net1, ...])

DDPG Q-value MLP class.

LSTMModule(*args, **kwargs)

An embedder for an LSTM module.

GRUModule(*args, **kwargs)

An embedder for a GRU module.

canonicalize_rnn_subset(data, modules, *[, ...])

Canonicalize only the union of RNN keys used by modules.

set_recurrent_mode([mode])

Context manager for setting RNNs recurrent mode.

OnlineDTActor(state_dim, action_dim[, ...])

Online Decision Transformer Actor class.

DTActor(state_dim, action_dim[, ...])

Decision Transformer Actor class.

DecisionTransformer(state_dim, action_dim[, ...])

Online Decision Transformer.