torch.optim.Optimizer.state_dict¶
- Optimizer.state_dict()[source][source]¶
- Return the state of the optimizer as a - dict.- It contains two entries: - state: a Dict holding current optimization state. Its content
- differs between optimizer classes, but some common characteristics hold. For example, state is saved per parameter, and the parameter itself is NOT saved. - stateis a Dictionary mapping parameter ids to a Dict with state corresponding to each parameter.
 
- param_groups: a List containing all parameter groups where each
- parameter group is a Dict. Each parameter group contains metadata specific to the optimizer, such as learning rate and weight decay, as well as a List of parameter IDs of the parameters in the group. If a param group was initialized with - named_parameters()the names content will also be saved in the state dict.
 
 - NOTE: The parameter IDs may look like indices but they are just IDs associating state with param_group. When loading from a state_dict, the optimizer will zip the param_group - params(int IDs) and the optimizer- param_groups(actual- nn.Parameters) in order to match state WITHOUT additional verification.- A returned state dict might look something like: - { 'state': { 0: {'momentum_buffer': tensor(...), ...}, 1: {'momentum_buffer': tensor(...), ...}, 2: {'momentum_buffer': tensor(...), ...}, 3: {'momentum_buffer': tensor(...), ...} }, 'param_groups': [ { 'lr': 0.01, 'weight_decay': 0, ... 'params': [0] 'param_names' ['param0'] (optional) }, { 'lr': 0.001, 'weight_decay': 0.5, ... 'params': [1, 2, 3] 'param_names': ['param1', 'layer.weight', 'layer.bias'] (optional) } ] }