SWALR
- class torch.optim.swa_utils.SWALR(optimizer, swa_lr, anneal_epochs=10, anneal_strategy='cos', last_epoch=-1)[source]
- Anneals the learning rate in each parameter group to a fixed value.
- This learning rate scheduler is meant to be used with the Stochastic Weight Averaging (SWA) method (see torch.optim.swa_utils.AveragedModel).
- Parameters
- optimizer (torch.optim.Optimizer) – wrapped optimizer 
- swa_lr (float or list) – the learning rate value for all param groups together, or a separate value for each group (see the per-group sketch after this parameter list).
- anneal_epochs (int) – number of epochs in the annealing phase (default: 10)
- anneal_strategy (str) – “cos” or “linear”; specifies the annealing strategy: “cos” for cosine annealing, “linear” for linear annealing (default: “cos”)
- last_epoch (int) – the index of the last epoch (default: -1) 
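- When the optimizer has multiple parameter groups, swa_lr can be given as a list with one value per group. A minimal sketch under that assumption (the two-layer model and the specific rates are illustrative, not part of the original documentation):
>>> import torch
>>> from torch import nn
>>> # Hypothetical optimizer with two parameter groups.
>>> model = nn.Sequential(nn.Linear(10, 10), nn.Linear(10, 2))
>>> optimizer = torch.optim.SGD([
>>>     {"params": model[0].parameters(), "lr": 0.1},
>>>     {"params": model[1].parameters(), "lr": 0.01},
>>> ])
>>> # One SWA learning rate per group, annealed linearly over 5 epochs.
>>> swa_scheduler = torch.optim.swa_utils.SWALR(
>>>     optimizer, swa_lr=[0.05, 0.005],
>>>     anneal_strategy="linear", anneal_epochs=5)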
 
- The SWALR scheduler can be used together with other schedulers to switch to a constant learning rate late in training, as in the example below.
- Example
>>> loader, optimizer, model = ...
>>> lr_lambda = lambda epoch: 0.9
>>> scheduler = torch.optim.lr_scheduler.MultiplicativeLR(optimizer,
>>>     lr_lambda=lr_lambda)
>>> swa_scheduler = torch.optim.swa_utils.SWALR(optimizer,
>>>     anneal_strategy="linear", anneal_epochs=20, swa_lr=0.05)
>>> swa_start = 160
>>> for i in range(300):
>>>     for input, target in loader:
>>>         optimizer.zero_grad()
>>>         loss_fn(model(input), target).backward()
>>>         optimizer.step()
>>>     if i > swa_start:
>>>         swa_scheduler.step()
>>>     else:
>>>         scheduler.step()
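- Since the scheduler is intended to accompany SWA, the same training loop is typically paired with torch.optim.swa_utils.AveragedModel and a final update_bn() pass. A minimal sketch, assuming the loader, model, optimizer, loss_fn, schedulers, and swa_start from the example above:
>>> swa_model = torch.optim.swa_utils.AveragedModel(model)
>>> for i in range(300):
>>>     for input, target in loader:
>>>         optimizer.zero_grad()
>>>         loss_fn(model(input), target).backward()
>>>         optimizer.step()
>>>     if i > swa_start:
>>>         swa_model.update_parameters(model)  # accumulate running average of weights
>>>         swa_scheduler.step()
>>>     else:
>>>         scheduler.step()
>>> # Recompute batch-norm statistics for the averaged model before evaluation.
>>> torch.optim.swa_utils.update_bn(loader, swa_model)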
- load_state_dict(state_dict)[source]
- Load the scheduler’s state.
- Parameters
- state_dict (dict) – scheduler state. Should be an object returned from a call to state_dict().
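- For checkpointing, the scheduler state returned by state_dict() can be saved and later restored into a new SWALR instance. A minimal sketch (the checkpoint filename is illustrative):
>>> # Save the scheduler state (filename is illustrative).
>>> torch.save({"swa_scheduler": swa_scheduler.state_dict()}, "checkpoint.pt")
>>> # Later: restore into a freshly constructed SWALR with the same optimizer.
>>> checkpoint = torch.load("checkpoint.pt")
>>> swa_scheduler.load_state_dict(checkpoint["swa_scheduler"])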