SubgroupAccuracyDifference#
- class ignite.metrics.fairness.SubgroupAccuracyDifference(groups, is_multilabel=False, output_transform=<function SubgroupAccuracyDifference.<lambda>>, device=device(type='cpu'))[source]#
Calculates the Subgroup Accuracy Difference.
This metric computes the accuracy for each unique subgroup in the dataset and returns the maximum difference in accuracy between any two subgroups. It is a strict measure of how disparate a model's performance is across different categorical segments.
This metric is referred to as Overall Accuracy Equality in the fairness literature.
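The underlying computation can be sketched in plain PyTorch. The helper below is illustrative only (it is not part of ignite) and assumes single-label classification: it computes per-subgroup accuracy and returns the gap between the best- and worst-performing subgroups.

```python
import torch


def subgroup_accuracy_difference(y_pred, y, group_labels):
    """Illustrative sketch: max pairwise gap in per-subgroup accuracy."""
    # Reduce probabilities/logits to class indices if needed.
    pred_classes = y_pred.argmax(dim=1) if y_pred.ndim > 1 else y_pred
    accuracies = []
    for g in torch.unique(group_labels):
        mask = group_labels == g
        # Accuracy restricted to samples belonging to subgroup g.
        accuracies.append((pred_classes[mask] == y[mask]).float().mean())
    accs = torch.stack(accuracies)
    # Max pairwise difference equals (best subgroup - worst subgroup).
    return (accs.max() - accs.min()).item()
```

Note that the maximum difference over all pairs of subgroups reduces to the difference between the highest and lowest subgroup accuracies, which is why no explicit pairwise loop is needed.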
update must receive output of the form (y_pred, y, group_labels) or {'y_pred': y_pred, 'y': y, 'group_labels': group_labels}.
y_pred must be in the following shape (batch_size, num_categories, …) or (batch_size, …).
y must be in the following shape (batch_size, …).
group_labels must be a 1D tensor of shape (batch_size,) containing discrete labels.
- Parameters:
groups (Sequence[int]) – a sequence of unique group identifiers.
is_multilabel (bool) – if True, multilabel accuracy is calculated. By default, False.
output_transform (Callable) – a callable that is used to transform the Engine's process_function's output into the form expected by the metric.
device (device | str) – specifies which device updates are accumulated on. Setting the metric's device to be the same as your update arguments ensures the update method is non-blocking. By default, CPU.
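If the engine's process_function returns something other than the expected tuple, output_transform can adapt it. A minimal sketch, assuming the engine emits a dict whose keys "logits", "targets", and "sensitive_attr" are hypothetical names, not part of ignite:

```python
# Hypothetical adapter: the key names below are assumptions for illustration.
def extract_fairness_fields(output):
    """Map an engine output dict to the (y_pred, y, group_labels) tuple."""
    return output["logits"], output["targets"], output["sensitive_attr"]

# It would then be passed when constructing the metric, e.g.:
# metric = SubgroupAccuracyDifference(groups=[0, 1],
#                                     output_transform=extract_fairness_fields)
```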
Examples
To use with Engine and process_function, simply attach the metric instance to the engine. The output of the engine's process_function needs to be in the format of (y_pred, y, group_labels).

from collections import OrderedDict

import torch
from torch import nn, optim

from ignite.engine import *
from ignite.handlers import *
from ignite.metrics import *
from ignite.metrics.clustering import *
from ignite.metrics.fairness import *
from ignite.metrics.rec_sys import *
from ignite.metrics.regression import *
from ignite.utils import *

# create default evaluator for doctests

def eval_step(engine, batch):
    return batch

default_evaluator = Engine(eval_step)

# create default optimizer for doctests
param_tensor = torch.zeros([1], requires_grad=True)
default_optimizer = torch.optim.SGD([param_tensor], lr=0.1)

# create default trainer for doctests
# as handlers could be attached to the trainer,
# each test must define its own trainer using `.. testsetup:`

def get_default_trainer():

    def train_step(engine, batch):
        return batch

    return Engine(train_step)

# create default model for doctests
default_model = nn.Sequential(OrderedDict([
    ('base', nn.Linear(4, 2)),
    ('fc', nn.Linear(2, 1))
]))

manual_seed(666)
metric = SubgroupAccuracyDifference(groups=[0, 1])
metric.attach(default_evaluator, 'subgroup_acc_diff')

# Predictions for 4 items:
# Items 1 and 3 are predicted as class 0 (index 0 has highest prob)
# Items 2 and 4 are predicted as class 1 (index 1 has highest prob)
y_pred = torch.tensor([[0.9, 0.1], [0.1, 0.9], [0.8, 0.2], [0.2, 0.8]])

# Targets
y_true = torch.tensor([0, 1, 1, 0])

# Subgroups (e.g., 0=Demographic A, 1=Demographic B)
group_labels = torch.tensor([0, 0, 1, 1])

# Subgroup 0: 2 correct predictions, accuracy = 100%
# Subgroup 1: 0 correct predictions, accuracy = 0%
state = default_evaluator.run([[y_pred, y_true, group_labels]])
print(state.metrics['subgroup_acc_diff'])
1.0
New in version 0.5.4.
References
Verma & Rubin, Fairness Definitions Explained, 2018.
Methods