SubgroupAccuracyDifference#
- class ignite.metrics.fairness.SubgroupAccuracyDifference(groups, is_multilabel=False, output_transform=<function SubgroupAccuracyDifference.<lambda>>, device=device(type='cpu'))[source]#
Calculates the Subgroup Accuracy Difference.
This metric computes the accuracy for each unique subgroup in the dataset and returns the maximum difference in accuracy between any two subgroups. It is a strict measure of how disparate a model's performance is across different categorical segments.
This metric is referred to as Overall Accuracy Equality in the fairness literature.
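The underlying computation can be sketched in plain PyTorch. The helper below is illustrative only (it is not part of ignite) and assumes single-label classification: it computes per-subgroup accuracy and returns the gap between the best- and worst-performing subgroups.

```python
import torch


def subgroup_accuracy_difference(y_pred, y, group_labels):
    """Illustrative sketch: max pairwise gap in per-subgroup accuracy."""
    # Reduce probabilities/logits to class indices if needed.
    pred_classes = y_pred.argmax(dim=1) if y_pred.ndim > 1 else y_pred
    accuracies = []
    for g in torch.unique(group_labels):
        mask = group_labels == g
        # Accuracy restricted to samples belonging to subgroup g.
        accuracies.append((pred_classes[mask] == y[mask]).float().mean())
    accs = torch.stack(accuracies)
    # Max pairwise difference equals (best subgroup - worst subgroup).
    return (accs.max() - accs.min()).item()
```

Note that the maximum difference over all pairs of subgroups reduces to the difference between the highest and lowest subgroup accuracies, which is why no explicit pairwise loop is needed.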
update must receive output of the form (y_pred, y, group_labels) or {'y_pred': y_pred, 'y': y, 'group_labels': group_labels}.
y_pred must be in the following shape (batch_size, num_categories, …) or (batch_size, …).
y must be in the following shape (batch_size, …).
group_labels must be a 1D tensor of shape (batch_size,) containing discrete labels.
- Parameters:
groups (Sequence[int]) – a sequence of unique group identifiers.
is_multilabel (bool) – if True, multilabel accuracy is calculated. By default, False.
output_transform (Callable) – a callable that is used to transform the Engine's process_function's output into the form expected by the metric.
device (device | str) – specifies which device updates are accumulated on. Setting the metric's device to be the same as your update arguments ensures the update method is non-blocking. By default, CPU.
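If the engine's process_function returns something other than the expected tuple, output_transform can adapt it. A minimal sketch, assuming the engine emits a dict whose keys "logits", "targets", and "sensitive_attr" are hypothetical names, not part of ignite:

```python
# Hypothetical adapter: the key names below are assumptions for illustration.
def extract_fairness_fields(output):
    """Map an engine output dict to the (y_pred, y, group_labels) tuple."""
    return output["logits"], output["targets"], output["sensitive_attr"]

# It would then be passed when constructing the metric, e.g.:
# metric = SubgroupAccuracyDifference(groups=[0, 1],
#                                     output_transform=extract_fairness_fields)
```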
Examples
To use with Engine and process_function, simply attach the metric instance to the engine. The output of the engine's process_function needs to be in the format of (y_pred, y, group_labels).

from collections import OrderedDict

import torch
from torch import nn, optim

from ignite.engine import *
from ignite.handlers import *
from ignite.metrics import *
from ignite.metrics.clustering import *
from ignite.metrics.fairness import *
from ignite.metrics.rec_sys import *
from ignite.metrics.regression import *
from ignite.utils import *

# create default evaluator for doctests

def eval_step(engine, batch):
    return batch

default_evaluator = Engine(eval_step)

# create default optimizer for doctests
param_tensor = torch.zeros([1], requires_grad=True)
default_optimizer = torch.optim.SGD([param_tensor], lr=0.1)

# create default trainer for doctests
# as handlers could be attached to the trainer,
# each test must define its own trainer using `.. testsetup:`

def get_default_trainer():

    def train_step(engine, batch):
        return batch

    return Engine(train_step)

# create default model for doctests
default_model = nn.Sequential(OrderedDict([
    ('base', nn.Linear(4, 2)),
    ('fc', nn.Linear(2, 1))
]))

manual_seed(666)
metric = SubgroupAccuracyDifference(groups=[0, 1])
metric.attach(default_evaluator, 'subgroup_acc_diff')

# Predictions for 4 items:
# Items 1 and 3 are predicted as class 0 (index 0 has highest prob)
# Items 2 and 4 are predicted as class 1 (index 1 has highest prob)
y_pred = torch.tensor([[0.9, 0.1], [0.1, 0.9], [0.8, 0.2], [0.2, 0.8]])

# Targets
y_true = torch.tensor([0, 1, 1, 0])

# Subgroups (e.g., 0=Demographic A, 1=Demographic B)
group_labels = torch.tensor([0, 0, 1, 1])

# Subgroup 0: 2 correct predictions, accuracy = 100%
# Subgroup 1: 0 correct predictions, accuracy = 0%
state = default_evaluator.run([[y_pred, y_true, group_labels]])
print(state.metrics['subgroup_acc_diff'])
1.0
New in version 0.5.4.
References
Verma & Rubin, Fairness Definitions Explained, 2018.
Methods