ClassificationReport

ignite.metrics.ClassificationReport(beta=1, output_dict=False, output_transform=<function <lambda>>, device=device(type='cpu'), is_multilabel=False, labels=None, metrics_result_mode='both')

Build a text report showing the main classification metrics. The report is similar in functionality to scikit-learn's classification_report, but the underlying implementation doesn’t use the sklearn function.

Parameters:

beta (int) – weight of precision in the harmonic mean used for the F-beta score; the default of 1 gives the F1 score
output_dict (bool) – If True, return output as dict, otherwise return a str
output_transform (Callable) – a callable that is used to transform the Engine’s process_function’s output into the form expected by the metric. This can be useful if, for example, you have a multi-output model and you want to compute the metric with respect to one of the outputs.
is_multilabel (bool) – If True, the tensors are assumed to be multilabel.
device (str | device) – optional device specification for internal storage.
labels (list[str] | None) – Optional list of label indices to include in the report
metrics_result_mode (Literal['flatten', 'named', 'both']) – specifies how to put the computed metrics results into the engine.state.metrics dictionary. Valid values are:
    - “flatten”: if the computed result is a mapping, its keys/values are put directly into the engine state metrics dictionary
    - “named”: if the computed result is a mapping, the whole mapping is put into the engine state metrics dictionary under the metric name
    - “both”: combination of “flatten” and “named”.
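To make the role of beta concrete, here is a minimal sketch of the textbook F-beta formula in plain Python. It is an illustration only: the helper name fbeta_score is hypothetical, and the library's own implementation may differ in details such as numerical smoothing.

```python
def fbeta_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Textbook F-beta: a weighted harmonic mean of precision and recall."""
    denominator = beta**2 * precision + recall
    if denominator == 0.0:
        # Convention chosen for this sketch when precision and recall are both 0.
        return 0.0
    return (1.0 + beta**2) * precision * recall / denominator


# Precision 1.0 and recall 0.5, the values reported for class "1"
# in the multiclass example of this page.
f1 = fbeta_score(1.0, 0.5, beta=1)  # plain harmonic mean (F1)
f2 = fbeta_score(1.0, 0.5, beta=2)  # low recall is penalised more strongly
print(round(f1, 4), round(f2, 4))  # prints: 0.6667 0.5556
```

With beta=1 the two quantities contribute equally; increasing beta pulls the score towards the recall value.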

Return type:

MetricsLambda

Examples

For more information on how metrics work with Engine, see the Attach Engine API documentation.

from collections import OrderedDict

import torch
from torch import nn, optim

# explicit imports: only the names used in the setup and examples below
from ignite.engine import Engine
from ignite.metrics import ClassificationReport
from ignite.utils import manual_seed

# create default evaluator for doctests

def eval_step(engine, batch):
    return batch

default_evaluator = Engine(eval_step)

# create default optimizer for doctests

param_tensor = torch.zeros([1], requires_grad=True)
default_optimizer = torch.optim.SGD([param_tensor], lr=0.1)

# create default trainer for doctests
# as handlers could be attached to the trainer,
# each test must define its own trainer using `.. testsetup:`

def get_default_trainer():

    def train_step(engine, batch):
        return batch

    return Engine(train_step)

# create default model for doctests

default_model = nn.Sequential(OrderedDict([
    ('base', nn.Linear(4, 2)),
    ('fc', nn.Linear(2, 1))
]))

manual_seed(666)

Multiclass case

metric = ClassificationReport(output_dict=True)
metric.attach(default_evaluator, "cr")
y_true = torch.tensor([2, 0, 2, 1, 0, 1])
y_pred = torch.tensor([
    [0.0266, 0.1719, 0.3055],
    [0.6886, 0.3978, 0.8176],
    [0.9230, 0.0197, 0.8395],
    [0.1785, 0.2670, 0.6084],
    [0.8448, 0.7177, 0.7288],
    [0.7748, 0.9542, 0.8573],
])
state = default_evaluator.run([[y_pred, y_true]])
print(state.metrics["cr"].keys())
print(state.metrics["cr"]["0"])
print(state.metrics["cr"]["1"])
print(state.metrics["cr"]["2"])
print(state.metrics["cr"]["macro avg"])

dict_keys(['0', '1', '2', 'macro avg'])
{'precision': 0.5, 'recall': 0.5, 'f1-score': 0.4999...}
{'precision': 1.0, 'recall': 0.5, 'f1-score': 0.6666...}
{'precision': 0.3333..., 'recall': 0.5, 'f1-score': 0.3999...}
{'precision': 0.6111..., 'recall': 0.5, 'f1-score': 0.5222...}
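The numbers above can be checked by hand. The following sketch recomputes them in plain Python, assuming the predicted class is the argmax of each row of y_pred and that "macro avg" is the unweighted mean over classes. The helper per_class_report is hypothetical and is not part of ignite; the values agree with the printed output up to floating-point detail.

```python
def per_class_report(predicted, target, num_classes):
    """Per-class precision/recall/F1 plus their unweighted (macro) mean."""
    report = {}
    for cls in range(num_classes):
        # True positives: samples predicted as cls that really are cls.
        tp = sum(1 for p, t in zip(predicted, target) if p == cls and t == cls)
        n_pred = predicted.count(cls)  # how often cls was predicted
        n_true = target.count(cls)     # how often cls is the true class
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true if n_true else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        report[str(cls)] = {"precision": precision, "recall": recall, "f1-score": f1}
    keys = ("precision", "recall", "f1-score")
    report["macro avg"] = {
        k: sum(report[str(c)][k] for c in range(num_classes)) / num_classes
        for k in keys
    }
    return report


y_true = [2, 0, 2, 1, 0, 1]
y_pred = [
    [0.0266, 0.1719, 0.3055],
    [0.6886, 0.3978, 0.8176],
    [0.9230, 0.0197, 0.8395],
    [0.1785, 0.2670, 0.6084],
    [0.8448, 0.7177, 0.7288],
    [0.7748, 0.9542, 0.8573],
]
# Index of the largest score in each row.
predicted = [row.index(max(row)) for row in y_pred]
print(predicted)  # prints: [2, 2, 0, 2, 0, 1]

report = per_class_report(predicted, y_true, num_classes=3)
print(report["macro avg"])
```

Each class has exactly one true positive here, which is why every recall is 0.5 while the precisions differ with the number of times the class was predicted.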

Multilabel case. The shapes of y_pred and y_true must be (batch_size, num_categories, …):

metric = ClassificationReport(output_dict=True, is_multilabel=True)
metric.attach(default_evaluator, "cr")
y_true = torch.tensor([
    [0, 0, 1],
    [0, 0, 0],
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 1],
])
y_pred = torch.tensor([
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [1, 0, 1],
    [1, 1, 0],
])
state = default_evaluator.run([[y_pred, y_true]])
print(state.metrics["cr"].keys())
print(state.metrics["cr"]["0"])
print(state.metrics["cr"]["1"])
print(state.metrics["cr"]["2"])
print(state.metrics["cr"]["macro avg"])

dict_keys(['0', '1', '2', 'macro avg'])
{'precision': 0.2, 'recall': 1.0, 'f1-score': 0.3333...}
{'precision': 0.5, 'recall': 1.0, 'f1-score': 0.6666...}
{'precision': 0.0, 'recall': 0.0, 'f1-score': 0.0}
{'precision': 0.2333..., 'recall': 0.6666..., 'f1-score': 0.3333...}
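In the multilabel case each column can be read as an independent binary problem. The sketch below reproduces the per-label numbers above in plain Python under that assumption. The helper multilabel_report is hypothetical and only mirrors the arithmetic, not ignite's implementation.

```python
def multilabel_report(y_pred, y_true):
    """Column-wise (precision, recall, F1) for 0/1 indicator matrices."""
    num_labels = len(y_true[0])
    report = {}
    for j in range(num_labels):
        pred_col = [row[j] for row in y_pred]
        true_col = [row[j] for row in y_true]
        # A true positive is a position where both indicators are 1.
        tp = sum(p * t for p, t in zip(pred_col, true_col))
        n_pred, n_true = sum(pred_col), sum(true_col)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true if n_true else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        report[str(j)] = (precision, recall, f1)
    return report


y_true = [[0, 0, 1], [0, 0, 0], [0, 0, 0], [1, 0, 0], [0, 1, 1]]
y_pred = [[1, 1, 0], [1, 0, 1], [1, 0, 0], [1, 0, 1], [1, 1, 0]]

labels_report = multilabel_report(y_pred, y_true)
for label, scores in labels_report.items():
    print(label, scores)
```

Label "0" is predicted for all five samples but is true for only one, hence precision 0.2 with recall 1.0. Label "2" is never predicted where it is true, so all three scores are 0.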

Changed in version 0.5.4: added metrics_result_mode argument.
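The three values of metrics_result_mode described in the parameter list can be pictured with plain dictionaries. The sketch below is not ignite code: place_result is a hypothetical helper that only mimics the documented placement rules, and storing a non-mapping result under the metric name is an assumption of this sketch.

```python
from collections.abc import Mapping


def place_result(state_metrics, name, result, mode="both"):
    """Hypothetical stand-in for how a result lands in engine.state.metrics."""
    if mode not in ("flatten", "named", "both"):
        raise ValueError(f"unknown mode: {mode!r}")
    is_mapping = isinstance(result, Mapping)
    if is_mapping and mode in ("flatten", "both"):
        # "flatten": keys/values of the mapping go directly into the dictionary.
        state_metrics.update(result)
    if not is_mapping or mode in ("named", "both"):
        # "named": the whole result is stored under the metric name.
        # Assumption of this sketch: non-mapping results are always stored this way.
        state_metrics[name] = result
    return state_metrics


result = {
    "0": {"precision": 0.5, "recall": 0.5},
    "macro avg": {"precision": 0.6, "recall": 0.5},
}
flat = place_result({}, "cr", result, mode="flatten")
named = place_result({}, "cr", result, mode="named")
both = place_result({}, "cr", result, mode="both")
print(sorted(flat))   # prints: ['0', 'macro avg']
print(sorted(named))  # prints: ['cr']
print(sorted(both))   # prints: ['0', 'cr', 'macro avg']
```

The examples on this page read state.metrics["cr"], which relies on the whole report being available under the metric name, as is the case with the default "both".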
