quantificationlib.metrics.binary module

Score functions and loss functions for binary quantification problems

absolute_error(p_true, p_pred)

Binary version of the absolute error

Absolute difference between the predicted prevalence (\(\hat{p}\)) and the true prevalence (\(p\))

\(ae = | \hat{p} - p |\)

Parameters:
  • p_true (float) – True prevalence for the positive class

  • p_pred (float) – Predicted prevalence for the positive class

Returns:

absolute error – The absolute error for binary problems

Return type:

float
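
The formula above can be sketched in a few lines of plain Python. This is an illustration written from the formula, not the library's implementation; the name `absolute_error_sketch` and the sample prevalences are hypothetical.

```python
def absolute_error_sketch(p_true, p_pred):
    """Illustrative only: |p_pred - p_true| for the positive class."""
    return abs(p_pred - p_true)


# Hypothetical sample with 30% positives, estimated as 25% and as 35%.
ae_under = absolute_error_sketch(p_true=0.30, p_pred=0.25)
ae_over = absolute_error_sketch(p_true=0.30, p_pred=0.35)

# Both estimates are off by 0.05, so the metric cannot tell them apart:
# the direction of the error is discarded.
print(ae_under, ae_over)
```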

bias(p_true, p_pred)

Bias of a binary quantifier

The signed difference between the predicted prevalence (\(\hat{p}\)) and the true prevalence (\(p\))

\(bias = \hat{p} - p\)

It measures whether the binary quantifier tends to overestimate (positive bias) or underestimate (negative bias) the proportion of positives

Parameters:
  • p_true (float) – True prevalence for the positive class

  • p_pred (float) – Predicted prevalence for the positive class

Returns:

bias – The bias for binary problems

Return type:

float
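
Because the bias keeps its sign, averaging it over several test samples shows the direction in which a quantifier tends to err. The sketch below assumes a hypothetical list of (true, predicted) prevalence pairs; `bias_sketch` is an illustrative name, not the library function.

```python
def bias_sketch(p_true, p_pred):
    """Illustrative only: signed error, predicted minus true prevalence."""
    return p_pred - p_true


# Hypothetical (true, predicted) prevalences from four test samples.
pairs = [(0.10, 0.14), (0.30, 0.33), (0.50, 0.49), (0.70, 0.76)]

biases = [bias_sketch(p, p_hat) for p, p_hat in pairs]
mean_bias = sum(biases) / len(biases)

# A positive mean suggests a tendency to overestimate the positives,
# a negative mean a tendency to underestimate them.
print(mean_bias)
```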

binary_kld(p_true, p_pred, eps=1e-12)

A binary version of the Kullback-Leibler divergence (KLD)

\(kld = p \cdot \log(p/\hat{p}) + (1-p) \cdot \log((1-p)/(1-\hat{p}))\)

Parameters:
  • p_true (array_like, shape = (n_classes)) – True prevalences

  • p_pred (array_like, shape = (n_classes)) – Predicted prevalences

  • eps (float, (default=1e-12)) – Small constant used to prevent division by zero

Returns:

KLD – The Kullback-Leibler divergence for binary problems

Return type:

float
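
A minimal sketch of the KLD formula, taking the positive-class prevalences as scalars. Two points are assumptions rather than documented behavior: the sketch uses the natural logarithm, and it applies `eps` by clipping both prevalences into `[eps, 1 - eps]`. How the library actually uses `eps`, and which logarithm base it uses, is not stated here. The name `binary_kld_sketch` is hypothetical.

```python
import math


def binary_kld_sketch(p_true, p_pred, eps=1e-12):
    """Illustrative only: binary KLD between true and predicted prevalence."""
    # Assumption: clip into [eps, 1 - eps] so no logarithm or division
    # ever sees an exact 0.
    p = min(max(p_true, eps), 1.0 - eps)
    q = min(max(p_pred, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


# Zero when the estimate is exact, positive otherwise.
exact = binary_kld_sketch(0.3, 0.3)
off = binary_kld_sketch(0.3, 0.5)

# Extreme prevalences stay finite thanks to the clipping.
extreme = binary_kld_sketch(0.0, 1.0)
print(exact, off, extreme)
```

Unlike the absolute error, the KLD is not symmetric: swapping the true and predicted prevalences generally changes the value.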

normalized_absolute_score(p_true, p_pred)

A score version of the normalized binary absolute error

\(nas = 1 - | \hat{p} - p | / \max(p, 1-p)\)

Parameters:
  • p_true (float) – True prevalence for the positive class

  • p_pred (float) – Predicted prevalence for the positive class

Returns:

NAS – The normalized absolute score for binary problems

Return type:

float
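
For a given true prevalence, \(\max(p, 1-p)\) is the largest absolute error a prediction in [0, 1] can make, so the score ranges from 0 (worst possible estimate) to 1 (exact estimate). The sketch below illustrates this; `normalized_absolute_score_sketch` is a hypothetical name and the code is written from the formula, not taken from the library.

```python
def normalized_absolute_score_sketch(p_true, p_pred):
    """Illustrative only: 1 - |p_pred - p_true| / max(p_true, 1 - p_true)."""
    worst_case = max(p_true, 1.0 - p_true)  # largest attainable absolute error
    return 1.0 - abs(p_pred - p_true) / worst_case


# Hypothetical sample with 20% positives.
perfect = normalized_absolute_score_sketch(0.2, 0.2)  # exact estimate
halfway = normalized_absolute_score_sketch(0.2, 0.6)  # half the worst-case error
worst = normalized_absolute_score_sketch(0.2, 1.0)    # worst possible estimate
print(perfect, halfway, worst)
```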

normalized_squared_score(p_true, p_pred)

A score version of the normalized binary squared error

\(nss = 1 - ( (\hat{p} - p) / \max(p, 1-p) )^2\)

Parameters:
  • p_true (float) – True prevalence for the positive class

  • p_pred (float) – Predicted prevalence for the positive class

Returns:

NSS – The normalized squared score for binary problems

Return type:

float
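
A short sketch of the formula, under the same caveat that it is written from the definition and is not the library's code. The name `normalized_squared_score_sketch` is hypothetical. Because the normalized error is squared before being subtracted from 1, an estimate that is off by half the worst-case error loses only a quarter of the score.

```python
def normalized_squared_score_sketch(p_true, p_pred):
    """Illustrative only: 1 - ((p_pred - p_true) / max(p_true, 1 - p_true))**2."""
    normalized_error = (p_pred - p_true) / max(p_true, 1.0 - p_true)
    return 1.0 - normalized_error ** 2


# Hypothetical sample with 20% positives, estimated as 60%:
# the normalized error is 0.4 / 0.8 = 0.5, so the score is 1 - 0.25.
nss = normalized_squared_score_sketch(p_true=0.2, p_pred=0.6)
print(nss)
```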

relative_absolute_error(p_true, p_pred, eps=1e-12)

A binary relative version of the absolute error

It is the ratio of the absolute error to the true prevalence.

\(rae = | \hat{p} - p | / p\)

Parameters:
  • p_true (float) – True prevalence for the positive class

  • p_pred (float) – Predicted prevalence for the positive class

  • eps (float, (default=1e-12)) – Small constant used to prevent division by zero

Returns:

RAE – The relative absolute error for binary problems

Return type:

float
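
The sketch below shows why the relative version matters: the same absolute error weighs far more when the positive class is rare. It is written from the formula only. In particular, using `eps` as a lower bound on the denominator is an assumption about how division by zero might be avoided, not the library's documented behavior, and `relative_absolute_error_sketch` is a hypothetical name.

```python
def relative_absolute_error_sketch(p_true, p_pred, eps=1e-12):
    """Illustrative only: |p_pred - p_true| / p_true."""
    # Assumption: floor the denominator at eps so p_true == 0 does not
    # raise ZeroDivisionError.
    return abs(p_pred - p_true) / max(p_true, eps)


# The same absolute error of 0.05 at two hypothetical prevalences.
rae_rare = relative_absolute_error_sketch(p_true=0.10, p_pred=0.15)
rae_balanced = relative_absolute_error_sketch(p_true=0.50, p_pred=0.55)

# The error is five times larger, relatively, for the rare class.
print(rae_rare, rae_balanced)
```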

squared_error(p_true, p_pred)

Binary version of the squared error. Only the prevalence of the positive class is used

It is the squared difference between the predicted prevalence (\(\hat{p}\)) and the true prevalence (\(p\))

\(se = (\hat{p} - p)^2\)

Because the difference is squared, large errors are penalized more heavily than small ones

Parameters:
  • p_true (float) – True prevalence for the positive class

  • p_pred (float) – Predicted prevalence for the positive class

Returns:

squared_error – The squared error for binary problems

Return type:

float
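
To make the penalty concrete, the sketch below compares two hypothetical estimates for the same sample. Doubling the size of the error multiplies the squared error by four. `squared_error_sketch` is an illustrative name, not the library function.

```python
def squared_error_sketch(p_true, p_pred):
    """Illustrative only: (p_pred - p_true)**2 for the positive class."""
    return (p_pred - p_true) ** 2


# Hypothetical sample with 30% positives.
se_small = squared_error_sketch(p_true=0.3, p_pred=0.4)  # error of 0.1
se_large = squared_error_sketch(p_true=0.3, p_pred=0.5)  # error of 0.2

# Ratio of the two squared errors: about 4 for a doubled error.
print(se_large / se_small)
```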

symmetric_absolute_percentage_error(p_true, p_pred)

A symmetric binary version of the relative absolute error (RAE)

\(sape = | \hat{p} - p | / (\hat{p} + p)\)

Parameters:
  • p_true (float) – True prevalence for the positive class

  • p_pred (float) – Predicted prevalence for the positive class

Returns:

SAPE – The symmetric absolute percentage error for binary problems

Return type:

float
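
A minimal sketch of the formula. Dividing by \(\hat{p} + p\) instead of \(p\) means that swapping the true and predicted prevalences leaves the value unchanged, and for prevalences in [0, 1] the result stays between 0 and 1. Returning 0 when both prevalences are zero is an assumption made here to avoid a division by zero; the documented signature has no `eps` and the library's handling of that case is not stated. The name `sape_sketch` is hypothetical.

```python
def sape_sketch(p_true, p_pred):
    """Illustrative only: |p_pred - p_true| / (p_pred + p_true)."""
    denominator = p_pred + p_true
    if denominator == 0:
        # Assumption: both prevalences are 0, so the estimate is exact.
        return 0.0
    return abs(p_pred - p_true) / denominator


# Swapping the two hypothetical prevalences gives the same value.
forward = sape_sketch(p_true=0.1, p_pred=0.3)
swapped = sape_sketch(p_true=0.3, p_pred=0.1)
print(forward, swapped)
```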