quantificationlib.base module

Base classes for all quantifiers

class BaseQuantifier

Bases: BaseEstimator

Base class for binary, multiclass and ordinal quantifiers

class UsingClassifiers(estimator_train=None, estimator_test=None, needs_predictions_train=True, probabilistic_predictions=True, verbose=0, **kwargs)

Bases: BaseQuantifier

Base class for quantifiers based on the use of classifiers

Classes derived from this abstract class work in two different ways:

  1. Two estimators classify the examples of the training set and the testing set so that the distributions of both sets can be computed. The estimators may already be trained.

  2. The predictions for the examples are passed directly to the fit/predict methods. This is useful for synthetic or artificial experiments.

In both cases the idea is to guarantee that all classifier-based methods use exactly the same predictions when quantifiers of this kind are compared. In the first case, the estimators are trained only once and can be shared among several quantifiers of this kind.

This class is responsible for fitting the estimators (when needed) and for computing the predictions for the training set and the testing set.
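As a rough illustration of why shared predictions matter, the sketch below feeds one set of test predictions to two simple prevalence estimators. It does not use quantificationlib: the function names and the toy data are hypothetical, and the second estimator (a plain average of probabilities) is chosen only for brevity.

```python
from collections import Counter


def classify_and_count(probs, classes):
    """CC-style estimate: fraction of examples assigned to each class."""
    # Crisp label of a row = class with the highest predicted probability.
    labels = [classes[max(range(len(row)), key=row.__getitem__)] for row in probs]
    counts = Counter(labels)
    return [counts[c] / len(probs) for c in classes]


def average_probabilities(probs, classes):
    """Probabilistic estimate: mean predicted probability per class."""
    n = len(probs)
    return [sum(row[j] for row in probs) / n for j in range(len(classes))]


classes = [0, 1]
# Hypothetical stand-in for estimator_test.predict_proba(X) on one testing bag.
shared_predictions_test = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.7, 0.3]]

# Both estimators see exactly the same predictions, so any difference in
# their output comes from the quantification method, not from the classifier.
cc_prevalences = classify_and_count(shared_predictions_test, classes)
prob_prevalences = average_probabilities(shared_predictions_test, classes)
```

With the real classes, the same effect is obtained either by sharing the estimator objects or by passing predictions_train/predictions_test to fit/predict.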

Parameters:
  • estimator_train (estimator object, optional, (default=None)) – An estimator object implementing fit and one of predict or predict_proba. It is used to classify the examples of the training set and to obtain their distribution

  • estimator_test (estimator object, optional, (default=None)) – An estimator object implementing fit and one of predict or predict_proba. It is used to classify the examples of the testing bag and to obtain their distribution. For some experiments both estimators could be the same

  • needs_predictions_train (bool, (default=True)) – True if the quantifier needs to estimate the training distribution

  • probabilistic_predictions (bool, optional, (default=True)) – Whether the estimators return probabilistic predictions or not. This depends on the specific quantifier: some need crisp predictions (e.g. CC), while others require probabilistic predictions (PAC, HDy, …)

  • verbose (int, optional, (default=0)) – The verbosity level. The default value, zero, means silent mode

estimator_train

Estimator used to classify the examples of the training set

Type:

estimator

estimator_test

Estimator used to classify the examples of the testing bag

Type:

estimator

needs_predictions_train

True if the quantifier needs to estimate the training distribution

Type:

bool

probabilistic_predictions

Whether the estimators return probabilistic predictions or not

Type:

bool

predictions_train_

Predictions of the examples in the training set

Type:

ndarray, shape (n_examples, n_classes) (probabilistic)

predictions_test_

Predictions of the examples in the testing bag

Type:

ndarray, shape (n_examples, n_classes) (probabilistic)

classes_

Class labels

Type:

ndarray, shape (n_classes, )

y_ext_

True labels of the training set, repeated (as with repmat) to match the length of predictions_train_. When CV_estimator is used with averaged_predictions=False, predictions_train_ has more rows than y (by a factor of n_repetitions * n_folds of the underlying CV_estimator). In other cases, y_ext_ == y. Use y_ext_ instead of y in fit/predict methods whenever the true labels of the training set are needed

Type:

ndarray, shape(len(predictions_train_), )
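The relation between y and y_ext_ can be sketched as follows. This is a pure-Python illustration, not the library's code: the helper name extend_labels is hypothetical, and it assumes the labels are tiled as whole copies of y (the usual meaning of repmat) rather than repeated element by element.

```python
def extend_labels(y, n_predictions):
    """Repeat y so that it lines up with n_predictions rows of predictions."""
    if n_predictions % len(y) != 0:
        raise ValueError("number of predictions must be a multiple of len(y)")
    factor = n_predictions // len(y)
    # Assumption: whole copies of y are concatenated, like numpy.tile.
    return list(y) * factor


y = [0, 1, 1]
n_repetitions, n_folds = 2, 3
# With non-averaged cross-validated predictions there is one block of
# predictions per (repetition, fold) pair.
n_predictions = len(y) * n_repetitions * n_folds
y_ext = extend_labels(y, n_predictions)
```

When the predictions have as many rows as y, the factor is 1 and the result is simply a copy of y.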

verbose

The verbosity level

Type:

int

Notes

In each of the pairs estimator_train/predictions_train and estimator_test/predictions_test, at least one element must not be None. If both are None, a ValueError exception is raised. If both are not None, predictions_train/predictions_test are used
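The rule in this note can be written as a small decision function. This is a minimal sketch: select_prediction_source is a hypothetical helper for illustration, not part of the library.

```python
def select_prediction_source(estimator, predictions):
    """Return which source the note says is used for a given pair."""
    if predictions is not None:
        # Explicit predictions win, even when an estimator is also given.
        return "predictions"
    if estimator is not None:
        return "estimator"
    raise ValueError("estimator and predictions cannot both be None")


dummy_estimator = object()        # stands in for any fitted classifier
dummy_predictions = [[0.5, 0.5]]  # stands in for predictions_train/predictions_test

source_when_both = select_prediction_source(dummy_estimator, dummy_predictions)
source_when_estimator_only = select_prediction_source(dummy_estimator, None)
```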

fit(X, y, predictions_train=None)

Fits the estimators (estimator_train and estimator_test) and computes the predictions for the training set (predictions_train_ attribute)

First, the method checks that estimator_train and predictions_train are not both None

Then, it fits both estimators if needed. It checks whether the estimators are already trained or not by calling the predict method.
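One way to picture the "already trained" check is a trial call to predict that triggers training only when it fails. The sketch below uses hypothetical toy classes so that it runs without any dependency. With scikit-learn estimators the exception raised by an unfitted model is sklearn.exceptions.NotFittedError, and the library's actual check may differ in detail.

```python
class NotFittedYet(Exception):
    """Hypothetical stand-in for sklearn.exceptions.NotFittedError."""


class ToyClassifier:
    """Hypothetical majority-class classifier that counts calls to fit."""

    def __init__(self):
        self.fit_calls = 0

    def fit(self, X, y):
        self.fit_calls += 1
        # Remember the most frequent label seen during training.
        self.majority_ = max(set(y), key=list(y).count)
        return self

    def predict(self, X):
        if not hasattr(self, "majority_"):
            raise NotFittedYet("call fit before predict")
        return [self.majority_ for _ in X]


def fit_if_needed(estimator, X, y):
    """Fit the estimator only when a trial call to predict fails."""
    try:
        estimator.predict(X[:1])  # probe with a single example
    except NotFittedYet:
        estimator.fit(X, y)
    return estimator


X = [[0.0], [1.0], [2.0]]
y = [1, 1, 0]
clf = ToyClassifier()
fit_if_needed(clf, X, y)  # first call trains the classifier
fit_if_needed(clf, X, y)  # second call finds it trained and skips fit
```

This is what allows one estimator to be trained once and then reused by several quantifiers.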

Finally, the method computes predictions_train_ (only when the needs_predictions_train attribute is True) using predictions_train or estimator_train. If predictions_train is not None, predictions_train_ is copied from predictions_train (and converted to crisp values, using the __probs2crisps method, when probabilistic_predictions is False). If predictions_train is None, predictions_train_ is computed using the predict or predict_proba method of estimator_train, depending again on the value of the probabilistic_predictions attribute.
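The conversion from probabilistic to crisp predictions can be sketched as an argmax over each row. The function below is a hypothetical illustration, not the private method mentioned above. It assumes that the columns follow the order of the class labels and that ties go to the first class.

```python
def probs_to_crisp(probs, classes):
    """Map each row of class probabilities to the label with the highest one."""
    crisp = []
    for row in probs:
        # max returns the first maximal index, so ties go to the first class.
        best = max(range(len(row)), key=lambda j: row[j])
        crisp.append(classes[best])
    return crisp


classes = ["neg", "pos"]
# Hypothetical probabilistic predictions passed as predictions_train.
predictions_train = [[0.8, 0.2], [0.3, 0.7], [0.5, 0.5]]
crisp_train = probs_to_crisp(predictions_train, classes)
```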

Parameters:
  • X (array-like, shape (n_examples, n_features)) – Data

  • y (array-like, shape (n_examples, )) – True classes

  • predictions_train (ndarray, optional, shape(n_examples, 1) crisp or shape (n_examples, n_classes) (probs)) – Predictions of the examples in the training set

Raises:

ValueError – When estimator_train and predictions_train are both None

predict(X, predictions_test=None)

Computes the predictions for the testing set (predictions_test_ attribute)

First, the method checks that estimator_test and predictions_test are not both None; otherwise a ValueError exception is raised.

Then, it computes predictions_test_. If predictions_test is not None, predictions_test_ is copied from predictions_test (and converted to crisp values, using the __probs2crisp method, when the probabilistic_predictions attribute is False). If predictions_test is None, predictions_test_ is computed by calling the predict or predict_proba method of estimator_test, depending on the value of the probabilistic_predictions attribute.

Parameters:
  • X ((sparse) array-like, shape (n_examples, n_features)) – Data

  • predictions_test (ndarray, shape (n_examples, n_classes) (default=None)) – Predictions for the testing bag

Raises:

ValueError – When estimator_test and predictions_test are both None

set_fit_request(*, predictions_train='$UNCHANGED$')

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • predictions_train (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for predictions_train parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_predict_request(*, predictions_test='$UNCHANGED$')

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • predictions_test (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for predictions_test parameter in predict.

Returns:

self – The updated object.

Return type:

object

class WithoutClassifiers(verbose=0, **kwargs)

Bases: BaseQuantifier

Base class for quantifiers that do not use any classifier

Examples of this type of quantifier are HDX and EDX

Parameters:

verbose (int, optional, (default=0)) – The verbosity level. The default value, zero, means silent mode

verbose

The verbosity level

Type:

int

classes_

Class labels

Type:

ndarray, shape (n_classes, )

fit(X, y)

This method just checks X and y and stores the classes of the dataset

Parameters:
  • X (array-like, shape (n_examples, n_features)) – Data

  • y (array-like, shape (n_examples, )) – True classes
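The behaviour described for this fit method can be sketched with a toy class. ToyWithoutClassifiers is hypothetical and dependency-free. The real implementation presumably validates the input with scikit-learn utilities and stores classes_ as an ndarray, which this sketch only approximates with a length check and a sorted list.

```python
class ToyWithoutClassifiers:
    """Hypothetical stand-in that mimics the documented fit behaviour."""

    def __init__(self, verbose=0):
        self.verbose = verbose

    def fit(self, X, y):
        # Minimal consistency check between the data and the labels.
        if len(X) != len(y):
            raise ValueError("X and y must have the same number of examples")
        # Store the distinct class labels in sorted order.
        self.classes_ = sorted(set(y))
        return self


X = [[0.1, 1.0], [0.4, 0.2], [0.9, 0.5], [0.3, 0.3]]
y = [2, 0, 1, 0]
quantifier = ToyWithoutClassifiers().fit(X, y)
```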