quantificationlib.estimators.ordinal_ddag module

Estimator based on DDAGs (Decision Directed Acyclic Graphs)

class DDAGClassifier(estimator, n_jobs=None, predict_method='full_probabilistic', verbose=0)[source]

Bases: OneVsOneClassifier

Decision Directed Acyclic Graph ordinal classifier

This strategy consists of learning a classifier for each pair of classes, so it requires fitting n_classes * (n_classes - 1) / 2 classifiers. For this reason, this class derives from OneVsOneClassifier and reuses most of its functionality, mainly to train the binary models.

However, there are two main differences with respect to OneVsOneClassifier:

  1. This class is intended for ordinal classification; it does not make sense to use it for general multiclass classification

  2. The rule to make predictions is different. Here, the binary classifiers are arranged into a binary tree in which the classifier selected at each node is the one that deals with the two most distant remaining classes. Thus, the root contains the classifier that decides between the first class of the order and the last one, and this idea is applied recursively.

Example:

In an ordinal classification problem with classes ranging from 1-star to 5-star,
the corresponding DDAG is

                                          1|5

                     1|4                                       2|5

           1|3                 2|4                   2|4                  3|5

      1|2       2|3       2|3       3|4         2|3       3|4        3|4       4|5

    1     2   2     3   2     3   3     4     2     3   3     4    3     4   4     5

Since some subtrees are shared by different branches (for instance, the subtree rooted at node
2|4 is shared by the right subtree of 1|4 and the left subtree of 2|5), the tree can be depicted
more compactly:

                                          1|5

                                  1|4             2|5

                           1|3            2|4            3|5

                     1|2          2|3             3|4           4|5

                 1          2              3              4             5

in which all internal nodes (2|4, 2|3 and 3|4) and all leaves except the first and the
last one (2, 3 and 4) can be reached through different paths.
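The compact structure above can be reproduced with a short sketch (illustrative only, not library code; the function name is ours): the node for the remaining class range [lo, hi] holds the binary classifier lo|hi, and its children handle [lo, hi-1] and [lo+1, hi], so shared subtrees appear only once per level.

```python
# Illustrative sketch (not library code): enumerate the DDAG levels for
# n_classes ordered classes. The node for the remaining range [lo, hi]
# holds classifier lo|hi; its children handle [lo, hi-1] and [lo+1, hi].
def ddag_levels(n_classes):
    levels, frontier = [], [(1, n_classes)]
    while frontier[0][0] != frontier[0][1]:   # stop once the frontier is leaves
        levels.append(["%d|%d" % node for node in frontier])
        nxt = []
        for lo, hi in frontier:
            for child in ((lo, hi - 1), (lo + 1, hi)):
                if child not in nxt:          # shared subtrees kept only once
                    nxt.append(child)
        frontier = nxt
    return levels

for level in ddag_levels(5):
    print("  ".join(level))
# 1|5
# 1|4  2|5
# 1|3  2|4  3|5
# 1|2  2|3  3|4  4|5
```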

The class implements two different strategies to compute the probabilities for a given example:

‘full_probabilistic’

The probabilities computed by each node are propagated through the tree. For those leaves that can be reached following different paths (all except the leaves for the first and the last class), the probabilities are summed. With this method, all the classes may have a probability greater than 0.

Example: For a given example, the probability of the left class returned by each model is the following:

P(1|5) = 0.2
P(1|4) = 0.1
P(2|5) = 0.1
P(1|3) = 0.2
P(2|4) = 0.3
P(3|5) = 0.3
P(1|2) = 0.3
P(2|3) = 0.4
P(3|4) = 0.4
P(4|5) = 0.3

P(y=1) = P(1|5) * P(1|4) * P(1|3) * P(1|2) =
         0.2    * 0.1    * 0.2    * 0.3    = 0.0012

P(y=2) = P(1|5)     * P(1|4)     * P(1|3)     * (1-P(1|2)) +
         P(1|5)     * P(1|4)     * (1-P(1|3)) * P(2|3)     +
         P(1|5)     * (1-P(1|4)) * P(2|4)     * P(2|3)     +
         (1-P(1|5)) * P(2|5)     * P(2|4)     * P(2|3)     =
         0.2        * 0.1        * 0.2        * 0.7        +
         0.2        * 0.1        * 0.8        * 0.4        +
         0.2        * 0.9        * 0.3        * 0.4        +
         0.8        * 0.1        * 0.3        * 0.4        =
         0.0028 + 0.0064 + 0.0216 + 0.0096 = 0.0404

P(y=3) = P(1|5)     * P(1|4)     * (1-P(1|3)) * (1-P(2|3)) +
         P(1|5)     * (1-P(1|4)) * P(2|4)     * (1-P(2|3)) +
         P(1|5)     * (1-P(1|4)) * (1-P(2|4)) * P(3|4)     +
         (1-P(1|5)) * P(2|5)     * P(2|4)     * (1-P(2|3)) +
         (1-P(1|5)) * P(2|5)     * (1-P(2|4)) * P(3|4)     +
         (1-P(1|5)) * (1-P(2|5)) * P(3|5)     * P(3|4)     =
         0.2        * 0.1        * 0.8        * 0.6        +
         0.2        * 0.9        * 0.3        * 0.6        +
         0.2        * 0.9        * 0.7        * 0.4        +
         0.8        * 0.1        * 0.3        * 0.6        +
         0.8        * 0.1        * 0.7        * 0.4        +
         0.8        * 0.9        * 0.3        * 0.4        =
         0.0096 + 0.0324 + 0.0504 + 0.0144 + 0.0224 + 0.0864 = 0.2156

P(y=4) = P(1|5)     * (1-P(1|4)) * (1-P(2|4)) * (1-P(3|4)) +
         (1-P(1|5)) * P(2|5)     * (1-P(2|4)) * (1-P(3|4)) +
         (1-P(1|5)) * (1-P(2|5)) * P(3|5)     * (1-P(3|4)) +
         (1-P(1|5)) * (1-P(2|5)) * (1-P(3|5)) * P(4|5)     =
         0.2        * 0.9        * 0.7        * 0.6        +
         0.8        * 0.1        * 0.7        * 0.6        +
         0.8        * 0.9        * 0.3        * 0.6        +
         0.8        * 0.9        * 0.7        * 0.3        =
         0.0756 + 0.0336 + 0.1296 + 0.1512 = 0.3900

P(y=5) = (1-P(1|5)) * (1-P(2|5)) * (1-P(3|5)) * (1-P(4|5)) =
         0.8        * 0.9        * 0.7        * 0.7        = 0.3528

Thus, the probabilities returned by the `predict_proba` method would be:
        (0.0012, 0.0404, 0.2156, 0.3900, 0.3528)
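The computation above can be reproduced with a short recursive sketch (illustrative only; the function name and the P dictionary are ours, not part of the library): the probability mass entering a node lo|hi is split between its two children, and the mass reaching each leaf is accumulated.

```python
# Illustrative sketch of the 'full_probabilistic' rule (not library code).
# P[(lo, hi)] is the probability that binary model lo|hi assigns to the
# lower (left) class, taken from the worked example above.
P = {(1, 5): 0.2, (1, 4): 0.1, (2, 5): 0.1, (1, 3): 0.2, (2, 4): 0.3,
     (3, 5): 0.3, (1, 2): 0.3, (2, 3): 0.4, (3, 4): 0.4, (4, 5): 0.3}

def full_probabilistic(P, n_classes=5):
    """Propagate the node probabilities down the DDAG, summing the mass
    of every path that reaches each leaf."""
    probs = [0.0] * n_classes

    def descend(lo, hi, mass):
        if lo == hi:                        # leaf: accumulate this path's mass
            probs[lo - 1] += mass
            return
        p_left = P[(lo, hi)]                # node lo|hi splits the mass
        descend(lo, hi - 1, mass * p_left)          # lower classes survive
        descend(lo + 1, hi, mass * (1.0 - p_left))  # upper classes survive

    descend(1, n_classes, 1.0)
    return probs

print([round(p, 4) for p in full_probabilistic(P)])
# → [0.0012, 0.0404, 0.2156, 0.39, 0.3528]
```

Note that the five probabilities sum to 1, since every unit of mass entering the root ends up in exactly one leaf along each path.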
‘winner_node’

Uses the probabilities of the binary estimators to descend the tree until the level just above the leaves (as if those probabilities were binarized to 0 or 1). Then the method returns the probabilities of that binary estimator for the two consecutive classes involved, and zero for the rest of the classes.

Example: For a given example, the probability of the left class returned by each model is the following:

P(1|5) = 0.2
P(1|4) = 0.1
P(2|5) = 0.1
P(1|3) = 0.2
P(2|4) = 0.3
P(3|5) = 0.3
P(1|2) = 0.3
P(2|3) = 0.4
P(3|4) = 0.4
P(4|5) = 0.3

Taking binary decisions from the root, we reach the binary classifier 4|5, so
the returned probabilities are (0, 0, 0, 0.3, 0.7).
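The same descent can be sketched in a few lines (illustrative only; the function name and the P dictionary are ours): at each node the probability is binarized, one extreme class is discarded, and the loop stops at the node just above the leaves.

```python
# Illustrative sketch of the 'winner_node' rule (not library code).
# P[(lo, hi)] is the probability that binary model lo|hi assigns to the
# lower (left) class, as in the example above.
P = {(1, 5): 0.2, (1, 4): 0.1, (2, 5): 0.1, (1, 3): 0.2, (2, 4): 0.3,
     (3, 5): 0.3, (1, 2): 0.3, (2, 3): 0.4, (3, 4): 0.4, (4, 5): 0.3}

def winner_node(P, n_classes=5):
    lo, hi = 1, n_classes
    while hi - lo > 1:              # stop at the node just above the leaves
        if P[(lo, hi)] >= 0.5:
            hi -= 1                 # go left: discard the highest class
        else:
            lo += 1                 # go right: discard the lowest class
    probs = [0.0] * n_classes
    probs[lo - 1] = P[(lo, hi)]     # probabilities of the winning node lo|hi
    probs[hi - 1] = 1.0 - P[(lo, hi)]
    return probs

print(winner_node(P))
# → [0.0, 0.0, 0.0, 0.3, 0.7]
```

With the example values, the descent goes 1|5 → 2|5 → 3|5 and stops at 4|5, whose two probabilities fill the last two positions.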
Parameters:
  • estimator (estimator object (default=None)) – An estimator object implementing fit and one of predict or predict_proba. It is the base estimator used to learn the set of binary classifiers

  • n_jobs (int or None, optional (default=None)) – The number of jobs to use for the computation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors

  • predict_method (str, optional (default 'full_probabilistic')) –

    • ‘full_probabilistic’

    The probabilities computed by each node are propagated through the tree. For those leaves that can be reached following different paths (all except the leaves for the first and the last class), the probabilities are summed. With this method, all the classes may have a probability greater than 0.

    • ’winner_node’

    Uses the probabilities of the binary estimators to descend the tree until the level just above the leaves (as if those probabilities were binarized to 0 or 1). Then the method returns the probabilities of that binary estimator for the two consecutive classes involved, and zero for the rest of the classes.

  • verbose (int, optional, (default=0)) – The verbosity level. The default value, zero, means silent mode

estimator

An estimator object implementing fit and predict_proba.

Type:

estimator object

n_jobs

The number of jobs to use for the computation.

Type:

int or None

predict_method

The method used by predict_proba to compute the class probabilities of a given example

Type:

str

verbose

The verbosity level.

Type:

int

estimators_

Estimators used for predictions.

Type:

list of n_classes * (n_classes - 1) / 2 estimators

classes_

Array containing labels.

Type:

numpy array of shape [n_classes]

References

José Ramón Quevedo, Elena Montañés, Óscar Luaces, Juan José del Coz: Adapting decision DAGs for multipartite ranking. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2010), pp. 115-130. Springer, Berlin, Heidelberg.

predict(X)[source]

Predict the class for each testing example

The method computes the probability of each class (using predict_proba) and returns the class with the highest probability
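In other words, predict is the argmax of predict_proba over classes_. A minimal sketch using the probabilities from the worked example above (the array values are ours, for illustration):

```python
import numpy as np

# predict() simply picks, for each row of predict_proba(), the class in
# classes_ with the highest probability (values from the example above).
proba = np.array([[0.0012, 0.0404, 0.2156, 0.3900, 0.3528]])
classes_ = np.array([1, 2, 3, 4, 5])
predictions = classes_[np.argmax(proba, axis=1)]
print(predictions)
# → [4]
```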

Parameters:

X ((sparse) array-like, shape (n_examples, n_features)) – Data

Return type:

An array of shape (n_examples,) with the predicted class for each example

Raises:

NotFittedError – When the estimators are not fitted yet

predict_proba(X)[source]

Predict the class probabilities for each example

Two different methods are implemented depending on the value of predict_method attribute

‘full_probabilistic’

The probabilities computed by each node are propagated through the tree. For those leaves that can be reached following different paths (all except the leaves for the first and the last class), the probabilities are summed. With this method, all the classes may have a probability greater than 0.

‘winner_node’

Uses the probabilities of the binary estimators to descend the tree until the level just above the leaves (as if those probabilities were binarized to 0 or 1). Then the method returns the probabilities of that binary estimator for the two consecutive classes involved, and zero for the rest of the classes.

The method uses a recursive auxiliary function to compute the class probabilities

Parameters:

X ((sparse) array-like, shape (n_examples, n_features)) – Data

Return type:

An array of shape (n_examples, n_classes) with the class probabilities for each example

Raises:

NotFittedError – When the estimators are not fitted yet

set_partial_fit_request(*, classes='$UNCHANGED$')

Request metadata passed to the partial_fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to partial_fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to partial_fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • classes (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for classes parameter in partial_fit.

  • self (DDAGClassifier) –

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight='$UNCHANGED$')

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

  • self (DDAGClassifier) –

Returns:

self – The updated object.

Return type:

object