quantificationlib.estimators.ordinal_ddag module¶
Estimator based on DDAGs (Decision Directed Acyclic Graphs)
- class DDAGClassifier(estimator, n_jobs=None, predict_method='full_probabilistic', verbose=0)[source]¶
Bases:
OneVsOneClassifier
Decision Directed Acyclic Graph ordinal classifier
This strategy consists of learning a classifier for each pair of classes, so it requires fitting n_classes * (n_classes - 1) / 2 classifiers. For this reason, this class derives from OneVsOneClassifier and reuses most of its functionality, mainly to train the binary models.
However, there are two main differences with respect to OneVsOneClassifier:
This class is intended for ordinal classification; it does not make sense to use it for plain multiclass classification
The rule for making predictions is different. Here, the binary classifiers are arranged into a binary tree in which the classifier selected at each node is the one that deals with the two most distant remaining classes. Thus, the root contains the classifier that decides between the first and the last class of the order, and this idea is applied recursively.
Example:
In an ordinal classification problem with classes ranging from 1-star to 5-star, the corresponding DDAG is:

                        1|5
                1|4             2|5
            1|3     2|4     2|4     3|5
          1|2 2|3 2|3 3|4 2|3 3|4 3|4 4|5
          1 2 2 3 2 3 3 4 2 3 3 4 3 4 4 5

Since some subtrees are shared by different branches (for instance, the subtree rooted at node 2|4 is shared by the right subtree of 1|4 and the left subtree of 2|5), the tree can be depicted in a more compact way:

                1|5
             1|4   2|5
           1|3   2|4   3|5
         1|2   2|3   3|4   4|5
        1    2     3     4    5

in which all internal nodes (2|4, 2|3 and 3|4) and all leaves except the first and the last one (2, 3 and 4) are reached from different paths.
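The level-by-level layout above can be sketched with a small helper (hypothetical code for illustration, not the library's internal API):

```python
# Sketch: enumerate the DDAG levels for an ordinal problem with classes
# 1..n_classes. A node handling classes lo..hi uses the binary model lo|hi;
# its left child discards class hi, its right child discards class lo.
def ddag_levels(n_classes):
    levels = [[(1, n_classes)]]
    while levels[-1][0][1] - levels[-1][0][0] > 1:
        nxt = []
        for lo, hi in levels[-1]:
            nxt.append((lo, hi - 1))   # left child: class hi discarded
            nxt.append((lo + 1, hi))   # right child: class lo discarded
        nxt = sorted(set(nxt))         # shared subtrees kept once (compact form)
        levels.append(nxt)
    return levels

for level in ddag_levels(5):
    print(" ".join(f"{lo}|{hi}" for lo, hi in level))
# 1|5
# 1|4 2|5
# 1|3 2|4 3|5
# 1|2 2|3 3|4 4|5
```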
The class implements two different strategies to compute the probabilities for a given example:
- ‘full_probabilistic’
The probabilities computed by each node are propagated through the tree. For those leaves that can be reached following different paths (all except the leaves for the first and the last class), the probabilities are summed. With this method, all the classes may have a probability greater than 0.
Example: For a given example, the probability of the left class returned by each model is the following:
P(1|5) = 0.2   P(1|4) = 0.1   P(2|5) = 0.1   P(1|3) = 0.2   P(2|4) = 0.3
P(3|5) = 0.3   P(1|2) = 0.3   P(2|3) = 0.4   P(3|4) = 0.4   P(4|5) = 0.3

P(y=1) = P(1|5) * P(1|4) * P(1|3) * P(1|2)
       = 0.2 * 0.1 * 0.2 * 0.3 = 0.0012

P(y=2) = P(1|5) * P(1|4) * P(1|3) * (1-P(1|2))
       + P(1|5) * P(1|4) * (1-P(1|3)) * P(2|3)
       + P(1|5) * (1-P(1|4)) * P(2|4) * P(2|3)
       + (1-P(1|5)) * P(2|5) * P(2|4) * P(2|3)
       = 0.2 * 0.1 * 0.2 * 0.7 + 0.2 * 0.1 * 0.8 * 0.4
       + 0.2 * 0.9 * 0.3 * 0.4 + 0.8 * 0.1 * 0.3 * 0.4
       = 0.0028 + 0.0064 + 0.0216 + 0.0096 = 0.0404

P(y=3) = P(1|5) * P(1|4) * (1-P(1|3)) * (1-P(2|3))
       + P(1|5) * (1-P(1|4)) * P(2|4) * (1-P(2|3))
       + P(1|5) * (1-P(1|4)) * (1-P(2|4)) * P(3|4)
       + (1-P(1|5)) * P(2|5) * P(2|4) * (1-P(2|3))
       + (1-P(1|5)) * P(2|5) * (1-P(2|4)) * P(3|4)
       + (1-P(1|5)) * (1-P(2|5)) * P(3|5) * P(3|4)
       = 0.2 * 0.1 * 0.8 * 0.6 + 0.2 * 0.9 * 0.3 * 0.6
       + 0.2 * 0.9 * 0.7 * 0.4 + 0.8 * 0.1 * 0.3 * 0.6
       + 0.8 * 0.1 * 0.7 * 0.4 + 0.8 * 0.9 * 0.3 * 0.4
       = 0.0096 + 0.0324 + 0.0504 + 0.0144 + 0.0224 + 0.0864 = 0.2156

P(y=4) = P(1|5) * (1-P(1|4)) * (1-P(2|4)) * (1-P(3|4))
       + (1-P(1|5)) * P(2|5) * (1-P(2|4)) * (1-P(3|4))
       + (1-P(1|5)) * (1-P(2|5)) * P(3|5) * (1-P(3|4))
       + (1-P(1|5)) * (1-P(2|5)) * (1-P(3|5)) * P(4|5)
       = 0.2 * 0.9 * 0.7 * 0.6 + 0.8 * 0.1 * 0.7 * 0.6
       + 0.8 * 0.9 * 0.3 * 0.6 + 0.8 * 0.9 * 0.7 * 0.3
       = 0.0756 + 0.0336 + 0.1296 + 0.1512 = 0.3900

P(y=5) = (1-P(1|5)) * (1-P(2|5)) * (1-P(3|5)) * (1-P(4|5))
       = 0.8 * 0.9 * 0.7 * 0.7 = 0.3528

Thus, the probabilities returned by the predict_proba method would be (0.0012, 0.0404, 0.2156, 0.3900, 0.3528).
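The same sum over paths can be computed recursively: the distribution over classes lo..hi is the distribution of the left subtree weighted by P(lo|hi) plus that of the right subtree weighted by 1 - P(lo|hi). A minimal sketch with hypothetical helper names (not the library's internal implementation):

```python
# Sketch of the 'full_probabilistic' rule. p[(lo, hi)] holds the probability
# of the *left* class lo returned by the binary model lo|hi.
def full_probabilistic(p, lo, hi):
    """Return the list of class probabilities for classes lo..hi."""
    if lo == hi:
        return [1.0]
    left = full_probabilistic(p, lo, hi - 1)    # subtree without class hi
    right = full_probabilistic(p, lo + 1, hi)   # subtree without class lo
    probs = [0.0] * (hi - lo + 1)
    for i, q in enumerate(left):                # weight left subtree by P(lo|hi)
        probs[i] += p[(lo, hi)] * q
    for i, q in enumerate(right):               # weight right subtree by 1-P(lo|hi)
        probs[i + 1] += (1 - p[(lo, hi)]) * q
    return probs

# The binary-model outputs from the worked example above
p = {(1, 5): 0.2, (1, 4): 0.1, (2, 5): 0.1, (1, 3): 0.2, (2, 4): 0.3,
     (3, 5): 0.3, (1, 2): 0.3, (2, 3): 0.4, (3, 4): 0.4, (4, 5): 0.3}
print([round(q, 4) for q in full_probabilistic(p, 1, 5)])
# [0.0012, 0.0404, 0.2156, 0.39, 0.3528]
```

Note that summing the probabilities of shared leaves over all paths and propagating them level by level give the same result, because the propagation is linear.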
- ‘winner_node’
Uses the probabilities of the binary estimators to descend to the level just above the leaves (as if binarizing those probabilities to {0, 1}). Then, the method returns the probabilities of that binary estimator for the two consecutive classes involved, and zero for the rest of the classes.
Example: For a given example, the probability of the left class returned by each model is the following:
P(1|5) = 0.2   P(1|4) = 0.1   P(2|5) = 0.1   P(1|3) = 0.2   P(2|4) = 0.3
P(3|5) = 0.3   P(1|2) = 0.3   P(2|3) = 0.4   P(3|4) = 0.4   P(4|5) = 0.3

Taking binary decisions from the root (1|5 → 2|5 → 3|5, since each left-class probability is below 0.5), we reach the binary classifier 4|5; thus the returned probabilities are (0, 0, 0, 0.3, 0.7)
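The descent can be sketched as follows (hypothetical helper, not the library's internal implementation):

```python
# Sketch of the 'winner_node' rule. p[(lo, hi)] holds the probability of the
# left class lo returned by the binary model lo|hi.
def winner_node(p, n_classes):
    lo, hi = 1, n_classes
    while hi - lo > 1:            # descend taking hard binary decisions
        if p[(lo, hi)] >= 0.5:
            hi -= 1               # go left: discard class hi
        else:
            lo += 1               # go right: discard class lo
    probs = [0.0] * n_classes     # only two consecutive classes get mass
    probs[lo - 1] = p[(lo, hi)]
    probs[hi - 1] = 1 - p[(lo, hi)]
    return probs

p = {(1, 5): 0.2, (1, 4): 0.1, (2, 5): 0.1, (1, 3): 0.2, (2, 4): 0.3,
     (3, 5): 0.3, (1, 2): 0.3, (2, 3): 0.4, (3, 4): 0.4, (4, 5): 0.3}
print([round(q, 4) for q in winner_node(p, 5)])
# [0.0, 0.0, 0.0, 0.3, 0.7]
```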
- Parameters:
estimator (estimator object (default=None)) – An estimator object implementing fit and one of predict or predict_proba. It is the base estimator used to learn the set of binary classifiers
n_jobs (int or None, optional (default=None)) – The number of jobs to use for the computation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
predict_method (str, optional (default='full_probabilistic')) –
‘full_probabilistic’
The probabilities computed by each node are propagated through the tree. For those leaves that can be reached following different paths (all except the leaves for the first and the last class), the probabilities are summed. With this method, all the classes may have a probability greater than 0.
‘winner_node’
Uses the probabilities of the binary estimators to descend to the level just above the leaves (as if binarizing those probabilities to {0, 1}). Then, the method returns the probabilities of that binary estimator for the two consecutive classes involved, and zero for the rest of the classes.
verbose (int, optional (default=0)) – The verbosity level. The default value, zero, means silent mode.
- estimator¶
An estimator object implementing fit and predict_proba.
- Type:
estimator object
- n_jobs¶
The number of jobs to use for the computation.
- Type:
int or None
- predict_method¶
The method used by predict_proba to compute the class probabilities of a given example
- Type:
str
- verbose¶
The verbosity level.
- Type:
int
- estimators_¶
Estimators used for predictions.
- Type:
list of n_classes * (n_classes - 1) / 2 estimators
- classes_¶
Array containing labels.
- Type:
numpy array of shape [n_classes]
References
José Ramón Quevedo, Elena Montañés, Óscar Luaces, Juan José del Coz: Adapting decision DAGs for multipartite ranking. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases (pp. 115-130). Springer, Berlin, Heidelberg (2010).
- predict(X)[source]¶
Predict the class for each test example
The method computes the probability of each class (using predict_proba) and returns the class with the highest probability
- Parameters:
X ((sparse) array-like, shape (n_examples, n_features)) – Data
- Return type:
An array, shape(n_examples, ) with the predicted class for each example
- Raises:
NotFittedError – When the estimators are not fitted yet
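The prediction rule described above amounts to taking the argmax of the predict_proba output and mapping it back through classes_. A small self-contained sketch with made-up probabilities (not a call into the library itself):

```python
import numpy as np

# Hypothetical predict_proba output for two examples, 5 ordinal classes;
# the first row reuses the worked 'full_probabilistic' example.
proba = np.array([[0.0012, 0.0404, 0.2156, 0.3900, 0.3528],
                  [0.60,   0.25,   0.10,   0.04,   0.01]])
classes_ = np.array([1, 2, 3, 4, 5])   # labels as stored by the fitted model

# predict() = class with the highest probability for each example
pred = classes_[np.argmax(proba, axis=1)]
print(pred)
# [4 1]
```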
- predict_proba(X)[source]¶
Predict the class probabilities for each example
Two different methods are implemented depending on the value of predict_method attribute
- ‘full_probabilistic’
The probabilities computed by each node are propagated through the tree. For those leaves that can be reached following different paths (all except the leaves for the first and the last class), the probabilities are summed. With this method, all the classes may have a probability greater than 0.
- ‘winner_node’
Uses the probabilities of the binary estimators to descend to the level just above the leaves (as if binarizing those probabilities to {0, 1}). Then, the method returns the probabilities of that binary estimator for the two consecutive classes involved, and zero for the rest of the classes.
The method uses a recursive auxiliary method to compute the class probabilities
- Parameters:
X ((sparse) array-like, shape (n_examples, n_features)) – Data
- Return type:
An array, shape(n_examples, n_classes) with the class probabilities for each example
- Raises:
NotFittedError – When the estimators are not fitted yet
- set_partial_fit_request(*, classes='$UNCHANGED$')¶
Request metadata passed to the partial_fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to partial_fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to partial_fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
classes (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for classes parameter in partial_fit.
self (DDAGClassifier) –
- Returns:
self – The updated object.
- Return type:
object
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Request metadata passed to the score method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (DDAGClassifier) –
- Returns:
self – The updated object.
- Return type:
object