quantificationlib.estimators.frank_and_hall module¶
Estimators based on Frank and Hall decomposition
- class FHLabelBinarizer(neg_label=0, pos_label=1)[source]¶
Bases:
LabelBinarizer
Binarize labels in a Frank and Hall decomposition
This type of decomposition works as follows. For instance, in an ordinal classification problem with classes ranging from 1-star to 5-star, the Frank and Hall (FH) decomposition trains 4 binary classifiers, 1 vs 2-3-4-5, 1-2 vs 3-4-5, 1-2-3 vs 4-5, and 1-2-3-4 vs 5, and combines their predictions.
To train all these binary classifiers, one needs to convert the original ordinal labels to binary labels for each of the binary problems of the Frank and Hall decomposition. FHLabelBinarizer makes this process easy using the transform method.
- Parameters:
neg_label (int (default: 0)) – Value with which negative labels must be encoded.
pos_label (int (default: 1)) – Value with which positive labels must be encoded.
sparse_output (boolean (default: False)) – True if the returned array from transform is desired to be in sparse CSR format.
- set_inverse_transform_request(*, threshold='$UNCHANGED$')¶
Request metadata passed to the inverse_transform method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to inverse_transform if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to inverse_transform.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
threshold (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the threshold parameter in inverse_transform.
self (FHLabelBinarizer) –
- Returns:
self – The updated object.
- Return type:
object
- transform(y)[source]¶
Transform ordinal labels to the Frank and Hall binary labels
- Parameters:
y (array, (n_samples,)) – Class labels for a set of examples
- Returns:
y_bin_fh – Each column contains the binary labels for the consecutive binary problems of a Frank and Hall decomposition from left to right. For instance, in a 4-class problem, each column corresponds to the following problems:
1st column: 1 vs 2-3-4
2nd column: 1-2 vs 3-4
3rd column: 1-2-3 vs 4
4th column: (not really used)
- Return type:
array, (n_samples, n_classes)
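The sketch below reproduces the column layout described above with plain NumPy for a 4-class problem. It is an illustration only; FHLabelBinarizer itself inherits the usual fit/transform interface from LabelBinarizer, and the exact encoding depends on neg_label and pos_label.

import numpy as np

# Hand-built FH binary labels for a 4-class ordinal problem (illustrative values;
# assumes pos_label=1 for the left group and neg_label=0 for the right group).
y = np.array([1, 2, 3, 4, 2, 1])        # ordinal class labels
classes = np.unique(y)                  # [1, 2, 3, 4]

# Column k encodes the binary problem "first k+1 classes vs the rest":
# column 0 -> 1 vs 2-3-4, column 1 -> 1-2 vs 3-4, column 2 -> 1-2-3 vs 4,
# and the last column is constant (not really used).
y_bin_fh = np.column_stack([(y <= c).astype(int) for c in classes])
print(y_bin_fh)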
- class FrankAndHallClassifier(estimator=None, n_jobs=None, verbose=0, params_fit=None)[source]¶
Bases:
BaseEstimator
,ClassifierMixin
Ordinal Classifier following Frank and Hall binary decomposition
This type of decomposition works as follows. For instance, in an ordinal classification problem with classes ranging from 1-star to 5-star, the Frank and Hall (FH) decomposition trains 4 binary classifiers, 1 vs 2-3-4-5, 1-2 vs 3-4-5, 1-2-3 vs 4-5, and 1-2-3-4 vs 5, and combines their predictions.
- Parameters:
estimator (estimator object (default=None)) – An estimator object implementing fit and one of predict or predict_proba. It is the base estimator used to learn the set of binary classifiers
n_jobs (int or None, optional (default=None)) – The number of jobs to use for the computation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
params_fit (list of dictionaries with parameters for each binary estimator, optional) –
Example: 5 classes/4 binary estimators:
params_fit = [{'C': 0.0001}, {'C': 0.000001}, {'C': 0.000001}, {'C': 0.01}]
verbose (int, optional (default=0)) – The verbosity level. The default value, zero, means silent mode.
- estimator¶
The base estimator used to build the FH decomposition
- Type:
estimator object
- n_jobs¶
The number of jobs to use for the computation.
- Type:
int or None
- params_fit¶
It has the parameters for each binary estimator
- Type:
list of dictionaries
- verbose¶
The verbosity level. The default value, zero, means silent mode
- Type:
int
- classes_¶
Class labels
- Type:
ndarray, shape (n_classes, )
- estimators_¶
- List of binary estimators following the same order of the Frank and Hall decomposition:
estimators_[0] -> 1 vs 2-3-4-5, estimators_[1] -> 1-2 vs 3-4-5, …
- Type:
ndarray, shape(n_classes-1,)
- label_binarizer_¶
Object used to transform multiclass labels to binary labels and vice-versa
- Type:
FHLabelBinarizer object
References
Eibe Frank and Mark Hall. 2001. A simple approach to ordinal classification. In Proceedings of the European Conference on Machine Learning. Springer, 145–156.
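A minimal usage sketch of this classifier (the base estimator, the synthetic data, and the label range below are illustrative assumptions; any scikit-learn-style classifier implementing fit and predict_proba should work):

import numpy as np
from sklearn.linear_model import LogisticRegression
from quantificationlib.estimators.frank_and_hall import FrankAndHallClassifier

rng = np.random.RandomState(0)
X = rng.randn(500, 10)                  # synthetic data with 10 features
y = rng.randint(1, 6, size=500)         # ordinal labels 1..5 (5 classes)

fh = FrankAndHallClassifier(estimator=LogisticRegression())
fh.fit(X, y)                            # trains 4 binary models: 1 vs 2-3-4-5, 1-2 vs 3-4-5, ...
probs = fh.predict_proba(X[:3])         # shape (3, 5); rows need not sum to 1 (original FH rule)
preds = fh.predict(X[:3])               # class with the highest probability for each example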
- fit(X, y)[source]¶
Fits the set of estimators for the training set following the Frank and Hall decomposition
- It learns a list of binary estimators following the same order of the Frank and Hall decomposition:
estimators_[0] -> 1 vs 2-3-4-5, estimators_[1] -> 1-2 vs 3-4-5, …
The left group of each classifier ({1}, {1,2}, …) is the positive class
- Parameters:
X ((sparse) array-like, shape (n_examples, n_features)) – Data
y ((sparse) array-like, shape (n_examples, )) – True classes
- Raises:
ValueError – When estimator is None
- predict(X)[source]¶
Predict the class for each testing example
The method computes the probability of each class (using predict_proba) and returns the class with the highest probability
- Parameters:
X ((sparse) array-like, shape (n_examples, n_features)) – Data
- Return type:
An array, shape(n_examples, ) with the predicted class for each example
- Raises:
NotFittedError – When the estimators are not fitted yet
- predict_proba(X)[source]¶
Predict the class probabilities for each example following the original rule proposed by Frank & Hall
If the classes are c_1 to c_k:
Pr(y = c_1) = Pr(y <= c_1)
Pr(y = c_i) = Pr(y > c_{i-1}) x (1 - Pr(y > c_i)),  1 < i < k
Pr(y = c_k) = Pr(y > c_{k-1})
Notice that sum_{i=1}^{k} Pr(y = c_i) is not necessarily equal to 1.
Example with 5 classes:
We have 4 binary estimators that return two probabilities, the probability of the left group and the probability of the right group, denoted as e_i.left and e_i.right respectively, where i is the index of the estimator (1 <= i < k):
Estimator 1: c1 | c2, c3, c4, c5 -> e1.left | e1.right
Estimator 2: c1, c2 | c3, c4, c5 -> e2.left | e2.right
Estimator 3: c1, c2, c3 | c4, c5 -> e3.left | e3.right
Estimator 4: c1, c2, c3, c4 | c5 -> e4.left | e4.right
Pr(y = c_1) = e1.left
Pr(y = c_2) = e1.right x e2.left
Pr(y = c_3) = e2.right x e3.left
Pr(y = c_4) = e3.right x e4.left
Pr(y = c_5) = e4.right
- Parameters:
X ((sparse) array-like, shape (n_examples, n_features)) – Data
- Return type:
An array, shape(n_examples, n_classes) with the class probabilities for each example
- Raises:
NotFittedError – When the estimators are not fitted yet
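The sketch below reproduces this rule outside the library, using hypothetical left-group probabilities for the 4 binary estimators of a 5-class problem (illustration only, not the library's code):

import numpy as np

# Hypothetical left-group probabilities e_i.left of the 4 binary estimators
# for a single example; e_i.right = 1 - e_i.left.
left = np.array([0.1, 0.3, 0.7, 0.9])    # e1.left, e2.left, e3.left, e4.left
right = 1.0 - left                       # e1.right, e2.right, e3.right, e4.right

probs = np.empty(5)
probs[0] = left[0]                       # Pr(y = c_1) = e1.left
for i in range(1, 4):
    probs[i] = right[i - 1] * left[i]    # Pr(y = c_{i+1}) = e_i.right x e_{i+1}.left (1-based names)
probs[4] = right[3]                      # Pr(y = c_5) = e4.right
print(probs, probs.sum())                # note that the sum is not necessarily 1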
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Request metadata passed to the score method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in score.
self (FrankAndHallClassifier) –
- Returns:
self – The updated object.
- Return type:
object
- class FrankAndHallMonotoneClassifier(estimator=None, n_jobs=None, verbose=0, params_fit=None)[source]¶
Bases:
FrankAndHallClassifier
Ordinal Classifier following Frank and Hall binary decomposition but returning consistent probabilities
This type of decomposition works as follows. For instance, in an ordinal classification problem with classes ranging from 1-star to 5-star, the Frank and Hall (FH) decomposition trains 4 binary classifiers, 1 vs 2-3-4-5, 1-2 vs 3-4-5, 1-2-3 vs 4-5, and 1-2-3-4 vs 5, and combines their predictions.
The difference with FrankAndHallClassifier is that the original method devised by Frank & Hall was intended just for crisp predictions. The computed probabilities for all classes may not be consistent (their sum is often not 1).
Following (Destercke and Yang, 2014), this class computes the upper (adjusting from left to right) and the lower (adjusting from right to left) cumulative probabilities for each group of classes. These sets of values are monotonically increasing (from left to right) and monotonically decreasing (from right to left), respectively. The final probability assigned to each group is the average of both values, and the probability of each class is computed as:
Pr({y_k}) = Pr({y_1,…,y_k}) - Pr({y_1,…,y_k-1})
- Parameters:
estimator (estimator object (default=None)) – An estimator object implementing fit and one of predict or predict_proba. It is the base estimator used to learn the set of binary classifiers
n_jobs (int or None, optional (default=None)) – The number of jobs to use for the computation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
params_fit (list of dictionaries with parameters for each binary estimator, optional) –
Example: 5 classes/4 binary estimators:
params_fit = [{'C': 0.0001}, {'C': 0.000001}, {'C': 0.000001}, {'C': 0.01}]
verbose (int, optional (default=0)) – The verbosity level. The default value, zero, means silent mode.
- estimator¶
The base estimator used to build the FH decomposition
- Type:
estimator object
- n_jobs¶
The number of jobs to use for the computation.
- Type:
int or None
- verbose¶
The verbosity level. The default value, zero, means silent mode
- Type:
int
- params_fit¶
It has the parameters for each binary estimator (not used in this class)
- Type:
list of dictionaries
- classes_¶
Class labels
- Type:
ndarray, shape (n_classes, )
- estimators_¶
- List of binary estimators following the same order of the Frank and Hall decomposition:
estimators_[0] -> 1 vs 2-3-4-5, estimators_[1] -> 1-2 vs 3-4-5, …
- Type:
ndarray, shape(n_classes-1,)
- label_binarizer_¶
Object used to transform multiclass labels to binary labels and vice-versa
- Type:
FHLabelBinarizer object
References
Sébastien Destercke and Gen Yang. 2014. Cautious ordinal classification by binary decomposition. In Machine Learning and Knowledge Discovery in Databases - European Conference ECML/PKDD, Nancy, France. 323–337.
- fit(X, y)[source]¶
Fits the set of estimators for the training set following the Frank and Hall decomposition
- It learns a list of binary estimators following the same order of the Frank and Hall decomposition:
estimators_[0] -> 1 vs 2-3-4-5, estimators_[1] -> 1-2 vs 3-4-5, …
The left group of each classifier ({1}, {1,2}, …) is the positive class
- Parameters:
X ((sparse) array-like, shape (n_examples, n_features)) – Data
y ((sparse) array-like, shape (n_examples, )) – True classes
- Raises:
ValueError – When estimator is None
- predict(X)[source]¶
Predict the class for each testing example
The method computes the probability of each class (using predict_proba) and returns the class with the highest probability
- Parameters:
X ((sparse) array-like, shape (n_examples, n_features)) – Data
- Return type:
An array, shape(n_examples, ) with the predicted class for each example
- Raises:
NotFittedError – When the estimators are not fitted yet
- predict_proba(X)[source]¶
Predict the class probabilities for each example following a new rule (different from the original one proposed by Frank & Hall)
To obtain consistent probabilities, we need to ensure that the aggregated consecutive probabilities do not decrease.
Example:
Classifier 1 vs 2-3-4:    Pr({1}) = 0.3
Classifier 1-2 vs 3-4:    Pr({1,2}) = 0.2
Classifier 1-2-3 vs 4:    Pr({1,2,3}) = 0.6
This is inconsistent. Following (Destercke and Yang, 2014) the method computes the upper (adjusting from left to right) and the lower (from right to left) cumulative probabilities. These sets of values are monotonically increasing (from left to right) and monotonically decreasing (from right to left), respectively. The average value is assigned to each group and the probability for each class is computed as:
Pr({y_k}) = Pr({y_1,…,y_k}) - Pr({y_1,…,y_k-1})
Example:
        {1}    {1-2}   {1-2-3}
        0.3    0.3     0.6      Upper cumulative probabilities (adjusting from left to right)
        0.2    0.2     0.6      Lower cumulative probabilities (adjusting from right to left)
        -----------------------
        0.25   0.25    0.6      Averaged probability

Pr({1}) = 0.25
Pr({2}) = Pr({1,2}) - Pr({1}) = 0.25 - 0.25 = 0
Pr({3}) = Pr({1,2,3}) - Pr({1,2}) = 0.6 - 0.25 = 0.35
The last class is computed as 1 minus the sum of the probabilities of the rest of the classes:
Pr({4}) = 1 - Pr({1,2,3}) = 1 - 0.6 = 0.4
- Parameters:
X ((sparse) array-like, shape (n_examples, n_features)) – Data
- Return type:
An array, shape(n_examples, n_classes) with the class probabilities for each example
- Raises:
NotFittedError – When the estimators are not fitted yet
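A small sketch of one plausible reading of this adjustment, a running maximum from the left, a running minimum from the right, and their average (the library's internal implementation may differ in details); the input reproduces the worked example above:

import numpy as np

def monotone_class_probs(cum_left):
    """Turn possibly inconsistent left-group probabilities Pr({1..k}), k = 1..K-1,
    into K consistent class probabilities (sketch of the averaging idea above)."""
    upper = np.maximum.accumulate(cum_left)               # adjust from left to right
    lower = np.minimum.accumulate(cum_left[::-1])[::-1]   # adjust from right to left
    avg = (upper + lower) / 2.0                           # averaged cumulative probabilities
    cum = np.concatenate(([0.0], avg, [1.0]))             # the full group {1..K} has probability 1
    return np.diff(cum)                                   # Pr({y_k}) = Pr({y_1..y_k}) - Pr({y_1..y_k-1})

print(monotone_class_probs(np.array([0.3, 0.2, 0.6])))    # -> [0.25  0.    0.35  0.4 ]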
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Request metadata passed to the score method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in score.
self (FrankAndHallMonotoneClassifier) –
- Returns:
self – The updated object.
- Return type:
object
- class FrankAndHallTreeClassifier(estimator=None, n_jobs=None, verbose=0, performance_measure=<function binary_kld>, params_fit=None)[source]¶
Bases:
FrankAndHallClassifier
Ordinal Classifier following Frank and Hall binary decomposition but organizing the binary models in a tree to compute the predictions
This type of decomposition works as follows. For instance, in an ordinal classification problem with classes ranging from 1-star to 5-star, the Frank and Hall (FH) decomposition trains 4 binary classifiers, 1 vs 2-3-4-5, 1-2 vs 3-4-5, 1-2-3 vs 4-5, and 1-2-3-4 vs 5, and combines their predictions.
The difference with FrankAndHallClassifier is that the original method devised by Frank & Hall computes the probability of each class applying the binary models from left to right: 1 vs 2-3-4-5, 1-2 vs 3-4-5, and so on. This classifier is based on the method proposed by (Da San Martino, Gao and Sebastiani, 2016). The idea is to build a binary tree with the binary models of the Frank and Hall decomposition, selecting at each node of the tree the best possible model according to its quantification performance (applying the PCC algorithm with each binary classifier and using KLD as the performance measure).
Example:
                    1-2-3 vs 4-5
                   /            \
        1 vs 2-3-4-5            1-2-3-4 vs 5
         /        \               /       \
        1     1-2 vs 3-4-5       4         5
                /     \
               2       3
- Parameters:
estimator (estimator object (default=None)) – An estimator object implementing fit and one of predict or predict_proba. It is the base estimator used to learn the set of binary classifiers
n_jobs (int or None, optional (default=None)) – The number of jobs to use for the computation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
performance_measure (a binary quantification performance measure (default=binary_kld)) – The binary quantification performance measure used to estimate the goodness of each binary classifier used as a quantifier.
params_fit (list of dictionaries with parameters for each binary estimator, optional) –
Example: 5 classes/4 binary estimators:
params_fit = [{'C': 0.0001}, {'C': 0.000001}, {'C': 0.000001}, {'C': 0.01}]
verbose (int, optional (default=0)) – The verbosity level. The default value, zero, means silent mode.
- estimator¶
The base estimator used to build the FH decomposition
- Type:
estimator object
- n_jobs¶
The number of jobs to use for the computation.
- Type:
int or None
- performance_measure¶
The binary quantification performance measure used to estimate the goodness of each binary classifier used as a quantifier
- Type:
str, or any binary quantification performance measure
- verbose¶
The verbosity level. The default value, zero, means silent mode
- Type:
int
- params_fit¶
It has the parameters for each binary estimator
- Type:
list of dictionaries
- classes_¶
Class labels
- Type:
ndarray, shape (n_classes, )
- estimators_¶
- List of binary estimators following the same order of the Frank and Hall decomposition:
estimators_[0] -> 1 vs 2-3-4-5, estimators_[1] -> 1-2 vs 3-4-5, …
- Type:
ndarray, shape(n_classes-1,)
- label_binarizer_¶
Object used to transform multiclass labels to binary labels and vice-versa
- Type:
FHLabelBinarizer object
- tree_¶
A tree with the binary classifiers ordered by their quantification performance (using KLD or other measure)
- Type:
A tree
References
Giovanni Da San Martino, Wei Gao, and Fabrizio Sebastiani. 2016a. Ordinal text quantification. In Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval. 937–940.
Giovanni Da San Martino, Wei Gao, and Fabrizio Sebastiani. 2016b. QCRI at SemEval-2016 Task 4: Probabilistic methods for binary and ordinal quantification. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval'16). Association for Computational Linguistics, 58–63.
- fit(X, y)[source]¶
Fits the set of estimators for the training set following the Frank and Hall decomposition and builds the binary tree to organize such estimators
- Parameters:
X ((sparse) array-like, shape (n_examples, n_features)) – Data
y ((sparse) array-like, shape (n_examples, )) – True classes
- Raises:
ValueError – When estimator is None
- predict(X)[source]¶
Predict the class for each testing example
The method computes the probability of each class (using predict_proba) and returns the class with the highest probability
- Parameters:
X ((sparse) array-like, shape (n_examples, n_features)) – Data
- Return type:
An array, shape(n_examples, ) with the predicted class for each example
- Raises:
NotFittedError – When the estimators are not fitted yet
- predict_proba(X)[source]¶
Predict the class probabilities for each example applying the binary tree of models
Example:
                    1-2-3 vs 4-5
                   /            \
        1 vs 2-3-4-5            1-2-3-4 vs 5
         /        \               /       \
        1     1-2 vs 3-4-5       4         5
                /     \
               2       3

Imagine that, for a given example, the probabilities returned by each model are the following (each model returns the probability of its left group of classes):
Pr({1,2,3}) = 0.2
Pr({1}) = 0.9
Pr({1,2,3,4}) = 0.7
Pr({1,2}) = 0.4
With these values, the probability of each class is:
Pr({1}) = Pr({1,2,3}) * Pr({1}) = 0.2 * 0.9 = 0.18
Pr({2}) = Pr({1,2,3}) * (1 - Pr({1})) * Pr({1,2}) = 0.2 * 0.1 * 0.4 = 0.008
Pr({3}) = Pr({1,2,3}) * (1 - Pr({1})) * (1 - Pr({1,2})) = 0.2 * 0.1 * 0.6 = 0.012
Pr({4}) = (1 - Pr({1,2,3})) * Pr({1,2,3,4}) = 0.8 * 0.7 = 0.56
Pr({5}) = (1 - Pr({1,2,3})) * (1 - Pr({1,2,3,4})) = 0.8 * 0.3 = 0.24
- Parameters:
X ((sparse) array-like, shape (n_examples, n_features)) – Data
- Return type:
An array, shape(n_examples, n_classes) with the class probabilities for each example
- Raises:
NotFittedError – When the estimators are not fitted yet
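The example above can be checked by hand with plain Python; the probability values are the hypothetical ones from the example, not library output:

# Left-group probabilities reported by each binary model for one example
p_123  = 0.2    # root:            1-2-3 vs 4-5
p_1    = 0.9    # left child:      1 vs 2-3-4-5
p_12   = 0.4    # its right child: 1-2 vs 3-4-5
p_1234 = 0.7    # right child:     1-2-3-4 vs 5

probs = {
    1: p_123 * p_1,                       # 0.2 * 0.9       = 0.18
    2: p_123 * (1 - p_1) * p_12,          # 0.2 * 0.1 * 0.4 = 0.008
    3: p_123 * (1 - p_1) * (1 - p_12),    # 0.2 * 0.1 * 0.6 = 0.012
    4: (1 - p_123) * p_1234,              # 0.8 * 0.7       = 0.56
    5: (1 - p_123) * (1 - p_1234),        # 0.8 * 0.3       = 0.24
}
assert abs(sum(probs.values()) - 1.0) < 1e-9   # the tree rule always yields probabilities summing to 1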
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Request metadata passed to the score method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in score.
self (FrankAndHallTreeClassifier) –
- Returns:
self – The updated object.
- Return type:
object
- class QTree(fhtree=None, pos_estimator=0, left=None, right=None)[source]¶
Bases:
object
Auxiliary class to represent the binary trees needed by FrankAndHallTreeClassifier
- Parameters:
fhtree (FrankAndHallTreeClassifier object (default=None)) –
pos_estimator (int, (default=0)) – Index of the estimator in the order defined by the Frank and Hall decomposition: 1 vs 2-3-4-5, 1-2 vs 3-4-5 and so on.
left (a QTree object (default=None)) – Left subtree of this node
right (a QTree object (default=None)) – Right subtree of this node