rank
Methods to rank the severity of label issues in multilabel classification datasets. Here each example can belong to one or more classes, or to none of the classes at all. Unlike in standard multiclass classification, model-predicted class probabilities need not sum to 1 for each row in multilabel classification.
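As a small illustration of this data layout (a sketch with made-up values, using plain Python lists in place of arrays), each class is predicted independently, so row sums are unconstrained, and an example may carry an empty label list. The helper `to_indicator` is a hypothetical name introduced here for illustration and is not part of cleanlab.

```python
# Made-up multilabel data: 3 examples, K = 3 classes.
labels = [[1], [0, 2], []]  # the last example belongs to no class
pred_probs = [
    [0.1, 0.9, 0.1],
    [0.4, 0.1, 0.9],
    [0.2, 0.1, 0.05],
]


def to_indicator(labels, num_classes):
    """Hypothetical helper: turn lists of class indices into 0/1 rows."""
    return [
        [1 if k in set(row) else 0 for k in range(num_classes)]
        for row in labels
    ]


indicator = to_indicator(labels, num_classes=3)
row_sums = [sum(row) for row in pred_probs]

print(indicator)  # one 0/1 row per example
print(row_sums)   # none of these rows is required to sum to 1
```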
Functions:

get_label_quality_scores – Computes a label quality score for each example in a multilabel classification dataset.

get_label_quality_scores_per_class – Computes a quality score quantifying how likely each individual class annotation is correct in a multilabel classification dataset.
cleanlab.multilabel_classification.rank.get_label_quality_scores(labels, pred_probs, *, method='self_confidence', adjust_pred_probs=False, aggregator_kwargs={'alpha': 0.8, 'method': 'exponential_moving_average'})
Computes a label quality score for each example in a multilabel classification dataset.
Scores are between 0 and 1, with lower scores indicating examples whose label more likely contains an error. For each example, this method internally computes a separate score for each individual class and then aggregates these per-class scores into an overall label quality score for the example.
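The two-stage computation can be sketched in plain Python. This is an illustration under stated assumptions, not cleanlab's actual implementation: it assumes that the self-confidence score of a class annotation is the predicted probability of that class when the class is in the given label and one minus that probability otherwise, and that the exponential moving average visits the per-class scores from highest to lowest so that the lowest score receives the largest weight. The names `per_class_self_confidence` and `ema_aggregate` are hypothetical.

```python
def per_class_self_confidence(label, probs):
    """One-vs-rest score per class: p if the class is annotated, else 1 - p (assumed definition)."""
    present = set(label)
    return [p if k in present else 1.0 - p for k, p in enumerate(probs)]


def ema_aggregate(scores, alpha=0.8):
    """Exponential moving average over scores sorted from highest to lowest (assumed ordering)."""
    ordered = sorted(scores, reverse=True)
    s = ordered[0]
    for x in ordered[1:]:
        s = alpha * x + (1.0 - alpha) * s
    return s


labels = [[1], [0, 2]]
pred_probs = [[0.1, 0.9, 0.1], [0.4, 0.1, 0.9]]

# Stage 1: one score per class annotation; stage 2: one score per example.
per_class = [per_class_self_confidence(l, p) for l, p in zip(labels, pred_probs)]
overall = [ema_aggregate(row) for row in per_class]
print([round(s, 3) for s in overall])  # [0.9, 0.5]
```

Under these assumptions the sketch reproduces the scores shown in the documented example for this function (0.9 and 0.5).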
Parameters:
labels (List[List[int]]) – List of noisy labels for multilabel classification where each example can belong to multiple classes. Refer to documentation for this argument in multilabel_classification.filter.find_label_issues for further details.
pred_probs (np.ndarray) – An array of shape (N, K) of model-predicted class probabilities. Refer to documentation for this argument in multilabel_classification.filter.find_label_issues for further details.
method ({"self_confidence", "normalized_margin", "confidence_weighted_entropy"}, default = "self_confidence") – Method to calculate separate per-class annotation scores for an example that are then aggregated into an overall label quality score for the example. These scores are separately calculated for each class based on the corresponding column of pred_probs in a one-vs-rest manner, and are standard label quality scores for binary classification (based on whether the class should or should not apply to this example).
See also: rank.get_label_quality_scores function for details about each option.
adjust_pred_probs (bool, default = False) – Account for class imbalance in the label-quality scoring by adjusting predicted probabilities. Refer to documentation for this argument in rank.get_label_quality_scores for details.
aggregator_kwargs (dict, default = {"method": "exponential_moving_average", "alpha": 0.8}) – A dictionary of hyperparameter values to use when aggregating per-class scores into an overall label quality score for each example. Options for "method" include: "exponential_moving_average" or "softmin" or your own callable function. See internal.multilabel_scorer.Aggregator for details about each option and other possible hyperparameters.
To get a score for each class annotation for each example, use the cleanlab.multilabel_classification.rank.get_label_quality_scores_per_class function instead.
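To give a feel for how the choice of aggregation changes the overall score, the following sketch implements one common form of soft-min pooling: a weighted average of the per-class scores whose weights are a softmax of the negated scores. This is an assumption made for illustration; the function name `softmin_pool`, the `temperature` parameter, and its value are hypothetical, and the actual behavior and hyperparameters are those documented in internal.multilabel_scorer.Aggregator.

```python
import math


def softmin_pool(scores, temperature=0.1):
    """Weighted average of scores with weights proportional to exp(-score / temperature)."""
    weights = [math.exp(-s / temperature) for s in scores]
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, scores)) / total


per_class_scores = [0.9, 0.9, 0.4]  # made-up per-class scores for one example
pooled = softmin_pool(per_class_scores)
mean = sum(per_class_scores) / len(per_class_scores)

# The pooled value lies between the lowest score and the plain mean.
print(min(per_class_scores) <= pooled <= mean)
```

In this formulation a small temperature pulls the result toward the lowest per-class score, while a large temperature approaches the plain mean.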
Return type:
ndarray[Any, dtype[floating[TypeVar(T, bound=NBitBase)]]]
Returns:
label_quality_scores (np.ndarray) – A 1D array of shape (N,) with a label quality score (between 0 and 1) for each example in the dataset. Lower scores indicate examples whose label is more likely to contain some annotation error (for any of the classes).
Examples
>>> from cleanlab.multilabel_classification import get_label_quality_scores
>>> import numpy as np
>>> labels = [[1], [0,2]]
>>> pred_probs = np.array([[0.1, 0.9, 0.1], [0.4, 0.1, 0.9]])
>>> scores = get_label_quality_scores(labels, pred_probs)
>>> scores
array([0.9, 0.5])
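Since lower scores indicate likelier label errors, a typical next step is to sort examples by ascending score and review the lowest-scoring ones first. A minimal sketch with made-up scores in plain Python follows; with a NumPy array of scores, `np.argsort(scores)` likewise returns indices in ascending score order.

```python
scores = [0.9, 0.5, 0.75, 0.2]  # made-up label quality scores, one per example

# Indices of examples ordered from most to least likely mislabeled.
ranked = sorted(range(len(scores)), key=lambda i: scores[i])
print(ranked)  # [3, 1, 2, 0]
```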
cleanlab.multilabel_classification.rank.get_label_quality_scores_per_class(labels, pred_probs, *, method='self_confidence', adjust_pred_probs=False)
Computes a quality score quantifying how likely each individual class annotation is correct in a multilabel classification dataset. This is similar to cleanlab.multilabel_classification.rank.get_label_quality_scores but instead returns the per-class results without aggregation. For a dataset with K classes, each example receives K scores from this method. Refer to documentation in cleanlab.multilabel_classification.rank.get_label_quality_scores for details.
Parameters:
labels (List[List[int]]) – List of noisy labels for multilabel classification where each example can belong to multiple classes. Refer to documentation for this argument in find_label_issues for further details.
pred_probs (np.ndarray) – An array of shape (N, K) of model-predicted class probabilities. Refer to documentation for this argument in find_label_issues for further details.
method ({"self_confidence", "normalized_margin", "confidence_weighted_entropy"}, default = "self_confidence") – Method to calculate separate per-class annotation scores (that quantify how likely a particular class annotation is correct for a particular example). Refer to documentation for this argument in cleanlab.multilabel_classification.rank.get_label_quality_scores for further details.
adjust_pred_probs (bool, default = False) – Account for class imbalance in the label-quality scoring by adjusting predicted probabilities. Refer to documentation for this argument in rank.get_label_quality_scores for details.
Return type:
ndarray
Returns:
label_quality_scores (list(np.ndarray)) – A list containing K arrays, each of shape (N,). Here K is the number of classes in the dataset and N is the number of examples. label_quality_scores[k][i] is a score between 0 and 1 quantifying how likely the annotation for class k is correct for example i.
Examples
>>> from cleanlab.multilabel_classification.rank import get_label_quality_scores_per_class
>>> import numpy as np
>>> labels = [[1], [0,2]]
>>> pred_probs = np.array([[0.1, 0.9, 0.1], [0.4, 0.1, 0.9]])
>>> scores = get_label_quality_scores_per_class(labels, pred_probs)
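Assuming the return layout described in the Returns entry, where `label_quality_scores[k][i]` is the score for class `k` and example `i`, the least trustworthy class annotation for each example can be located as sketched here. The values are made up, and plain lists stand in for arrays.

```python
# K = 3 classes, N = 2 examples; label_quality_scores[k][i] is the score
# for class k and example i (made-up values).
label_quality_scores = [
    [0.9, 0.4],  # class 0
    [0.3, 0.9],  # class 1
    [0.8, 0.9],  # class 2
]

num_classes = len(label_quality_scores)
num_examples = len(label_quality_scores[0])

# For each example, the class whose annotation has the lowest score.
worst_class = [
    min(range(num_classes), key=lambda k: label_quality_scores[k][i])
    for i in range(num_examples)
]
print(worst_class)  # [1, 0]
```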