rank

Methods to rank the severity of label issues in multi-label classification datasets. Here each example can belong to one or more classes, or none of the classes at all. Unlike in standard multi-class classification, model-predicted class probabilities need not sum to 1 for each row in multi-label classification.
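The data layout described above can be illustrated with a small NumPy sketch. The helper `to_multi_hot` and the logits are hypothetical and are not part of cleanlab. The sketch assumes the model applies an independent sigmoid to each class, which is one common way to obtain multi-label probabilities whose rows do not sum to 1.

```python
import numpy as np

def to_multi_hot(labels, num_classes):
    """Convert a list of class-index lists into an (N, K) 0/1 indicator matrix."""
    multi_hot = np.zeros((len(labels), num_classes), dtype=int)
    for i, classes in enumerate(labels):
        # An empty list (example belongs to no class) leaves the row all zeros.
        multi_hot[i, np.asarray(classes, dtype=int)] = 1
    return multi_hot

# Three examples: one class, two classes, and no class at all.
labels = [[1], [0, 2], []]
multi_hot = to_multi_hot(labels, num_classes=3)

# Hypothetical model logits, turned into probabilities with a per-class sigmoid.
logits = np.array([[-2.0, 2.0, -2.0], [0.0, -2.0, 2.0], [-1.0, -1.0, -1.0]])
pred_probs = 1.0 / (1.0 + np.exp(-logits))

# Each column is a separate yes/no prediction, so rows generally do not sum to 1.
row_sums = pred_probs.sum(axis=1)
print(multi_hot)
print(row_sums)
```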

Functions:

get_label_quality_scores(labels, pred_probs, *)

Computes a label quality score for each example in a multi-label classification dataset.

get_label_quality_scores_per_class(labels, ...)

Computes a quality score quantifying how likely each individual class annotation is correct in a multi-label classification dataset.

cleanlab.multilabel_classification.rank.get_label_quality_scores(labels, pred_probs, *, method='self_confidence', adjust_pred_probs=False, aggregator_kwargs={'alpha': 0.8, 'method': 'exponential_moving_average'})

Computes a label quality score for each example in a multi-label classification dataset.

Scores are between 0 and 1, with lower scores indicating examples whose label is more likely to contain an error. For each example, this method internally computes a separate score for each individual class and then aggregates these per-class scores into an overall label quality score for the example.
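The two-step procedure above can be sketched in plain NumPy. This is an illustrative approximation, not the library's implementation. It assumes that the "self_confidence" score for a class is the probability the model assigns to the given annotation (p if the class is annotated as present, 1 - p if absent), and that the exponential moving average is taken over the per-class scores ordered from highest to lowest. The helper names `per_class_self_confidence` and `ema_aggregate` are hypothetical.

```python
import numpy as np

def per_class_self_confidence(labels, pred_probs):
    """One-vs-rest score per class: probability assigned to the given annotation."""
    n, k = pred_probs.shape
    given = np.zeros((n, k), dtype=bool)
    for i, classes in enumerate(labels):
        given[i, np.asarray(classes, dtype=int)] = True
    # Class annotated as present: p. Class annotated as absent: 1 - p.
    return np.where(given, pred_probs, 1.0 - pred_probs)

def ema_aggregate(class_scores, alpha=0.8):
    """Fold per-class scores from highest to lowest so the lowest scores weigh most."""
    ordered = np.sort(class_scores, axis=1)[:, ::-1]  # descending within each row
    agg = ordered[:, 0]
    for j in range(1, ordered.shape[1]):
        agg = alpha * ordered[:, j] + (1.0 - alpha) * agg
    return agg

labels = [[1], [0, 2]]
pred_probs = np.array([[0.1, 0.9, 0.1], [0.4, 0.1, 0.9]])

class_scores = per_class_self_confidence(labels, pred_probs)  # shape (N, K)
example_scores = ema_aggregate(class_scores)                  # shape (N,)
print(np.round(example_scores, 3))
```

Under these assumptions, the sketch yields 0.9 and 0.5 (up to floating-point rounding) for the two examples, which agrees with the doctest shown in the Examples section of this function.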

Parameters:
  • labels (List[List[int]]) – List of noisy labels for multi-label classification where each example can belong to multiple classes. Refer to documentation for this argument in multilabel_classification.filter.find_label_issues for further details.

  • pred_probs (np.ndarray) – An array of shape (N, K) of model-predicted class probabilities. Refer to documentation for this argument in multilabel_classification.filter.find_label_issues for further details.

  • method ({"self_confidence", "normalized_margin", "confidence_weighted_entropy"}, default = "self_confidence") –

    Method to calculate separate per-class annotation scores for an example that are then aggregated into an overall label quality score for the example. These scores are separately calculated for each class based on the corresponding column of pred_probs in a one-vs-rest manner, and are standard label quality scores for binary classification (based on whether the class should or should not apply to this example).

    See also

    cleanlab.rank.get_label_quality_scores for details about each option.

  • adjust_pred_probs (bool, default = False) – Account for class imbalance in the label-quality scoring by adjusting predicted probabilities. Refer to documentation for this argument in cleanlab.rank.get_label_quality_scores for details.

  • aggregator_kwargs (dict, default = {"method": "exponential_moving_average", "alpha": 0.8}) –

    A dictionary of hyperparameter values to use when aggregating per-class scores into an overall label quality score for each example. Options for "method" include: "exponential_moving_average" or "softmin" or your own callable function. See internal.multilabel_scorer.Aggregator for details about each option and other possible hyperparameters.

    To get a score for each class annotation for each example, use cleanlab.multilabel_classification.rank.get_label_quality_scores_per_class instead.
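To give some intuition for the "softmin" option, here is a minimal NumPy sketch of one common way to define a soft minimum: a weighted average of the per-class scores in which lower scores receive exponentially larger weights. The function name `softmin_aggregate` and the `temperature` hyperparameter are assumptions made for illustration. The library's own definition and hyperparameter names may differ, so consult internal.multilabel_scorer.Aggregator before relying on it.

```python
import numpy as np

def softmin_aggregate(class_scores, temperature=0.1):
    """Weighted average of per-class scores with weights favoring the lowest scores."""
    logits = (1.0 - class_scores) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # stabilize the exponentials
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return (weights * class_scores).sum(axis=1)

# Hypothetical per-class scores for two examples and three classes.
class_scores = np.array([[0.9, 0.9, 0.9], [0.4, 0.9, 0.9]])
soft = softmin_aggregate(class_scores)
print(soft)
```

With a small temperature the result stays close to the lowest per-class score of each example, so a single suspicious class annotation is enough to pull the overall score down. A larger temperature moves the result toward the plain mean.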

Return type:

ndarray[Any, dtype[floating[TypeVar(T, bound= NBitBase)]]]

Returns:

label_quality_scores (np.ndarray) – A 1D array of shape (N,) with a label quality score (between 0 and 1) for each example in the dataset. Lower scores indicate examples whose label is more likely to contain some annotation error (for any of the classes).

Examples

>>> from cleanlab.multilabel_classification import get_label_quality_scores
>>> import numpy as np
>>> labels = [[1], [0,2]]
>>> pred_probs = np.array([[0.1, 0.9, 0.1], [0.4, 0.1, 0.9]])
>>> scores = get_label_quality_scores(labels, pred_probs)
>>> scores
array([0.9, 0.5])

cleanlab.multilabel_classification.rank.get_label_quality_scores_per_class(labels, pred_probs, *, method='self_confidence', adjust_pred_probs=False)

Computes a quality score quantifying how likely each individual class annotation is correct in a multi-label classification dataset. This is similar to cleanlab.multilabel_classification.rank.get_label_quality_scores but instead returns the per-class results without aggregation. For a dataset with K classes, each example receives K scores from this method. Refer to documentation in cleanlab.multilabel_classification.rank.get_label_quality_scores for details.

Parameters:
  • labels (List[List[int]]) – List of noisy labels for multi-label classification where each example can belong to multiple classes. Refer to documentation for this argument in find_label_issues for further details.

  • pred_probs (np.ndarray) – An array of shape (N, K) of model-predicted class probabilities. Refer to documentation for this argument in find_label_issues for further details.

  • method ({"self_confidence", "normalized_margin", "confidence_weighted_entropy"}, default = "self_confidence") – Method to calculate separate per-class annotation scores (that quantify how likely a particular class annotation is correct for a particular example). Refer to documentation for this argument in cleanlab.multilabel_classification.rank.get_label_quality_scores for further details.

  • adjust_pred_probs (bool, default = False) – Account for class imbalance in the label-quality scoring by adjusting predicted probabilities. Refer to documentation for this argument in cleanlab.rank.get_label_quality_scores for details.

Return type:

ndarray

Returns:

label_quality_scores (list(np.ndarray)) – A list containing K arrays, each of shape (N,). Here K is the number of classes in the dataset and N is the number of examples. label_quality_scores[k][i] is a score between 0 and 1 quantifying how likely the annotation for class k is correct for example i.
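As a hedged illustration of working with this return layout, the sketch below uses hand-written, hypothetical scores (not real output of the function) arranged as described above, with per_class_scores[k][i] holding the score for class k and example i. It locates the single least trustworthy class annotation and, for each example, the class with the lowest score.

```python
import numpy as np

# Hypothetical per-class scores for K = 3 classes and N = 2 examples,
# laid out as per_class_scores[k][i].
per_class_scores = [
    np.array([0.9, 0.4]),
    np.array([0.9, 0.9]),
    np.array([0.9, 0.9]),
]

# Stack into one (K, N) array so NumPy indexing works on both axes.
stacked = np.stack(per_class_scores)

# Class index k and example index i of the lowest score overall.
k, i = np.unravel_index(np.argmin(stacked), stacked.shape)

# For each example, the class whose annotation looks least trustworthy.
worst_class_per_example = stacked.argmin(axis=0)

print(int(k), int(i))
print(worst_class_per_example)
```

Stacking first keeps the code the same whether the scores arrive as a list of K arrays or as a single array with K rows.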

Examples

>>> from cleanlab.multilabel_classification.rank import get_label_quality_scores_per_class
>>> import numpy as np
>>> labels = [[1], [0,2]]
>>> pred_probs = np.array([[0.1, 0.9, 0.1], [0.4, 0.1, 0.9]])
>>> scores = get_label_quality_scores_per_class(labels, pred_probs)
>>> len(scores)  # one array of scores per class (K = 3)
3
>>> len(scores[0])  # one score per example (N = 2)
2