rank#
Methods to rank examples in standard (multi-class) classification datasets by cleanlab's label quality score.
Except for order_label_issues, which operates only on the subset of the data identified as potential label issues/errors, the methods in this module can be used on whichever subset of the dataset you choose (including the entire dataset) and provide a label quality score for every example. You can then do something like np.argsort(label_quality_score) to obtain ranked indices of individual datapoints based on their quality.
Note: multi-label classification is not supported by most methods in this module; each example must be labeled as belonging to a single class, e.g. format: labels = np.ndarray([1,0,2,1,1,0...]). For multi-label classification, instead see multilabel_classification.get_label_quality_scores.
Note: Label quality scores are most accurate when they are computed from out-of-sample pred_probs from your model. To obtain out-of-sample predicted probabilities for every datapoint in your dataset, you can use cross-validation; this is encouraged for better results.
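For example, here is a minimal sketch of this workflow using scikit-learn cross-validation (the dataset and classifier below are illustrative placeholders for your own):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from cleanlab.rank import get_label_quality_scores

# Illustrative 3-class dataset; substitute your own features and (noisy) labels.
X, labels = make_classification(n_samples=300, n_classes=3, n_informative=4, random_state=0)

# Cross-validation yields out-of-sample predicted probabilities for every datapoint.
pred_probs = cross_val_predict(LogisticRegression(max_iter=1000), X, labels, cv=5, method="predict_proba")

scores = get_label_quality_scores(labels, pred_probs)
ranked_indices = np.argsort(scores)  # most-suspect examples first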
Functions:
- get_label_quality_scores: Returns a label quality score for each datapoint.
- get_label_quality_ensemble_scores: Returns label quality scores based on predictions from an ensemble of models.
- find_top_issues: Returns the sorted indices of the top issues in quality_scores.
- order_label_issues: Sorts label issues by label quality score.
- get_self_confidence_for_each_label: Returns the self-confidence label-quality score for each datapoint.
- get_normalized_margin_for_each_label: Returns the "normalized margin" label-quality score for each datapoint.
- get_confidence_weighted_entropy_for_each_label: Returns the "confidence weighted entropy" label-quality score for each datapoint.
- cleanlab.rank.get_label_quality_scores(labels, pred_probs, *, method='self_confidence', adjust_pred_probs=False)[source]#
Returns a label quality score for each datapoint.
This function computes label quality scores for standard (multi-class) classification datasets, where lower scores indicate labels less likely to be correct.
The score is between 0 and 1: 1 indicates a clean label (the given label is likely correct), 0 indicates a dirty label (the given label is likely incorrect).
- Parameters:
labels (np.ndarray) – A discrete vector of noisy labels, i.e. some labels may be erroneous. Format requirements: for a dataset with K classes, labels must be in 0, 1, …, K-1. Note: multi-label classification is not supported by this method; each example must belong to a single class, e.g. format: labels = np.ndarray([1,0,2,1,1,0...]).
pred_probs (np.ndarray, optional) – An array of shape (N, K) of model-predicted probabilities, P(label=k|x). Each row of this matrix corresponds to an example x and contains the model-predicted probabilities that x belongs to each possible class, for each of the K classes. The columns must be ordered such that these probabilities correspond to class 0, 1, …, K-1. Note: Returned label issues are most accurate when they are computed from out-of-sample pred_probs from your model. To obtain out-of-sample predicted probabilities for every datapoint in your dataset, you can use cross-validation; this is encouraged for better results.
method ({"self_confidence", "normalized_margin", "confidence_weighted_entropy"}, default "self_confidence") – Label quality scoring method. Letting k = labels[i] and P = pred_probs[i] denote the given label and predicted class-probabilities for datapoint i, its score can be:
- 'normalized_margin': P[k] - max_{k' != k}[ P[k'] ]
- 'self_confidence': P[k]
- 'confidence_weighted_entropy': entropy(P) / self_confidence
Let C = {0, 1, ..., K-1} denote the specified set of classes for our classification task. The normalized_margin score works better for identifying class-conditional label errors, i.e. examples for which another label in C is appropriate but the given label is not. The self_confidence score works better for identifying alternative label issues corresponding to bad examples that are: not from any of the classes in C, well-described by two or more labels in C, or generally just out-of-distribution (i.e. anomalous outliers).
adjust_pred_probs (bool, optional) – Account for class imbalance in the label-quality scoring by adjusting predicted probabilities via subtraction of class confident thresholds and renormalization. Set this to True if you prefer to account for class imbalance. See Northcutt et al., 2021.
- Return type:
ndarray
- Returns:
label_quality_scores (np.ndarray) – Contains one score (between 0 and 1) per example. Lower scores indicate more likely mislabeled examples.
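A minimal usage sketch (the tiny 3-class labels and pred_probs arrays below are illustrative):

import numpy as np
from cleanlab.rank import get_label_quality_scores

labels = np.array([1, 0, 2, 0])
pred_probs = np.array([
    [0.10, 0.80, 0.10],
    [0.70, 0.20, 0.10],
    [0.20, 0.30, 0.50],
    [0.05, 0.90, 0.05],  # given label 0, but the model strongly favors class 1
])

scores = get_label_quality_scores(labels, pred_probs, method="normalized_margin")
# The last example receives the lowest score, flagging it as the most likely label error.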
- cleanlab.rank.get_label_quality_ensemble_scores(labels, pred_probs_list, *, method='self_confidence', adjust_pred_probs=False, weight_ensemble_members_by='accuracy', custom_weights=None, log_loss_search_T_values=[0.0001, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 200.0], verbose=True)[source]#
Returns label quality scores based on predictions from an ensemble of models.
This function computes label-quality scores for classification datasets, where lower scores indicate labels less likely to be correct.
Ensemble scoring requires a list of pred_probs, one from each model in the ensemble. For each pred_probs in the list, a label quality score is computed; the scores are then averaged using the weighting scheme determined by weight_ensemble_members_by.
The score is between 0 and 1: 1 indicates a clean label (the given label is likely correct), 0 indicates a dirty label (the given label is likely incorrect).
- Parameters:
labels (np.ndarray) – Labels in the same format expected by the get_label_quality_scores function.
pred_probs_list (List[np.ndarray]) – Each element in this list should be an array of pred_probs in the same format expected by the get_label_quality_scores function. Each element of pred_probs_list corresponds to the predictions from one model for all examples.
method ({"self_confidence", "normalized_margin", "confidence_weighted_entropy"}, default "self_confidence") – Label quality scoring method. See get_label_quality_scores for scenarios on when to use each method.
adjust_pred_probs (bool, optional) – adjust_pred_probs in the same format expected by the get_label_quality_scores function.
weight_ensemble_members_by ({"uniform", "accuracy", "log_loss_search", "custom"}, default "accuracy") – Weighting scheme used to aggregate scores from each model:
- "uniform": Take the simple average of scores.
- "accuracy": Take a weighted average of scores, weighted by model accuracy.
- "log_loss_search": Take a weighted average of scores, weighted by exp(t * -log_loss), where t is selected from the log_loss_search_T_values parameter and log_loss is the log-loss between a model's pred_probs and the given labels.
- "custom": Take a weighted average of scores using custom weights that the user passes via the custom_weights parameter.
custom_weights (np.ndarray, default None) – Weights used to aggregate scores from each model if weight_ensemble_members_by="custom". The length of this array must match the number of models: len(pred_probs_list).
log_loss_search_T_values (List, default [1e-4, 1e-3, 1e-2, 1e-1, 1e0, 1e1, 1e2, 2e2]) – List of t values considered if weight_ensemble_members_by="log_loss_search". We will choose the value of t that leads to weights which produce the best log-loss when used to form a weighted average of pred_probs from the models.
verbose (bool, default True) – Set to False to suppress all print statements.
- Return type:
ndarray
- Returns:
label_quality_scores (np.ndarray) – Contains one score (between 0 and 1) per example. Lower scores indicate more likely mislabeled examples.
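A minimal sketch, assuming out-of-sample pred_probs from two models are already available (the tiny arrays below are illustrative stand-ins; in practice each would come from cross-validating a different model):

import numpy as np
from cleanlab.rank import get_label_quality_ensemble_scores

labels = np.array([0, 1, 1])
pred_probs_model1 = np.array([[0.9, 0.1], [0.4, 0.6], [0.8, 0.2]])
pred_probs_model2 = np.array([[0.8, 0.2], [0.3, 0.7], [0.7, 0.3]])

scores = get_label_quality_ensemble_scores(
    labels,
    [pred_probs_model1, pred_probs_model2],
    weight_ensemble_members_by="accuracy",
    verbose=False,
)
# The third example scores lowest: both models favor class 0 over its given label 1.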
- cleanlab.rank.find_top_issues(quality_scores, *, top=10)[source]#
Returns the sorted indices of the top issues in quality_scores, ordered from smallest to largest quality score (i.e., from most to least likely to be an issue). For example, the first value returned is the index corresponding to the smallest value in quality_scores (most likely to be an issue). The second value in the returned array is the index corresponding to the second smallest value in quality_scores (second-most likely to be an issue), and so forth.
This method assumes that quality_scores shares an index with some dataset, such that the indices returned by this method map to the examples in that dataset.
- Parameters:
quality_scores (ndarray) – Array of shape (N,), where N is the number of examples, containing one quality score for each example in the dataset.
top (int) – The number of indices to return.
- Return type:
ndarray
- Returns:
top_issue_indices – Indices of the top examples most likely to suffer from an issue (ranked by issue severity).
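A minimal sketch (the scores below are illustrative):

import numpy as np
from cleanlab.rank import find_top_issues

quality_scores = np.array([0.9, 0.1, 0.5, 0.05, 0.8])
top_issue_indices = find_top_issues(quality_scores, top=2)
# Returns array([3, 1]): index 3 has the smallest score, index 1 the second smallest.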
- cleanlab.rank.order_label_issues(label_issues_mask, labels, pred_probs, *, rank_by='self_confidence', rank_by_kwargs={})[source]#
Sorts label issues by label quality score.
Default label quality score is “self_confidence”.
- Parameters:
label_issues_mask (np.ndarray) – A boolean mask for the entire dataset where True represents a label issue and False represents an example that is accurately labeled with high confidence.
labels (np.ndarray) – Labels in the same format expected by the get_label_quality_scores function.
pred_probs (np.ndarray of shape (N, K)) – Predicted probabilities in the same format expected by the get_label_quality_scores function.
rank_by (str, optional) – Score by which to order label error indices (in increasing order). See the method argument of get_label_quality_scores.
rank_by_kwargs (dict, optional) – Optional keyword arguments to pass into the get_label_quality_scores function. Accepted args include adjust_pred_probs.
- Return type:
ndarray
- Returns:
label_issues_idx (np.ndarray) – An array of the indices of the examples with label issues, ordered by the label-quality scoring method passed to rank_by.
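A minimal sketch, where the boolean mask might come from a filtering method such as cleanlab.filter.find_label_issues (the tiny arrays below are illustrative):

import numpy as np
from cleanlab.rank import order_label_issues

labels = np.array([0, 1, 1, 0])
pred_probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.95, 0.05], [0.4, 0.6]])
label_issues_mask = np.array([False, False, True, True])  # illustrative mask

issue_indices = order_label_issues(label_issues_mask, labels, pred_probs)
# Returns array([2, 3]): example 2 has the lower self-confidence (0.05 vs 0.4),
# so it is ranked as the more severe issue.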
- cleanlab.rank.get_self_confidence_for_each_label(labels, pred_probs)[source]#
Returns the self-confidence label-quality score for each datapoint.
This function computes label-quality scores for classification datasets, where lower scores indicate labels less likely to be correct.
The self-confidence is the classifier's predicted probability that an example belongs to its given class label.
Self-confidence can work better than normalized margin for detecting label issues caused by out-of-distribution (OOD) or otherwise unusual examples, whereas normalized margin is better suited to label errors in which labels for random examples have been replaced by other classes.
- Parameters:
labels (np.ndarray) – Labels in the same format expected by the get_label_quality_scores function.
pred_probs (np.ndarray) – Predicted probabilities in the same format expected by the get_label_quality_scores function.
- Return type:
ndarray
- Returns:
label_quality_scores (np.ndarray) – Contains one score (between 0 and 1) per example. Lower scores indicate more likely mislabeled examples.
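A minimal sketch (illustrative arrays):

import numpy as np
from cleanlab.rank import get_self_confidence_for_each_label

labels = np.array([1, 0])
pred_probs = np.array([[0.25, 0.75], [0.10, 0.90]])

scores = get_self_confidence_for_each_label(labels, pred_probs)
# Each score is the model's predicted probability of the given label: 0.75 and 0.10,
# so the second example looks far more likely to be mislabeled.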
- cleanlab.rank.get_normalized_margin_for_each_label(labels, pred_probs)[source]#
Returns the “normalized margin” label-quality score for each datapoint.
This function computes label-quality scores for classification datasets, where lower scores indicate labels less likely to be correct.
Letting k denote the given label for a datapoint, the normalized margin is (p(label = k) - max(p(label != k))), i.e. the probability of the given label minus the probability of the most-likely label that is not the given label (normalized_margin = prob_label - max_prob_not_label). This indicates how confident the model is that an example BOTH belongs to its given label AND does not belong to another label, and therefore scores how likely the given label is to be correct rather than a label error.
Normalized margin works better for finding class-conditional label errors, where there is another label in the set of classes that is clearly better than the given label.
- Parameters:
labels (np.ndarray) – Labels in the same format expected by the get_label_quality_scores function.
pred_probs (np.ndarray) – Predicted probabilities in the same format expected by the get_label_quality_scores function.
- Return type:
ndarray
- Returns:
label_quality_scores (np.ndarray) – Contains one score (between 0 and 1) per example. Lower scores indicate more likely mislabeled examples.
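A minimal sketch (illustrative arrays):

import numpy as np
from cleanlab.rank import get_normalized_margin_for_each_label

labels = np.array([0, 0])
pred_probs = np.array([[0.9, 0.1], [0.3, 0.7]])

scores = get_normalized_margin_for_each_label(labels, pred_probs)
# The second example's margin is negative (0.3 - 0.7), so it receives a much
# lower score than the first (margin 0.9 - 0.1) and is flagged as suspect.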
- cleanlab.rank.get_confidence_weighted_entropy_for_each_label(labels, pred_probs)[source]#
Returns the “confidence weighted entropy” label-quality score for each datapoint.
This function computes label-quality scores for classification datasets, where lower scores indicate labels less likely to be correct.
The "confidence weighted entropy" score is the normalized entropy of the predicted probabilities divided by the self-confidence score.
- Parameters:
labels (np.ndarray) – Labels in the same format expected by the get_label_quality_scores function.
pred_probs (np.ndarray) – Predicted probabilities in the same format expected by the get_label_quality_scores function.
- Return type:
ndarray
- Returns:
label_quality_scores (np.ndarray) – Contains one score (between 0 and 1) per example. Lower scores indicate more likely mislabeled examples.
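A minimal sketch (illustrative arrays):

import numpy as np
from cleanlab.rank import get_confidence_weighted_entropy_for_each_label

labels = np.array([0, 0])
pred_probs = np.array([[0.90, 0.05, 0.05], [0.34, 0.33, 0.33]])

scores = get_confidence_weighted_entropy_for_each_label(labels, pred_probs)
# The second example has near-uniform (high-entropy) predictions and low
# self-confidence, so it receives a lower quality score than the first.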