rank

Methods to rank/order data by cleanlab’s label quality score. Except for order_label_issues, which operates only on the subset of the data identified as potential label issues/errors, the methods in this module can be applied to any subset of the dataset (including the entire dataset) and return a label quality score for every example. You can then call, for example, np.argsort(label_quality_score) to obtain the indices of the examples ranked from lowest to highest score.
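As a minimal sketch of that ranking step, assuming a hypothetical array of already-computed scores (the values below are made up for illustration):

```python
import numpy as np

# Hypothetical label quality scores for five examples (lower = more suspect).
label_quality_score = np.array([0.92, 0.15, 0.78, 0.05, 0.60])

# np.argsort returns indices in ascending order of score, so the examples
# whose labels are least likely to be correct come first.
ranked_indices = np.argsort(label_quality_score)

# Indices of the three lowest-scoring examples, e.g. for manual review.
worst_three = ranked_indices[:3]

print(ranked_indices)
print(worst_three)
```

Because lower scores indicate labels less likely to be correct, reviewing examples in the order given by ranked_indices starts with the most suspicious ones.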

CAUTION: These label quality scores are computed from pred_probs, which must be out-of-sample predictions from your model! Never provide predictions on the same examples used to train the model, as these will be overfit and unsuitable for finding label errors. To obtain out-of-sample predicted probabilities for every datapoint in your dataset, you can use cross-validation. Alternatively, it is fine if your model was trained on a separate dataset and you are only evaluating labels in data that was held out.
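The cross-validation idea can be sketched with plain NumPy as follows. The nearest-centroid "model" and all helper names here are hypothetical stand-ins chosen only to keep the example self-contained; with a real classifier, a common route is scikit-learn's cross_val_predict with method="predict_proba". The point of the sketch is that each row of pred_probs is produced by a model that never saw that row during training.

```python
import numpy as np

def fit_centroids(X, y, num_classes):
    """Toy model (hypothetical): one mean feature vector per class."""
    return np.stack([X[y == k].mean(axis=0) for k in range(num_classes)])

def predict_proba(centroids, X):
    """Softmax over negative squared distances to each class centroid."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -d2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def out_of_sample_pred_probs(X, y, num_classes, n_splits=5, seed=0):
    """K-fold loop: every example is predicted by a model not trained on it."""
    rng = np.random.default_rng(seed)
    # Plain shuffled folds; stratified folds are usually preferable in practice.
    folds = np.array_split(rng.permutation(len(y)), n_splits)
    pred_probs = np.zeros((len(y), num_classes))
    for held_out in folds:
        train = np.setdiff1d(np.arange(len(y)), held_out)
        centroids = fit_centroids(X[train], y[train], num_classes)
        pred_probs[held_out] = predict_proba(centroids, X[held_out])
    return pred_probs

# Synthetic data: three Gaussian blobs with 20 examples each.
rng = np.random.default_rng(0)
y = np.repeat(np.arange(3), 20)
centers = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
X = centers[y] + rng.normal(scale=0.5, size=(60, 2))

pred_probs = out_of_sample_pred_probs(X, y, num_classes=3)
print(pred_probs.shape)
```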

Functions:

`get_confidence_weighted_entropy_for_each_label`(...)	Returns the "confidence weighted entropy" label-quality score for each datapoint.
`get_label_quality_ensemble_scores`(labels, ...)	Returns label quality scores based on predictions from an ensemble of models.
`get_label_quality_scores`(labels, pred_probs, *)	Returns label quality scores for each datapoint.
`get_normalized_margin_for_each_label`(labels, ...)	Returns the "normalized margin" label-quality score for each datapoint.
`get_self_confidence_for_each_label`(labels, ...)	Returns the self-confidence label-quality score for each datapoint.
`order_label_issues`(label_issues_mask, ...[, ...])	Sorts label issues by label quality score.

cleanlab.rank.get_confidence_weighted_entropy_for_each_label(labels: numpy.array, pred_probs: numpy.array) → numpy.array

Returns the “confidence weighted entropy” label-quality score for each datapoint.

This is a function to compute label-quality scores for classification datasets, where lower scores indicate labels less likely to be correct.

The “confidence weighted entropy” is the normalized entropy of the predicted probabilities divided by the “self-confidence”.

Parameters

labels (np.array) – Labels in the same format expected by the get_label_quality_scores function.
pred_probs (np.array) – Predicted-probabilities in the same format expected by the get_label_quality_scores function.

Returns

label_quality_scores – An array containing, for each example, a score (between 0 and 1) reflecting how likely it is to be correctly labeled.

Return type

np.array
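The two ingredients named in the description above can be sketched in NumPy as follows. This is an illustration under assumptions, not the library's implementation: the helper names are hypothetical, self-confidence is taken to be the predicted probability of the given label, and entropy is assumed to be normalized by log(K) for K classes. The raw ratio computed here is unbounded and grows as a label looks less plausible, whereas the documented score lies between 0 and 1 with lower values for suspect labels; any further rescaling the function applies is not described here and is not reproduced.

```python
import numpy as np

def self_confidence(labels, pred_probs):
    # Predicted probability assigned to each example's given label.
    return pred_probs[np.arange(len(labels)), labels]

def normalized_entropy(pred_probs, eps=1e-12):
    # Shannon entropy of each row divided by log(K), so it lies in [0, 1].
    # Normalizing by log(K) is an assumption of this sketch.
    num_classes = pred_probs.shape[1]
    entropy = -np.sum(pred_probs * np.log(pred_probs + eps), axis=1)
    return entropy / np.log(num_classes)

labels = np.array([0, 1, 2])
pred_probs = np.array([
    [0.90, 0.05, 0.05],  # confident and agrees with the given label
    [0.30, 0.40, 0.30],  # uncertain prediction
    [0.60, 0.30, 0.10],  # model favors a different class than the label
])

sc = self_confidence(labels, pred_probs)
ne = normalized_entropy(pred_probs)

# Raw ratio only; NOT the final 0-to-1 score returned by the library function.
raw_ratio = ne / sc
print(sc, ne, raw_ratio)
```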

cleanlab.rank.get_label_quality_ensemble_scores(labels: numpy.array, pred_probs_list: List[numpy.array], *, method: str = 'self_confidence', adjust_pred_probs: bool = False, weight_ensemble_members_by: str = 'accuracy', verbose: bool = True) → numpy.array

Returns label quality scores based on predictions from an ensemble of models.

This is a function to compute label-quality scores for classification datasets, where lower scores indicate labels less likely to be correct.

Ensemble scoring requires a list of pred_probs from each model in the ensemble.

A label quality score is computed for each pred_probs array in the list. These per-model scores are then averaged using the weighting scheme selected by weight_ensemble_members_by.

Score is between 0 and 1:

1 — clean label (given label is likely correct).
0 — dirty label (given label is likely incorrect).

Parameters

labels (np.array) – Labels in the same format expected by the get_label_quality_scores function.
pred_probs_list (List[np.array]) – Each element in this list should be an array of pred_probs in the same format expected by the get_label_quality_scores function. Each element of pred_probs_list corresponds to the predictions from one model for all examples.
method ({"self_confidence", "normalized_margin", "confidence_weighted_entropy"}, default "self_confidence") – Label quality scoring method. See get_label_quality_scores for scenarios on when to use each method.
adjust_pred_probs (bool, default False) – Same meaning and format as the adjust_pred_probs argument of the get_label_quality_scores function.
weight_ensemble_members_by ({"uniform", "accuracy"}, default "accuracy") –
Weighting scheme used to aggregate scores from each model:
- "uniform": take the simple average of scores
- "accuracy": take a weighted average of scores, weighted by model accuracy
verbose (bool, default True) – Set to False to suppress all print statements.

Returns

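label_quality_scores – An array of scores (between 0 and 1), one per example, where lower scores indicate labels less likely to be correct.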

Return type

np.array
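The aggregation described for this function can be sketched roughly as below. This is a sketch under stated assumptions rather than the library's code: ensemble_scores_sketch and self_confidence are hypothetical helpers, the per-model score is the self-confidence (the predicted probability of the given label), each model's accuracy is measured against the given labels, and the accuracy weights are normalized to sum to 1.

```python
import numpy as np

def self_confidence(labels, pred_probs):
    # Predicted probability assigned to each example's given label.
    return pred_probs[np.arange(len(labels)), labels]

def ensemble_scores_sketch(labels, pred_probs_list,
                           weight_ensemble_members_by="accuracy"):
    # One column of scores per model: shape (num_examples, num_models).
    scores = np.column_stack(
        [self_confidence(labels, p) for p in pred_probs_list]
    )
    num_models = len(pred_probs_list)
    if weight_ensemble_members_by == "uniform":
        weights = np.full(num_models, 1.0 / num_models)
    elif weight_ensemble_members_by == "accuracy":
        # Accuracy of each model against the given labels (assumption).
        acc = np.array(
            [np.mean(p.argmax(axis=1) == labels) for p in pred_probs_list]
        )
        # Assumes at least one model has nonzero accuracy.
        weights = acc / acc.sum()
    else:
        raise ValueError("unknown weighting scheme")
    # Weighted average of the per-model scores for every example.
    return scores @ weights

labels = np.array([0, 1, 1, 0])
model_a = np.array([[0.9, 0.1], [0.2, 0.8], [0.3, 0.7], [0.6, 0.4]])
model_b = np.array([[0.6, 0.4], [0.7, 0.3], [0.4, 0.6], [0.2, 0.8]])

uniform_scores = ensemble_scores_sketch(labels, [model_a, model_b], "uniform")
accuracy_scores = ensemble_scores_sketch(labels, [model_a, model_b], "accuracy")
print(uniform_scores)
print(accuracy_scores)
```

In this toy data model_a agrees with every given label while model_b agrees with half of them, so the "accuracy" scheme gives model_a twice the weight of model_b.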