label_quality_utils
Helper methods used internally for computing label quality scores.
Functions:
| get_normalized_entropy(pred_probs, min_allowed_prob=None) | Return the normalized entropy of pred_probs. |
- cleanlab.internal.label_quality_utils.get_normalized_entropy(pred_probs, min_allowed_prob=None)
- Return the normalized entropy of pred_probs.
- Normalized entropy is between 0 and 1. Higher values of entropy indicate higher uncertainty in the model’s prediction of the correct label.
- Read more about normalized entropy on Wikipedia.
- Normalized entropy is used in active learning for uncertainty sampling: https://towardsdatascience.com/uncertainty-sampling-cheatsheet-ec57bc067c0b
- Unlike label-quality scores, entropy only depends on the model’s predictions, not the given label.
- Parameters:
- pred_probs (np.ndarray of shape (N, K)) – Each row of this matrix corresponds to an example x and contains the model-predicted probabilities that x belongs to each possible class: P(label=k|x).
- min_allowed_prob (float, default: None, deprecated) – Minimum allowed probability value. If not None (default), entries of pred_probs below this value will be clipped to this value. Deprecated since version 2.5.0: This keyword is deprecated and should be left at the default. The entropy is well-behaved even if pred_probs contains zeros; clipping is unnecessary and (slightly) changes the results.
 
- Return type:
- ndarray
- Returns:
- entropy (np.ndarray of shape (N,)) – Each element is the normalized entropy of the corresponding row of pred_probs.
- Raises:
- ValueError – An error is raised if any of the probabilities is not in the interval [0, 1].
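The behaviour described above can be illustrated with a minimal sketch. It assumes the usual definition of normalized entropy (Shannon entropy of each row divided by log K, with 0 · log 0 treated as 0) and K ≥ 2. The helper name `normalized_entropy_sketch` is hypothetical, and this is not necessarily how cleanlab implements the function.

```python
import numpy as np


def normalized_entropy_sketch(pred_probs: np.ndarray) -> np.ndarray:
    """Hypothetical helper: row-wise Shannon entropy divided by log(K).

    Assumes pred_probs has shape (N, K) with K >= 2.
    """
    pred_probs = np.asarray(pred_probs, dtype=float)
    # Mirror the documented contract: reject values outside [0, 1].
    if np.any(pred_probs < 0) or np.any(pred_probs > 1):
        raise ValueError("All probabilities must lie in the interval [0, 1].")
    num_classes = pred_probs.shape[1]
    # Replace zeros by 1 inside the log so that 0 * log(0) contributes 0,
    # which is why no clipping of small probabilities is needed.
    safe_probs = np.where(pred_probs > 0, pred_probs, 1.0)
    entropy = -np.sum(pred_probs * np.log(safe_probs), axis=1)
    return entropy / np.log(num_classes)


pred_probs = np.array([
    [1.0, 0.0, 0.0],        # fully confident prediction: entropy 0
    [1 / 3, 1 / 3, 1 / 3],  # uniform prediction: entropy 1
    [0.5, 0.5, 0.0],        # split between two of three classes
])
scores = normalized_entropy_sketch(pred_probs)
```

With cleanlab installed, the documented function would be called as `get_normalized_entropy(pred_probs)`, leaving `min_allowed_prob` at its default.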