label_quality_utils
Helper functions for computing label quality scores
Functions:

get_normalized_entropy(pred_probs[, min_allowed_prob]) – Returns the normalized entropy of pred_probs.
 cleanlab.internal.label_quality_utils.get_normalized_entropy(pred_probs: numpy.array, min_allowed_prob=1e-06) -> numpy.array
Returns the normalized entropy of pred_probs.
Normalized entropy is between 0 and 1. Higher values of entropy indicate higher uncertainty in the model’s prediction of the correct label.
Read more about normalized entropy on Wikipedia.
Normalized entropy is used in active learning for uncertainty sampling: https://towardsdatascience.com/uncertaintysamplingcheatsheetec57bc067c0b
Unlike label-quality scores, entropy depends only on the model’s predictions, not the given label.
 Parameters
pred_probs (np.array (shape (N, K))) – P(label=k|x) is a matrix with K model-predicted probabilities. Each row of this matrix corresponds to an example x and contains the model-predicted probabilities that x belongs to each possible class. The columns must be ordered such that these probabilities correspond to class 0, 1, 2, … pred_probs should have been computed using 3 (or higher) fold cross-validation.
min_allowed_prob (float, default=1e-6) – Minimum allowed probability value. Entries of pred_probs below this value will be clipped to this value. Ensures entropy remains well-behaved even when pred_probs contains zeros.
 Returns
entropy
 Return type
np.array (float)
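
Example
The computation described above can be illustrated with a minimal NumPy sketch. This is not the library's source code: the helper name normalized_entropy_sketch is hypothetical, and the sketch assumes that normalization means dividing the row-wise Shannon entropy by log(K) and that clipping is applied only inside the logarithm.

```python
import numpy as np


def normalized_entropy_sketch(pred_probs, min_allowed_prob=1e-6):
    """Hypothetical helper: row-wise Shannon entropy divided by log(K)."""
    pred_probs = np.asarray(pred_probs, dtype=float)
    num_classes = pred_probs.shape[1]
    # Clip small probabilities so the logarithm never receives zero.
    clipped = np.clip(pred_probs, min_allowed_prob, None)
    entropy = -np.sum(pred_probs * np.log(clipped), axis=1)
    # Dividing by log(K) rescales entropy into the [0, 1] range.
    return entropy / np.log(num_classes)


pred_probs = np.array([
    [1.0, 0.0, 0.0],        # fully confident prediction
    [1 / 3, 1 / 3, 1 / 3],  # maximally uncertain prediction
    [0.7, 0.2, 0.1],        # somewhere in between
])
scores = normalized_entropy_sketch(pred_probs)
print(scores)
```

Under these assumptions, the confident row scores close to 0, the uniform row scores close to 1, and the zeros in the first row do not produce NaN or infinite values because of the clipping step.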