label_quality_utils

Helper methods used internally for computing label quality scores.

Functions:

get_normalized_entropy(pred_probs[, ...])

Return the normalized entropy of pred_probs.

cleanlab.internal.label_quality_utils.get_normalized_entropy(pred_probs, min_allowed_prob=None)

Return the normalized entropy of pred_probs.

Normalized entropy is between 0 and 1. Higher values of entropy indicate higher uncertainty in the model’s prediction of the correct label.

Read more about normalized entropy on Wikipedia.

Unlike label-quality scores, entropy only depends on the model’s predictions, not the given label.
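
As a rough illustration of the quantity described above, the sketch below computes the Shannon entropy of each row and divides it by log(K), which is the usual definition of normalized entropy. It is written in plain Python on nested lists so that it runs without dependencies. The name `normalized_entropy_sketch` is hypothetical, and this is not the library's implementation, which operates on NumPy arrays.

```python
import math


def normalized_entropy_sketch(pred_probs):
    """Row-wise Shannon entropy divided by log(K), so values lie in [0, 1].

    Hypothetical helper for illustration; assumes K >= 2 classes.
    """
    num_classes = len(pred_probs[0])
    scores = []
    for row in pred_probs:
        # Use the convention 0 * log(0) = 0 by skipping zero entries.
        entropy = -sum(p * math.log(p) for p in row if p > 0)
        scores.append(entropy / math.log(num_classes))
    return scores


pred_probs = [
    [1.0, 0.0, 0.0],  # fully confident prediction
    [0.5, 0.5, 0.0],  # torn between two of three classes
    [1 / 3, 1 / 3, 1 / 3],  # uniform prediction, maximal uncertainty
]
scores = normalized_entropy_sketch(pred_probs)
```

Up to floating-point rounding, the three rows score 0, about 0.63 (log 2 / log 3), and 1. Note that no label appears anywhere in the computation, only the predicted probabilities.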

Parameters:

pred_probs (np.ndarray (shape (N, K))) – Each row of this matrix corresponds to an example x and contains the model-predicted probabilities that x belongs to each possible class: P(label=k|x).
min_allowed_prob (float, default: None, deprecated) –
Minimum allowed probability value. If set to a value other than None (the default), entries of pred_probs below this value are clipped to this value.

Deprecated since version 2.5.0: This keyword is deprecated and should be left at its default. The entropy is well-behaved even if pred_probs contains zeros, so clipping is unnecessary and (slightly) changes the results.
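
To make the deprecation note concrete, the following minimal sketch compares the normalized entropy of a single row with and without clipping. The helper `entropy_over_log_k` and the threshold `floor` are hypothetical names chosen for this demo, and the sketch assumes that clipped rows are not renormalized afterwards.

```python
import math


def entropy_over_log_k(row):
    # Normalized entropy of one probability row, with 0 * log(0) taken as 0.
    return -sum(p * math.log(p) for p in row if p > 0) / math.log(len(row))


row = [1.0, 0.0]
floor = 1e-6  # hypothetical clipping threshold, chosen only for this demo

# Clip small entries up to the floor (assumed: no renormalization afterwards).
clipped_row = [max(p, floor) for p in row]

unclipped_score = entropy_over_log_k(row)
clipped_score = entropy_over_log_k(clipped_row)
```

Without clipping, the fully confident row scores exactly 0. With clipping, the score becomes a small positive number, which is the slight change in results that the note warns about.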

Return type:

ndarray

Returns:

entropy (np.ndarray (shape (N, ))) – Each element is the normalized entropy of the corresponding row of pred_probs.

Raises:

ValueError – An error is raised if any of the probabilities is not in the interval [0, 1].
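
The range check described above might look roughly like the sketch below. The helper `check_probabilities` is hypothetical and works on nested lists; the library's own check and its error message may differ.

```python
def check_probabilities(pred_probs):
    """Raise ValueError if any entry lies outside [0, 1] (hypothetical helper)."""
    for row in pred_probs:
        for p in row:
            if p < 0 or p > 1:
                raise ValueError(f"probability {p} is outside the interval [0, 1]")


check_probabilities([[0.2, 0.8], [1.0, 0.0]])  # valid input passes silently

try:
    check_probabilities([[1.2, -0.2]])  # both entries are out of range
    raised = False
except ValueError:
    raised = True
```

Callers that build pred_probs from unnormalized scores may want to run a check like this before computing entropy, so that invalid inputs fail early with a clear message.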