# label_quality_utils

Helper methods used internally to compute label quality scores.

Functions:

get_normalized_entropy(pred_probs[, ...]): Returns the normalized entropy of pred_probs.

cleanlab.internal.label_quality_utils.get_normalized_entropy(pred_probs, min_allowed_prob=1e-06)

Returns the normalized entropy of pred_probs.

Normalized entropy lies between 0 and 1. Higher values indicate greater uncertainty in the model’s prediction of the correct label.
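
To illustrate why the score falls between 0 and 1, here is a minimal sketch of one way to compute normalized entropy with NumPy. It assumes that the normalization divides the Shannon entropy of each row by log(K), where K is the number of classes, and that clipping is applied only inside the logarithm. Both are assumptions of this sketch, not statements about the library's implementation, and the name `normalized_entropy_sketch` is hypothetical.

```python
import numpy as np


def normalized_entropy_sketch(pred_probs, min_allowed_prob=1e-6):
    """Hypothetical helper: row-wise Shannon entropy divided by log(K)."""
    pred_probs = np.asarray(pred_probs, dtype=float)
    num_classes = pred_probs.shape[1]
    # Clip only inside the log so that zero probabilities contribute 0, not nan.
    clipped = np.clip(pred_probs, a_min=min_allowed_prob, a_max=None)
    entropy = -np.sum(pred_probs * np.log(clipped), axis=1)
    # Assumed normalization: the maximum entropy over K classes is log(K).
    return entropy / np.log(num_classes)


example_probs = np.array([
    [1 / 3, 1 / 3, 1 / 3],  # uniform: maximally uncertain
    [1.0, 0.0, 0.0],        # one-hot: fully confident
    [0.5, 0.5, 0.0],        # split between two of three classes
])
scores = normalized_entropy_sketch(example_probs)
```

Under these assumptions the uniform row scores 1, the one-hot row scores 0, and the third row falls in between.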

Normalized entropy is used in active learning for uncertainty sampling: https://towardsdatascience.com/uncertainty-sampling-cheatsheet-ec57bc067c0b
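
As a rough sketch of uncertainty sampling, the snippet below ranks examples by normalized entropy and selects the most uncertain ones for labeling. The entropy is computed inline with NumPy as a stand-in for this function (assuming normalization by log(K)), and the variable names and the budget of two examples are illustrative choices.

```python
import numpy as np

# Hypothetical model outputs for four unlabeled examples over three classes.
pred_probs = np.array([
    [0.98, 0.01, 0.01],
    [0.34, 0.33, 0.33],
    [0.70, 0.20, 0.10],
    [0.50, 0.50, 0.00],
])

# Stand-in for the normalized entropy: clip inside the log, divide by log(K).
clipped = np.clip(pred_probs, a_min=1e-6, a_max=None)
entropy = -np.sum(pred_probs * np.log(clipped), axis=1) / np.log(pred_probs.shape[1])

# Uncertainty sampling: pick the examples with the highest entropy first.
num_to_label = 2
most_uncertain = np.argsort(-entropy)[:num_to_label]
```

Here `most_uncertain` holds the row indices of the examples whose predictions are closest to uniform, which are the natural candidates to send to an annotator.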

Unlike label-quality scores, entropy only depends on the model’s predictions, not the given label.

Parameters:
• pred_probs (ndarray) – Each row of this matrix corresponds to an example x and contains the model-predicted probabilities that x belongs to each possible class: P(label=k|x)

• min_allowed_prob (float) – Minimum allowed probability value. Entries of pred_probs below this value are clipped to it, which keeps the entropy well-behaved even when pred_probs contains zeros.
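
The short sketch below illustrates why clipping matters when pred_probs contains zeros. It compares the per-class terms p * log(p) with and without clipping inside the logarithm. Applying the clip only to the argument of the log is an assumption of this sketch, and the variable names are illustrative.

```python
import numpy as np

pred_probs = np.array([[1.0, 0.0], [0.5, 0.5]])
min_allowed_prob = 1e-6

# Without clipping, log(0) is -inf and 0 * -inf evaluates to nan.
with np.errstate(divide="ignore", invalid="ignore"):
    naive_terms = pred_probs * np.log(pred_probs)

# With clipping, log stays finite and the zero entry contributes exactly 0.
clipped = np.clip(pred_probs, a_min=min_allowed_prob, a_max=None)
safe_terms = pred_probs * np.log(clipped)
```

In this sketch `naive_terms` contains a nan for the zero-probability entry, while every entry of `safe_terms` is finite.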

Return type:

ndarray

Returns:

entropy – Each element is the normalized entropy of the corresponding row of pred_probs.