multiannotator_utils
Helper methods used internally in cleanlab.multiannotator
Functions:
| assert_valid_inputs_multiannotator | Validate format of multi-annotator labels |
| assert_valid_pred_probs | Validate format of pred_probs for multiannotator active learning functions |
| format_multiannotator_labels | Takes an array of labels and formats it such that labels are in the set 0, 1, ..., K-1, where K is the number of classes. |
| check_consensus_label_classes | Check if any classes no longer appear in the set of consensus labels (established using the consensus_method stated) |
| compute_soft_cross_entropy | Compute soft cross entropy between the annotators' empirical label distribution and model pred_probs |
| find_best_temp_scaler | Find the best temperature scaling factor that minimizes the soft cross entropy between the annotators' empirical label distribution and model pred_probs |
| temp_scale_pred_probs | Scales pred_probs by the given temperature factor. |
- cleanlab.internal.multiannotator_utils.assert_valid_inputs_multiannotator(labels_multiannotator, pred_probs=None, ensemble=False, allow_single_label=False)
Validate format of multi-annotator labels
- Return type:
None
- cleanlab.internal.multiannotator_utils.assert_valid_pred_probs(pred_probs=None, pred_probs_unlabeled=None, ensemble=False)
Validate format of pred_probs for multiannotator active learning functions
- cleanlab.internal.multiannotator_utils.format_multiannotator_labels(labels)
Takes an array of labels and formats it such that labels are in the set 0, 1, ..., K-1, where K is the number of classes. The labels are assigned based on lexicographic order.
- Return type:
Tuple[DataFrame, dict]
- Returns:
formatted_labels – Returns pd.DataFrame of shape (N, M). The return labels will be properly formatted and can be passed to cleanlab.multiannotator functions.
mapping – A dictionary showing the mapping of new to old labels, such that mapping[k] returns the name of the k-th class.
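The relabeling described above can be sketched as follows. This is an illustrative approximation, not cleanlab's implementation: the helper name remap_labels_sketch is hypothetical, and treating None/NaN as "annotator gave no label" is an assumption.

```python
import pandas as pd


def remap_labels_sketch(labels):
    """Hypothetical helper: relabel classes as 0..K-1 in sorted order."""
    df = pd.DataFrame(labels)  # rows: examples, columns: annotators
    # Collect the distinct labels that were actually given (skip missing ones).
    observed = sorted({v for v in df.to_numpy().ravel() if not pd.isna(v)})
    old_to_new = {name: k for k, name in enumerate(observed)}
    # Entries with no label are absent from the dict and become NaN.
    formatted = df.apply(lambda col: col.map(old_to_new))
    # mapping[k] gives the original name of the k-th class.
    mapping = {k: name for name, k in old_to_new.items()}
    return formatted, mapping


labels = [["dog", "cat", None], ["cat", None, "bird"]]
formatted, mapping = remap_labels_sketch(labels)
```

Here mapping comes out as {0: "bird", 1: "cat", 2: "dog"}, since sorting strings yields lexicographic order.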
- cleanlab.internal.multiannotator_utils.check_consensus_label_classes(labels_multiannotator, consensus_label, consensus_method)
Check if any classes no longer appear in the set of consensus labels (established using the consensus_method stated)
- Return type:
None
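A minimal sketch of this kind of check is shown below. It assumes integer class labels with NaN marking a missing annotation, and it returns the missing classes instead of reporting them; the helper name missing_consensus_classes_sketch is hypothetical and the library function itself returns None.

```python
import numpy as np


def missing_consensus_classes_sketch(labels_multiannotator, consensus_label):
    """Hypothetical helper: classes some annotator used that no consensus label uses."""
    labels = np.asarray(labels_multiannotator, dtype=float)
    # Classes that appear anywhere in the annotator labels (NaN = not annotated).
    annotated = set(np.unique(labels[~np.isnan(labels)]).astype(int).tolist())
    # Classes that survive in the consensus labels.
    consensus = set(np.asarray(consensus_label).astype(int).tolist())
    return sorted(annotated - consensus)


labels = [[0, 0, np.nan], [2, 0, 0], [1, 1, np.nan]]
missing = missing_consensus_classes_sketch(labels, consensus_label=[0, 0, 1])
```

In this toy input, class 2 was chosen by one annotator but never wins the consensus, so it is the only class reported.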
- cleanlab.internal.multiannotator_utils.compute_soft_cross_entropy(labels_multiannotator, pred_probs)
Compute soft cross entropy between the annotators’ empirical label distribution and model pred_probs
- Return type:
float
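One way to read this quantity: for each example, build the empirical distribution of the labels its annotators gave, then take the cross entropy against the model's predicted probabilities. The sketch below follows that reading. The averaging over examples, the clipping constant eps, and the helper name soft_cross_entropy_sketch are assumptions for illustration, and every example is assumed to have at least one label.

```python
import numpy as np


def soft_cross_entropy_sketch(labels_multiannotator, pred_probs, eps=1e-8):
    """Hypothetical helper: mean cross entropy of annotator label frequencies vs. pred_probs."""
    labels = np.asarray(labels_multiannotator, dtype=float)
    pred_probs = np.asarray(pred_probs, dtype=float)
    n, k = pred_probs.shape
    empirical = np.zeros((n, k))
    for i, row in enumerate(labels):
        given = row[~np.isnan(row)].astype(int)  # labels actually provided
        empirical[i] = np.bincount(given, minlength=k) / len(given)
    # Clip to avoid log(0) when the model assigns zero probability to a class.
    clipped = np.clip(pred_probs, eps, None)
    return float(-np.sum(empirical * np.log(clipped)) / n)


labels = [[0, 0, np.nan], [1, 0, 1]]
uniform = np.full((2, 2), 0.5)
loss = soft_cross_entropy_sketch(labels, uniform)
```

With uniform two-class predictions, every example contributes log(2) regardless of how the annotators voted, so loss equals log(2) here.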
- cleanlab.internal.multiannotator_utils.find_best_temp_scaler(labels_multiannotator, pred_probs, coarse_search_range=[0.1, 0.2, 0.5, 0.8, 1, 2, 3, 5, 8], fine_search_size=4)
Find the best temperature scaling factor that minimizes the soft cross entropy between the annotators’ empirical label distribution and model pred_probs
- Return type:
float
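The parameter names suggest a coarse grid search followed by a finer search around the best coarse value. The sketch below illustrates that idea under several assumptions that are not confirmed by this page: temperature scaling is taken to mean dividing log-probabilities by the temperature and renormalizing, the fine grid is built with evenly spaced points between the neighboring coarse values, and the function takes a precomputed empirical label distribution instead of raw annotator labels. All helper names are hypothetical.

```python
import numpy as np


def temp_scale_sketch(pred_probs, temp, eps=1e-8):
    """Assumed form of temperature scaling: softmax(log(p) / temp)."""
    logits = np.log(np.clip(pred_probs, eps, None)) / temp
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    scaled = np.exp(logits)
    return scaled / scaled.sum(axis=1, keepdims=True)


def soft_ce_sketch(empirical, pred_probs, eps=1e-8):
    """Mean cross entropy between per-example label distributions and pred_probs."""
    log_p = np.log(np.clip(pred_probs, eps, None))
    return float(-np.mean(np.sum(empirical * log_p, axis=1)))


def best_temp_sketch(empirical, pred_probs,
                     coarse=(0.1, 0.2, 0.5, 0.8, 1, 2, 3, 5, 8), fine_size=4):
    """Coarse-to-fine grid search for the loss-minimizing temperature."""
    def loss(t):
        return soft_ce_sketch(empirical, temp_scale_sketch(pred_probs, t))

    coarse = list(coarse)
    i = int(np.argmin([loss(t) for t in coarse]))
    lo = coarse[max(i - 1, 0)]
    hi = coarse[min(i + 1, len(coarse) - 1)]
    # Refine on both sides of the best coarse value, keeping that value as a candidate.
    fine = np.concatenate([np.linspace(lo, coarse[i], fine_size + 2),
                           np.linspace(coarse[i], hi, fine_size + 2)])
    return float(fine[int(np.argmin([loss(t) for t in fine]))])


# Toy check: sharpen a known distribution with temp 0.5, then recover temp 2.
empirical = np.array([[0.7, 0.3], [0.3, 0.7]])
overconfident = temp_scale_sketch(empirical, 0.5)
best = best_temp_sketch(empirical, overconfident)
```

Under this form of scaling, temperatures below 1 sharpen the predictions and temperatures above 1 soften them, so overconfident predictions are corrected by a temperature greater than 1.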