multiannotator_utils
Helper methods used internally in cleanlab.multiannotator
Functions:
assert_valid_inputs_multiannotator | Validate format of multi-annotator labels. |
assert_valid_pred_probs | Validate format of pred_probs for multiannotator active learning functions. |
format_multiannotator_labels | Takes an array of labels and formats it such that labels are in the set 0, 1, ..., K-1, where K is the number of classes. |
check_consensus_label_classes | Check if any classes no longer appear in the set of consensus labels (established using the consensus_method stated). |
compute_soft_cross_entropy | Compute soft cross entropy between the annotators' empirical label distribution and model pred_probs. |
find_best_temp_scaler | Find the best temperature scaling factor that minimizes the soft cross entropy between the annotators' empirical label distribution and model pred_probs. |
temp_scale_pred_probs | Scales pred_probs by the given temperature factor. |
- cleanlab.internal.multiannotator_utils.assert_valid_inputs_multiannotator(labels_multiannotator, pred_probs=None, ensemble=False, allow_single_label=False, annotator_ids=None)
 Validate format of multi-annotator labels
- Return type:
 None
- cleanlab.internal.multiannotator_utils.assert_valid_pred_probs(pred_probs=None, pred_probs_unlabeled=None, ensemble=False)
 Validate format of pred_probs for multiannotator active learning functions
- cleanlab.internal.multiannotator_utils.format_multiannotator_labels(labels)
 Takes an array of labels and formats it such that labels are in the set 0, 1, ..., K-1, where K is the number of classes. The labels are assigned based on lexicographic order.
- Return type:
 Tuple[DataFrame, dict]
- Returns:
 formatted_labels – pd.DataFrame of shape (N, M). The returned labels are properly formatted and can be passed to cleanlab.multiannotator functions.
 mapping – A dictionary showing the mapping of new to old labels, such that mapping[k] returns the name of the k-th class.
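The relabeling described above can be illustrated with a minimal sketch. It is not the library's implementation: `remap_labels_sketch` is a hypothetical helper, it uses plain lists instead of a DataFrame, and it assumes missing annotations are represented by `None`.

```python
from typing import Dict, Hashable, List, Optional, Tuple

Row = List[Optional[Hashable]]


def remap_labels_sketch(labels: List[Row]) -> Tuple[List[List[Optional[int]]], Dict[int, Hashable]]:
    """Hypothetical helper: map arbitrary class names to 0, 1, ..., K-1."""
    # Collect the distinct class names, ignoring missing annotations (None),
    # and order them so that class indices follow sorted (lexicographic) order.
    classes = sorted({lab for row in labels for lab in row if lab is not None})
    to_index = {name: k for k, name in enumerate(classes)}

    # Replace every annotation by its class index, keeping None for missing ones.
    formatted = [[None if lab is None else to_index[lab] for lab in row] for row in labels]

    # mapping[k] gives the original name of the k-th class.
    mapping = dict(enumerate(classes))
    return formatted, mapping


# Three examples (rows) labeled by three annotators (columns).
raw = [["dog", "cat", None], ["cat", "cat", "bird"], [None, "dog", "dog"]]
formatted, mapping = remap_labels_sketch(raw)
```

Here `mapping` lets you translate integer predictions back to the original class names after running downstream multi-annotator functions.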
- cleanlab.internal.multiannotator_utils.check_consensus_label_classes(labels_multiannotator, consensus_label, consensus_method)
 Check if any classes no longer appear in the set of consensus labels (established using the consensus_method stated)
- Return type:
 None
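As a rough illustration of what this kind of check involves, the sketch below compares the classes that annotators used against the classes that survive as consensus labels. It is an assumption-laden stand-in, not the library's code: `missing_consensus_classes_sketch` is a hypothetical name, missing annotations are assumed to be `None`, and unlike the documented function (which returns None) it returns the missing classes so the result is easy to inspect.

```python
import warnings
from typing import List, Optional


def missing_consensus_classes_sketch(
    labels_multiannotator: List[List[Optional[int]]],
    consensus_label: List[int],
) -> List[int]:
    """Hypothetical helper: list classes that were annotated but never chosen as consensus."""
    # Every class that at least one annotator assigned to at least one example.
    annotated = {lab for row in labels_multiannotator for lab in row if lab is not None}

    # Classes that disappear once annotations are collapsed into consensus labels.
    missing = sorted(annotated - set(consensus_label))
    if missing:
        warnings.warn(f"Classes {missing} do not appear among the consensus labels.")
    return missing


labels = [[0, 1, None], [1, 1, 2], [0, 0, None]]
consensus = [0, 1, 0]  # e.g. obtained by majority vote (assumed for this example)
missing = missing_consensus_classes_sketch(labels, consensus)
```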
- cleanlab.internal.multiannotator_utils.compute_soft_cross_entropy(labels_multiannotator, pred_probs)
 Compute soft cross entropy between the annotators’ empirical label distribution and model pred_probs
- Return type:
 float
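To make the quantity concrete, here is a minimal NumPy sketch of a soft cross entropy between per-example empirical label distributions and predicted probabilities. It is written under assumptions rather than taken from the library: `soft_cross_entropy_sketch` and its `eps` clipping constant are hypothetical, missing annotations are assumed to be NaN, every example is assumed to have at least one label, and the result is averaged over examples.

```python
import numpy as np


def soft_cross_entropy_sketch(labels_multiannotator, pred_probs, eps=1e-8):
    """Hypothetical helper: mean soft cross entropy over examples."""
    labels = np.asarray(labels_multiannotator, dtype=float)
    pred_probs = np.asarray(pred_probs, dtype=float)
    num_examples, num_classes = pred_probs.shape

    # Empirical label distribution: fraction of annotators choosing each class.
    empirical = np.zeros((num_examples, num_classes))
    for i, row in enumerate(labels):
        given = row[~np.isnan(row)].astype(int)  # drop missing annotations
        empirical[i] = np.bincount(given, minlength=num_classes) / len(given)

    # Clip to avoid log(0), then average -sum_k q_k * log(p_k) over examples.
    clipped = np.clip(pred_probs, eps, None)
    return float(-np.sum(empirical * np.log(clipped)) / num_examples)


labels = [[0, 0, np.nan], [1, np.nan, 1]]
pred_probs = np.array([[0.5, 0.5], [0.5, 0.5]])
loss = soft_cross_entropy_sketch(labels, pred_probs)
```

With uniform predictions over two classes, as in this example, the loss equals log 2 regardless of how the annotators voted.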
- cleanlab.internal.multiannotator_utils.find_best_temp_scaler(labels_multiannotator, pred_probs, coarse_search_range=[0.1, 0.2, 0.5, 0.8, 1, 2, 3, 5, 8], fine_search_size=4)
 Find the best temperature scaling factor that minimizes the soft cross entropy between the annotators’ empirical label distribution and model pred_probs
- Return type:
 float
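The parameter names suggest a coarse grid search followed by a finer one. The sketch below shows one way such a search could work; it is an assumption about the general technique, not the library's implementation. The helpers `temp_scale_sketch` and `find_temp_sketch` are hypothetical, the empirical label distribution is passed in directly, and the refinement step (an evenly spaced grid between the neighbors of the best coarse value) is a guess.

```python
import numpy as np


def temp_scale_sketch(pred_probs, temp, eps=1e-8):
    """Hypothetical helper: softmax of log-probabilities divided by temp."""
    logits = np.log(np.clip(pred_probs, eps, None)) / temp
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    scaled = np.exp(logits)
    return scaled / scaled.sum(axis=1, keepdims=True)


def find_temp_sketch(empirical, pred_probs,
                     coarse=(0.1, 0.2, 0.5, 0.8, 1, 2, 3, 5, 8), fine_size=4):
    """Hypothetical helper: coarse-then-fine grid search over temperatures."""
    def loss(temp):
        scaled = np.clip(temp_scale_sketch(pred_probs, temp), 1e-8, None)
        return -np.mean(np.sum(empirical * np.log(scaled), axis=1))

    coarse = np.asarray(coarse, dtype=float)
    best = int(np.argmin([loss(t) for t in coarse]))

    # Refine between the neighbors of the best coarse temperature (assumed scheme),
    # keeping the best coarse value itself as a candidate.
    lo = coarse[max(best - 1, 0)]
    hi = coarse[min(best + 1, len(coarse) - 1)]
    candidates = np.append(np.linspace(lo, hi, fine_size), coarse[best])
    return float(candidates[np.argmin([loss(t) for t in candidates])])


# Overconfident predictions versus softer annotator agreement.
pred_probs = np.array([[0.99, 0.01], [0.02, 0.98], [0.95, 0.05]])
empirical = np.array([[0.7, 0.3], [0.2, 0.8], [0.6, 0.4]])
best_temp = find_temp_sketch(empirical, pred_probs)
```

Because the predictions here are more confident than the annotators' agreement warrants, the selected temperature is greater than 1, which flattens the predicted probabilities.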