multilabel_utils

Helper functions used internally for multi-label classification tasks.

Functions:

stack_complement(pred_prob_slice)

Extends predicted probabilities of a single class to two columns.

get_onehot_num_classes(labels[, pred_probs])

Returns the one-hot encoding of multi-label data and the number of classes.

int2onehot(labels, K)

Convert multi-label classification labels from a List[List[int]] format to a onehot matrix.

onehot2int(onehot_matrix)

Convert multi-label classification labels from a onehot matrix format to a List[List[int]] format that can be used with other cleanlab functions.

cleanlab.internal.multilabel_utils.stack_complement(pred_prob_slice)

Extends predicted probabilities of a single class to two columns.

Parameters:

pred_prob_slice (ndarray) – A 1D array with predicted probabilities for a single class.

Example

>>> import numpy as np
>>> pred_prob_slice = np.array([0.1, 0.9, 0.3, 0.8])
>>> stack_complement(pred_prob_slice)
array([[0.9, 0.1],
        [0.1, 0.9],
        [0.7, 0.3],
        [0.2, 0.8]])
Return type:

ndarray
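The two-column layout in the example above can be reproduced with plain NumPy. The following is a minimal sketch for illustration, not necessarily how cleanlab implements the function; the name `stack_complement_sketch` is hypothetical.

```python
import numpy as np

def stack_complement_sketch(pred_prob_slice):
    """Hypothetical stand-in: column 0 holds 1 - p, column 1 holds p."""
    p = np.asarray(pred_prob_slice, dtype=float)
    # Place the complement and the original probabilities side by side.
    return np.column_stack([1.0 - p, p])

two_col = stack_complement_sketch([0.1, 0.9, 0.3, 0.8])
print(two_col.shape)  # (4, 2)
```

Each row of the result sums to 1, so a single-class probability can be treated like the output of a binary classifier.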

cleanlab.internal.multilabel_utils.get_onehot_num_classes(labels, pred_probs=None)

Returns the one-hot encoding of multi-label data and the number of classes.

Return type:

Tuple[ndarray, int]
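The shape of the returned tuple can be illustrated with a small standalone sketch. It rests on two assumptions that this page does not confirm: that the class count is taken from the number of columns of `pred_probs` when it is given, and otherwise from the largest label index plus one. The library's actual rule may differ, and the name `onehot_and_num_classes_sketch` is hypothetical.

```python
import numpy as np

def onehot_and_num_classes_sketch(labels, pred_probs=None):
    """Hypothetical stand-in, not cleanlab's implementation."""
    if pred_probs is not None:
        # Assumption: pred_probs has one column per class.
        num_classes = np.asarray(pred_probs).shape[1]
    else:
        # Assumption: classes are indexed 0..K-1, so K = largest index + 1.
        num_classes = max(k for example in labels for k in example) + 1
    onehot = np.zeros((len(labels), num_classes), dtype=int)
    for i, example in enumerate(labels):
        for k in example:
            onehot[i, k] = 1  # mark each class that applies to example i
    return onehot, num_classes

onehot, K = onehot_and_num_classes_sketch([[0, 1], [3], [1, 2, 3], [1], [2]])
print(K)             # 4
print(onehot.shape)  # (5, 4)
```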

cleanlab.internal.multilabel_utils.int2onehot(labels, K)

Convert multi-label classification labels from a List[List[int]] format to a onehot matrix. This returns a binarized format of the labels as a multi-hot vector for each example, where the entries in this vector are 1 for each class that applies to this example and 0 otherwise.

Parameters:
  • labels (list of lists of integers) – e.g. [[0,1], [3], [1,2,3], [1], [2]]. All integers from 0,1,…,K-1 must be represented.

  • K (int) – The number of classes.

Return type:

ndarray
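To make the multi-hot format concrete, here is a minimal NumPy sketch of the conversion described above. It is an illustration under the assumption that every label is an integer in 0,1,…,K-1, not the library's own code; the name `int2onehot_sketch` is hypothetical.

```python
import numpy as np

def int2onehot_sketch(labels, K):
    """Hypothetical stand-in: build an (N, K) matrix of 0s and 1s."""
    onehot = np.zeros((len(labels), K), dtype=int)
    for i, example in enumerate(labels):
        for k in example:
            onehot[i, k] = 1  # class k applies to example i
    return onehot

matrix = int2onehot_sketch([[0, 1], [3], [1, 2, 3], [1], [2]], K=4)
print(matrix)
# [[1 1 0 0]
#  [0 0 0 1]
#  [0 1 1 1]
#  [0 1 0 0]
#  [0 0 1 0]]
```

Row i has a 1 in column k exactly when class k appears in the i-th inner list.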

cleanlab.internal.multilabel_utils.onehot2int(onehot_matrix)

Convert multi-label classification labels from a onehot matrix format to a List[List[int]] format that can be used with other cleanlab functions.

Parameters:

onehot_matrix (2D np.ndarray of 0s and 1s) – A matrix representation of multi-label classification labels in a binarized format as a multi-hot vector for each example. The entries in this vector are 1 for each class that applies to this example and 0 otherwise.

Return type:

List[List[int]]

Returns:

labels (list of lists of integers) – e.g. [[0,1], [3], [1,2,3], [1], [2]]. All integers from 0,1,…,K-1 must be represented.
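The reverse conversion can be sketched in a few lines of NumPy. This is an illustration only, assuming the input contains nothing but 0s and 1s; it is not presented as cleanlab's implementation, and the name `onehot2int_sketch` is hypothetical.

```python
import numpy as np

def onehot2int_sketch(onehot_matrix):
    """Hypothetical stand-in: list the column indices of the 1s in each row."""
    matrix = np.asarray(onehot_matrix)
    # flatnonzero returns the indices of nonzero entries; tolist() yields plain ints.
    return [np.flatnonzero(row).tolist() for row in matrix]

onehot = np.array([
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
])
labels = onehot2int_sketch(onehot)
print(labels)  # [[0, 1], [3], [1, 2, 3], [1], [2]]
```

A row containing only 0s maps to an empty list, meaning no class applies to that example.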