multilabel_utils

Helper functions used internally for multi-label classification tasks.

Functions:

`stack_complement`(pred_prob_slice)	Extends predicted probabilities of a single class to two columns.
`get_onehot_num_classes`(labels[, pred_probs])	Returns the one-hot encoding of multi-label data, together with the number of classes.
`int2onehot`(labels, K)	Convert multi-label classification `labels` from a `List[List[int]]` format to a onehot matrix.
`onehot2int`(onehot_matrix)	Convert multi-label classification `labels` from a onehot matrix format to a `List[List[int]]` format that can be used with other cleanlab functions.

cleanlab.internal.multilabel_utils.stack_complement(pred_prob_slice)

Extends predicted probabilities of a single class to two columns.

Parameters: pred_prob_slice (ndarray) – A 1D array with predicted probabilities for a single class.

Example

>>> pred_prob_slice = np.array([0.1, 0.9, 0.3, 0.8])
>>> stack_complement(pred_prob_slice)
array([[0.9, 0.1],
        [0.1, 0.9],
        [0.7, 0.3],
        [0.2, 0.8]])

Return type: ndarray
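
As the example above suggests, the first output column holds the complement `1 - p` and the second holds `p` itself. A minimal NumPy sketch of that behavior follows; `stack_complement_sketch` is a hypothetical stand-in written for illustration, not cleanlab's own implementation.

```python
import numpy as np


def stack_complement_sketch(pred_prob_slice):
    """Illustrative stand-in (hypothetical name), not cleanlab's code."""
    p = np.asarray(pred_prob_slice, dtype=float)
    # Column 0: probability the class does NOT apply; column 1: probability it does.
    return np.column_stack((1.0 - p, p))


two_col = stack_complement_sketch(np.array([0.1, 0.9, 0.3, 0.8]))
print(two_col.shape)  # (4, 2)
```

Each row of the result sums to 1, so a single class can be treated as a two-class (absent/present) problem.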

cleanlab.internal.multilabel_utils.get_onehot_num_classes(labels, pred_probs=None)

Returns the one-hot encoding of multi-label data, together with the number of classes.

Return type: Tuple[ndarray, int]

cleanlab.internal.multilabel_utils.int2onehot(labels, K)

Convert multi-label classification labels from a List[List[int]] format to a onehot matrix. The result is a binarized format of the labels with one multi-hot vector per example, where an entry in this vector is 1 if the corresponding class applies to this example and 0 otherwise.

Parameters:

labels (list of lists of integers) – e.g. [[0,1], [3], [1,2,3], [1], [2]]. All integers from 0,1,…,K-1 must be represented.
K (int) – The number of classes.

Return type:

ndarray
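
The conversion described above can be sketched in plain NumPy as follows. This is a minimal illustration under the assumption that `labels` contains only integers in `0..K-1`; `int2onehot_sketch` is a hypothetical name and this is not necessarily how cleanlab implements it.

```python
import numpy as np


def int2onehot_sketch(labels, K):
    """Illustrative stand-in (hypothetical name), not cleanlab's code."""
    onehot = np.zeros((len(labels), K), dtype=int)
    for i, classes in enumerate(labels):
        # Set a 1 in every column whose class applies to example i.
        onehot[i, classes] = 1
    return onehot


labels = [[0, 1], [3], [1, 2, 3], [1], [2]]
onehot = int2onehot_sketch(labels, K=4)
print(onehot)
# [[1 1 0 0]
#  [0 0 0 1]
#  [0 1 1 1]
#  [0 1 0 0]
#  [0 0 1 0]]
```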

cleanlab.internal.multilabel_utils.onehot2int(onehot_matrix)

Convert multi-label classification labels from a onehot matrix format to a List[List[int]] format that can be used with other cleanlab functions.

Parameters: onehot_matrix (2D np.ndarray of 0s and 1s) – A matrix representation of multi-label classification labels in a binarized format, with one multi-hot vector per example. An entry in this vector is 1 if the corresponding class applies to this example and 0 otherwise.
Return type: List[List[int]]
Returns: labels (list of lists of integers) – e.g. [[0,1], [3], [1,2,3], [1], [2]]. All integers from 0,1,…,K-1 must be represented.
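
The reverse conversion can be sketched the same way: for each row, collect the column indices that hold a 1. The helper below is a hypothetical stand-in for illustration, assuming a 2D array of 0s and 1s; it is not cleanlab's own implementation.

```python
import numpy as np


def onehot2int_sketch(onehot_matrix):
    """Illustrative stand-in (hypothetical name), not cleanlab's code."""
    onehot_matrix = np.asarray(onehot_matrix)
    # np.flatnonzero returns the indices of the nonzero entries in each row.
    return [np.flatnonzero(row).tolist() for row in onehot_matrix]


onehot = np.array(
    [[1, 1, 0, 0],
     [0, 0, 0, 1],
     [0, 1, 1, 1],
     [0, 1, 0, 0],
     [0, 0, 1, 0]]
)
labels = onehot2int_sketch(onehot)
print(labels)  # [[0, 1], [3], [1, 2, 3], [1], [2]]
```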