multilabel_scorer

Helper classes and functions used internally to compute label quality scores in multi-label classification.

Classes:

`ClassLabelScorer`(value[, names, module, ...])	Enum for the different methods to compute label quality scores.
`Aggregator`(method, **kwargs)	Helper class for aggregating the label quality scores for each class into a single score for each datapoint.
`MultilabelScorer`([base_scorer, aggregator, ...])	Aggregates label quality scores across different classes to produce one score per example in multi-label classification tasks.

Functions:

`exponential_moving_average`(s, *[, alpha, axis])	Exponential moving average (EMA) score aggregation function.
`softmin`(s, *[, temperature, axis])	Softmin score aggregation function.
`get_label_quality_scores`(labels, pred_probs, *)	Computes a quality score for each label in a multi-label classification problem based on out-of-sample predicted probabilities.
`multilabel_py`(y)	Compute the prior probability of each label in a multi-label classification problem.
`get_cross_validated_multilabel_pred_probs`(X, ...)	Get predicted probabilities for a multi-label classifier via cross-validation.

class cleanlab.internal.multilabel_scorer.ClassLabelScorer(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

Enum for the different methods to compute label quality scores.

Attributes:

`SELF_CONFIDENCE`(*args, **kwargs)	Returns the self-confidence label-quality score for each datapoint.
`NORMALIZED_MARGIN`(*args, **kwargs)	Returns the "normalized margin" label-quality score for each datapoint.
`CONFIDENCE_WEIGHTED_ENTROPY`(*args, **kwargs)	Returns the "confidence weighted entropy" label-quality score for each datapoint.

Methods:

`__call__`(labels, pred_probs, **kwargs)	Returns the label-quality scores for each datapoint based on the given labels and predicted probabilities.
`from_str`(method)	Constructs an instance of the ClassLabelScorer enum based on the given method name.
`__contains__`(member)	Return True if member is a member of this enum; raises TypeError if member is not an enum member.
`__getitem__`(name)	Return the member matching name.
`__iter__`()	Return members in definition order.
`__len__`()	Return the number of members (no aliases)

SELF_CONFIDENCE(*args, **kwargs) = get_self_confidence_for_each_label

Returns the self-confidence label-quality score for each datapoint.

See also

cleanlab.rank.get_self_confidence_for_each_label

NORMALIZED_MARGIN(*args, **kwargs) = get_normalized_margin_for_each_label

Returns the “normalized margin” label-quality score for each datapoint.

See also

cleanlab.rank.get_normalized_margin_for_each_label

CONFIDENCE_WEIGHTED_ENTROPY(*args, **kwargs) = get_confidence_weighted_entropy_for_each_label

Returns the “confidence weighted entropy” label-quality score for each datapoint.

See also

cleanlab.rank.get_confidence_weighted_entropy_for_each_label

__call__(labels, pred_probs, **kwargs)

Returns the label-quality scores for each datapoint based on the given labels and predicted probabilities.

See the documentation for each method for more details.

Return type: ndarray

Example

>>> import numpy as np
>>> from cleanlab.internal.multilabel_scorer import ClassLabelScorer
>>> labels = np.array([0, 0, 0, 1, 1, 1])
>>> pred_probs = np.array([
...     [0.9, 0.1],
...     [0.8, 0.2],
...     [0.7, 0.3],
...     [0.2, 0.8],
...     [0.75, 0.25],
...     [0.1, 0.9],
... ])
>>> ClassLabelScorer.SELF_CONFIDENCE(labels, pred_probs)
array([0.9 , 0.8 , 0.7 , 0.8 , 0.25, 0.9 ])

classmethod from_str(method)

Constructs an instance of the ClassLabelScorer enum based on the given method name.

Parameters: method (str) – The name of the scoring method to use.
Return type: ClassLabelScorer
Returns: scorer – An instance of the ClassLabelScorer enum.
Raises: ValueError – If the given method name is not valid. It must be one of the following: “self_confidence”, “normalized_margin”, or “confidence_weighted_entropy”.

Example

>>> from cleanlab.internal.multilabel_scorer import ClassLabelScorer
>>> ClassLabelScorer.from_str("self_confidence")
<ClassLabelScorer.SELF_CONFIDENCE: get_self_confidence_for_each_label>

classmethod __contains__(member)

Return True if member is a member of this enum; raises TypeError if member is not an enum member.

Note: in Python 3.12, TypeError will no longer be raised, and True will also be returned if member is the value of a member in this enum.

classmethod __getitem__(name)

Return the member matching name.

classmethod __iter__()

Return members in definition order.

classmethod __len__()

Return the number of members (no aliases).

cleanlab.internal.multilabel_scorer.exponential_moving_average(s, *, alpha=None, axis=1, **_)

Exponential moving average (EMA) score aggregation function.

For a score vector s = (s_1, …, s_K) with K scores, the values are sorted in descending order and the exponential moving average is computed recursively over the sorted values. The returned score is the final value of that recursion, EMA_K, as defined in the note below.

Note

The recursive formula for the EMA at step $t = 2, ..., K$ is:

$$\text{EMA}_t = \alpha \cdot s_t + (1 - \alpha) \cdot \text{EMA}_{t-1}, \qquad 0 \leq \alpha \leq 1$$

We set $\text{EMA}_1 = s_1$ as the largest score in the sorted vector s.

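$\alpha$ is the “forgetting factor” that gives more weight to the most recent scores, and successively less weight to the previous scores. Because the scores are sorted in descending order, the most recent scores in the recursion are the lowest ones, so a larger $\alpha$ puts more emphasis on the worst per-class scores.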
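
Unrolling the recursion makes the weighting explicit. The closed form below is derived directly from the recursive formula in this note (it is not quoted from the library), using the same symbols:

$$\text{EMA}_K = (1 - \alpha)^{K-1} \cdot s_1 + \alpha \sum_{t=2}^{K} (1 - \alpha)^{K-t} \cdot s_t$$

For example, with $K = 3$ and $\alpha = 0.8$, the weights on the sorted scores $(s_1, s_2, s_3)$ are $(0.04, 0.16, 0.8)$, so the smallest score $s_3$ dominates the result.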

Parameters:

s (ndarray) – Scores to be transformed.
alpha (Optional[float]) – Discount factor that determines the weight of the previous EMA score. Higher alpha means that the previous EMA score has a lower weight while the current score has a higher weight. Its value must be in the interval [0, 1]. If alpha is None, it is set to 2 / (K + 1), where K is the number of scores.
axis (int) – Axis along which the scores are sorted.

Return type:

ndarray

Returns:

s_ema – Exponential moving average score.

Examples

>>> from cleanlab.internal.multilabel_scorer import exponential_moving_average
>>> import numpy as np
>>> s = np.array([[0.1, 0.2, 0.3]])
>>> exponential_moving_average(s, alpha=0.5)
array([0.175])
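
The recursion described in the note can be sketched in plain NumPy as follows. This is a minimal illustration of the documented formula, not the library's implementation; `ema_sketch` is a hypothetical name, and the handling of `axis` is an assumption made for this sketch.

```python
import numpy as np


def ema_sketch(s, alpha=None, axis=1):
    """Hypothetical re-implementation of the documented EMA recursion."""
    s = np.asarray(s, dtype=float)
    K = s.shape[axis]
    if alpha is None:
        # Documented default when alpha is not given.
        alpha = 2 / (K + 1)
    # Sort in descending order along the chosen axis, then move that axis last.
    s_sorted = np.moveaxis(-np.sort(-s, axis=axis), axis, -1)
    # EMA_1 is the largest score.
    ema = s_sorted[..., 0]
    # EMA_t = alpha * s_t + (1 - alpha) * EMA_{t-1} for t = 2, ..., K.
    for t in range(1, K):
        ema = alpha * s_sorted[..., t] + (1 - alpha) * ema
    return ema


single = ema_sketch(np.array([[0.1, 0.2, 0.3]]), alpha=0.5)
batch = ema_sketch(np.array([[0.9, 0.9, 0.3], [0.4, 0.9, 0.6]]), alpha=0.8)
```

With `alpha=0.5` the first call steps through 0.3, 0.25, 0.175, which matches the documented example output.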

cleanlab.internal.multilabel_scorer.softmin(s, *, temperature=0.1, axis=1, **_)

Softmin score aggregation function.

Parameters:

s (ndarray) – Input array.
temperature (float) – Temperature parameter. Too small values may cause numerical underflow and NaN scores.
axis (int) – Axis along which to apply the function.

Return type:

ndarray

Returns:

Softmin score.
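
This page does not state the softmin formula, so the sketch below shows one common formulation only: a weighted average of the scores in which lower scores receive exponentially larger weights. Both the weighting `exp((1 - s) / temperature)` and the name `softmin_sketch` are assumptions for illustration; the library's exact computation may differ.

```python
import numpy as np


def softmin_sketch(s, temperature=0.1, axis=1):
    """Hypothetical softmin: softmax-weighted average that favors low scores."""
    s = np.asarray(s, dtype=float)
    logits = (1.0 - s) / temperature
    # Subtract the maximum before exponentiating for numerical stability.
    logits = logits - logits.max(axis=axis, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=axis, keepdims=True)
    return (s * weights).sum(axis=axis)


scores = np.array([[0.9, 0.9, 0.9], [0.4, 0.9, 0.9]])
sharp = softmin_sketch(scores, temperature=0.1)     # close to the row minimum
smooth = softmin_sketch(scores, temperature=100.0)  # close to the row mean
```

Under this formulation a small temperature approaches the row minimum and a large temperature approaches the row mean.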

class cleanlab.internal.multilabel_scorer.Aggregator(method, **kwargs)

Bases: object

Helper class for aggregating the label quality scores for each class into a single score for each datapoint.

Parameters:

method (Union[str, Callable]) – The method used to aggregate the label quality scores for each class. If passed as a callable, your function should take in a 1D array of K scores and return a single aggregated score. See cleanlab.internal.multilabel_scorer.exponential_moving_average for an example of such a function. Alternatively, this can be a str value that specifies a built-in function; the possible values are the keys of the Aggregator’s possible_methods attribute.
kwargs – Additional keyword arguments to pass to the aggregation function when it is called.

Attributes:

possible_methods

Methods:

`__call__`(scores, **kwargs)	Returns the label quality scores for each datapoint based on the given label quality scores for each class.

possible_methods: Dict[str, Callable[..., ndarray]] = {'exponential_moving_average': <function exponential_moving_average>, 'softmin': <function softmin>}

__call__(scores, **kwargs)

Returns the label quality scores for each datapoint based on the given label quality scores for each class.

Parameters: scores (ndarray) – The label quality scores for each class.
Return type: ndarray
Returns: aggregated_scores – A single label quality score for each datapoint.
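
The sketch below illustrates the general pattern described for this class: resolving a string name to a function, storing keyword arguments at construction time, and merging them with call-time arguments. `AggregatorSketch` and `mean_score` are hypothetical names and are not part of cleanlab. Giving call-time arguments precedence and applying the function along `axis=1` are assumptions of this sketch.

```python
import numpy as np


def mean_score(s, *, axis=1, **_):
    """Hypothetical aggregation function: plain mean over the class axis."""
    return np.mean(s, axis=axis)


class AggregatorSketch:
    """Hypothetical stand-in that mimics the documented interface."""

    possible_methods = {"mean_score": mean_score}

    def __init__(self, method, **kwargs):
        if isinstance(method, str):
            # Resolve a built-in method by name.
            if method not in self.possible_methods:
                raise ValueError(f"Unknown aggregation method: {method!r}")
            method = self.possible_methods[method]
        self.method = method
        self.kwargs = kwargs

    def __call__(self, scores, **kwargs):
        # Assumption: call-time kwargs override constructor kwargs,
        # and aggregation always runs across classes (axis=1).
        merged = {**self.kwargs, **kwargs, "axis": 1}
        return self.method(np.asarray(scores), **merged)


scores = np.array([[0.9, 0.9, 0.3], [0.4, 0.9, 0.6]])
by_name = AggregatorSketch("mean_score")(scores)
by_callable = AggregatorSketch(np.min)(scores)
```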

class cleanlab.internal.multilabel_scorer.MultilabelScorer(base_scorer=ClassLabelScorer.SELF_CONFIDENCE, aggregator=Aggregator(method=exponential_moving_average, kwargs={'alpha': 0.8}), *, strict=True)

Bases: object

Aggregates label quality scores across different classes to produce one score per example in multi-label classification tasks.

Parameters:

base_scorer (ClassLabelScorer) –
The method to compute the label quality scores for each class.

See the documentation for the ClassLabelScorer enum for more details.
aggregator (Union[Aggregator, Callable]) –
The method to aggregate the label quality scores for each class into a single score for each datapoint.

Defaults to the EMA (exponential moving average) aggregator with forgetting factor alpha=0.8.

See the documentation for the Aggregator class for more details.

See also

exponential_moving_average
strict (bool) – Flag for performing strict validation of the input data.

Methods:

`__call__`(labels, pred_probs[, ...])	Computes a quality score for each label in a multi-label classification problem based on out-of-sample predicted probabilities.
`aggregate`(class_label_quality_scores, **kwargs)	Aggregates the label quality scores for each class into a single overall label quality score for each example.
`get_class_label_quality_scores`(labels, ...)	Computes separate label quality scores for each class.

__call__(labels, pred_probs, base_scorer_kwargs=None, **aggregator_kwargs)

Computes a quality score for each label in a multi-label classification problem based on out-of-sample predicted probabilities. For each example, the label quality scores for each class are aggregated into a single overall label quality score.

Parameters:

labels (ndarray) – A 2D array of shape (n_samples, n_labels) with binary labels.
pred_probs (ndarray) – A 2D array of shape (n_samples, n_labels) with predicted probabilities.
base_scorer_kwargs (Optional[dict]) – Keyword arguments to pass to the base_scorer.
aggregator_kwargs – Additional keyword arguments to pass to the aggregator.

Return type:

ndarray

Returns:

scores – A 1D array of shape (n_samples,) with the quality scores for each datapoint.

Examples

>>> from cleanlab.internal.multilabel_scorer import MultilabelScorer, ClassLabelScorer
>>> import numpy as np
>>> labels = np.array([[0, 1, 0], [1, 0, 1]])
>>> pred_probs = np.array([[0.1, 0.9, 0.1], [0.4, 0.1, 0.9]])
>>> scorer = MultilabelScorer()
>>> scores = scorer(labels, pred_probs)
>>> scores
array([0.9, 0.5])

>>> scorer = MultilabelScorer(
...     base_scorer = ClassLabelScorer.NORMALIZED_MARGIN,
...     aggregator = np.min,  # Use the "worst" label quality score for each example.
... )
>>> scores = scorer(labels, pred_probs)
>>> scores
array([0.9, 0.4])
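
To make the two stages concrete, the following NumPy-only sketch reproduces both example outputs by hand. It assumes each class is scored as an independent binary problem (self-confidence is the predicted probability of the given 0/1 label), which is consistent with the outputs shown above. The EMA is written as a weighted sum obtained by unrolling the recursion documented for `exponential_moving_average`. None of this is the library's code.

```python
import numpy as np

labels = np.array([[0, 1, 0], [1, 0, 1]])
pred_probs = np.array([[0.1, 0.9, 0.1], [0.4, 0.1, 0.9]])

# Stage 1: per-class self-confidence, i.e. the probability of the given label.
per_class = np.where(labels == 1, pred_probs, 1.0 - pred_probs)

# Stage 2a: EMA aggregation with alpha = 0.8 over scores sorted descending.
alpha = 0.8
K = per_class.shape[1]
sorted_desc = -np.sort(-per_class, axis=1)
weights = alpha * (1 - alpha) ** np.arange(K - 1, -1, -1)
weights[0] = (1 - alpha) ** (K - 1)  # EMA_1 = s_1 carries no alpha factor
ema_scores = sorted_desc @ weights

# Stage 2b: "worst label" aggregation, as with aggregator=np.min.
min_scores = per_class.min(axis=1)
```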

aggregate(class_label_quality_scores, **kwargs)

Aggregates the label quality scores for each class into a single overall label quality score for each example.

Parameters:

class_label_quality_scores (ndarray) –
A 2D array of shape (n_samples, n_labels) with the label quality scores for each class.

See also

get_class_label_quality_scores
kwargs – Additional keyword arguments to pass to the aggregator.

Return type:

ndarray

Returns:

scores – A 1D array of shape (n_samples,) with the quality scores for each datapoint.

Examples

>>> from cleanlab.internal.multilabel_scorer import MultilabelScorer
>>> import numpy as np
>>> class_label_quality_scores = np.array([[0.9, 0.9, 0.3],[0.4, 0.9, 0.6]])
>>> scorer = MultilabelScorer() # Use the default aggregator (exponential moving average) with default parameters.
>>> scores = scorer.aggregate(class_label_quality_scores)
>>> scores
array([0.42, 0.452])
>>> new_scores = scorer.aggregate(class_label_quality_scores, alpha=0.5) # Use the default aggregator with custom parameters.
>>> new_scores
array([0.6, 0.575])

Warning

Make sure that the keyword arguments correspond to the aggregation function used. For example, the exponential_moving_average function supports an alpha keyword argument, but np.min does not.
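
The mismatch described in this warning can be seen with NumPy alone. The snippet calls `np.min` directly rather than going through `MultilabelScorer`, so it only demonstrates that `np.min` rejects an `alpha` keyword; how the scorer itself reports such an error is not shown here.

```python
import numpy as np

scores = np.array([[0.9, 0.9, 0.3], [0.4, 0.9, 0.6]])

# np.min accepts axis, so a plain per-row minimum works.
worst = np.min(scores, axis=1)

# np.min has no alpha parameter, so passing one raises TypeError.
try:
    np.min(scores, axis=1, alpha=0.5)
    raised = False
except TypeError:
    raised = True
```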

get_class_label_quality_scores(labels, pred_probs, base_scorer_kwargs=None)

Computes separate label quality scores for each class.

Parameters:

labels (ndarray) – A 2D array of shape (n_samples, n_labels) with binary labels.
pred_probs (ndarray) – A 2D array of shape (n_samples, n_labels) with predicted probabilities.
base_scorer_kwargs (Optional[dict]) – Keyword arguments to pass to the base scoring-function.

Return type:

ndarray

Returns:

class_label_quality_scores – A 2D array of shape (n_samples, n_labels) with the quality scores for each label.

Examples

>>> from cleanlab.internal.multilabel_scorer import MultilabelScorer
>>> import numpy as np
>>> labels = np.array([[0, 1, 0], [1, 0, 1]])
>>> pred_probs = np.array([[0.1, 0.9, 0.7], [0.4, 0.1, 0.6]])
>>> scorer = MultilabelScorer() # Use the default base scorer (SELF_CONFIDENCE)
>>> class_label_quality_scores = scorer.get_class_label_quality_scores(labels, pred_probs)
>>> class_label_quality_scores
array([[0.9, 0.9, 0.3],
       [0.4, 0.9, 0.6]])
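
The example output above can be reproduced with a one-line NumPy expression, assuming the default SELF_CONFIDENCE scorer treats each class as an independent binary problem and returns the predicted probability of the given 0/1 label. This is a sketch of that assumption and does not cover the other scoring methods.

```python
import numpy as np

labels = np.array([[0, 1, 0], [1, 0, 1]])
pred_probs = np.array([[0.1, 0.9, 0.7], [0.4, 0.1, 0.6]])

# If the label is 1, take P(class present); if it is 0, take P(class absent).
per_class_scores = np.where(labels == 1, pred_probs, 1.0 - pred_probs)
```

Under this assumption, the third class of the first example scores 0.3 because the label says "absent" while the model assigns probability 0.7 to "present".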

cleanlab.internal.multilabel_scorer.get_label_quality_scores(labels, pred_probs, *, method=<cleanlab.internal.multilabel_scorer.MultilabelScorer object>, base_scorer_kwargs=None, **aggregator_kwargs)

Computes a quality score for each label in a multi-label classification problem based on out-of-sample predicted probabilities.

Parameters:

labels – A 2D array of shape (N, K) with binary labels.
pred_probs – A 2D array of shape (N, K) with predicted probabilities.
method (MultilabelScorer) – A scoring+aggregation method for computing the label quality scores of examples in a multi-label classification setting.
base_scorer_kwargs (Optional[dict]) – Keyword arguments to pass to the class-label scorer.
aggregator_kwargs – Additional keyword arguments to pass to the aggregator.

Return type:

ndarray

Returns:

scores – A 1D array of shape (N,) with the quality scores for each datapoint.

Examples

>>> import cleanlab.internal.multilabel_scorer as ml_scorer
>>> import numpy as np
>>> labels = np.array([[0, 1, 0], [1, 0, 1]])
>>> pred_probs = np.array([[0.1, 0.9, 0.1], [0.4, 0.1, 0.9]])
>>> scores = ml_scorer.get_label_quality_scores(labels, pred_probs, method=ml_scorer.MultilabelScorer())
>>> scores
array([0.9, 0.5])