factory

The factory module provides a factory class for constructing concrete issue managers and a decorator for registering new issue managers.

This module provides the register() decorator for users to register new subclasses of IssueManager in the registry. Each IssueManager detects some particular type of issue in a dataset.

Note

The factory class uses the REGISTRY variable to keep track of registered issue managers. The factory class is an implementation detail of Datalab, which provides a simplified API for constructing concrete issue managers. Users should work with Datalab, whose documentation describes in detail how to use the API.

Warning

Neither the REGISTRY variable nor the factory class should be used directly by users.

Data:

REGISTRY

Registry of issue managers that can be constructed from a task and issue type and used in the Datalab class.

Functions:

register(cls[, task])

Registers an issue manager class in the REGISTRY under the given task.

list_possible_issue_types(task)

Returns a list of all issue types registered for the given task.

list_default_issue_types(task)

Returns a list of the issue types that are run by default when find_issues() is called without specifying issue_types.

cleanlab.datalab.internal.issue_manager_factory.REGISTRY: Dict[Task, Dict[str, Type[IssueManager]]]

Registry of issue managers that can be constructed from a task and issue type and used in the Datalab class.

Currently, the following issue managers are registered by default for a given task:

Warning

This variable should not be used directly by users.
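The type annotation above describes a two-level mapping: task, then issue type name, then issue manager class. A minimal sketch of that shape follows. The `Task`, `IssueManager`, `MyIssueManager`, and `AnotherIssueManager` classes, the `lookup` helper, and the registry contents are hypothetical stand-ins for illustration, not the library's own definitions or defaults.

```python
from enum import Enum
from typing import Dict, Type


class Task(Enum):
    """Hypothetical stand-in for the library's Task enum."""

    CLASSIFICATION = "classification"
    REGRESSION = "regression"


class IssueManager:
    """Hypothetical stand-in for the IssueManager base class."""

    issue_name: str = ""


class MyIssueManager(IssueManager):
    issue_name = "my_issue"


class AnotherIssueManager(IssueManager):
    issue_name = "another_issue"


# Outer key: task. Inner key: issue type name. Value: issue manager class.
# The entries are made up for this sketch.
REGISTRY: Dict[Task, Dict[str, Type[IssueManager]]] = {
    Task.CLASSIFICATION: {
        "my_issue": MyIssueManager,
        "another_issue": AnotherIssueManager,
    },
    Task.REGRESSION: {"another_issue": AnotherIssueManager},
}


def lookup(task: Task, issue_type: str) -> Type[IssueManager]:
    """Return the class registered for (task, issue_type), or raise ValueError."""
    try:
        return REGISTRY[task][issue_type]
    except KeyError:
        raise ValueError(
            f"No issue manager registered for {issue_type!r} under task {task.value!r}"
        ) from None


manager_cls = lookup(Task.CLASSIFICATION, "my_issue")
print(manager_cls.__name__)
```

Keying the outer dictionary by task lets the same issue type name map to different classes, or be absent, depending on the task.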

cleanlab.datalab.internal.issue_manager_factory.register(cls, task='classification')

Registers an issue manager class in the REGISTRY under the given task.

Parameters:
  • cls (Type[IssueManager]) – A subclass of IssueManager.

  • task (str) – Specific machine learning task like classification or regression. See Task.from_str() for details on which string corresponds to which task type.

Return type:

Type[IssueManager]

Returns:

cls – The same class that was passed in.

Example

When defining a new subclass of IssueManager, you can register it like so:

from cleanlab.datalab.internal.issue_manager import IssueManager
from cleanlab.datalab.internal.issue_manager_factory import register

@register
class MyIssueManager(IssueManager):
    issue_name: str = "my_issue"
    def find_issues(self, **kwargs):
        # Some logic to find issues
        pass

or in a function call:

from cleanlab.datalab.internal.issue_manager import IssueManager
from cleanlab.datalab.internal.issue_manager_factory import register

class MyIssueManager(IssueManager):
    issue_name: str = "my_issue"
    def find_issues(self, **kwargs):
        # Some logic to find issues
        pass

register(MyIssueManager, task="classification")

cleanlab.datalab.internal.issue_manager_factory.list_possible_issue_types(task)

Returns a list of all issue types registered for the given task.

Any issue type that is not in this list cannot be used in the find_issues() method.

Return type:

List[str]

See also

REGISTRY : All available issue types and their corresponding issue managers can be found here.

cleanlab.datalab.internal.issue_manager_factory.list_default_issue_types(task)

Returns a list of the issue types that are run by default when find_issues() is called without specifying issue_types.

Parameters:
  • task – Specific machine learning task supported by Datalab.

Return type:

List[str]

See also

REGISTRY : All available issue types and their corresponding issue managers can be found here.
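The two listing functions can be read as two answers about one task: which issue types may be requested, and which ones run when nothing is requested. The sketch below illustrates that relationship only. The `POSSIBLE` and `DEFAULTS` tables, the issue names, and the `resolve_issue_types` helper are hypothetical, and passing `issue_types` as a plain list is a simplification that may not match what find_issues() accepts.

```python
from typing import Dict, List, Optional

# Hypothetical data: all registered issue types per task, and those run by default.
POSSIBLE: Dict[str, List[str]] = {"classification": ["issue_a", "issue_b", "issue_c"]}
DEFAULTS: Dict[str, List[str]] = {"classification": ["issue_a", "issue_b"]}


def resolve_issue_types(task: str, issue_types: Optional[List[str]] = None) -> List[str]:
    """Return the defaults if nothing is requested, else the validated request."""
    if issue_types is None:
        return list(DEFAULTS.get(task, []))
    possible = POSSIBLE.get(task, [])
    unknown = [name for name in issue_types if name not in possible]
    if unknown:
        # Issue types outside the registered list cannot be used.
        raise ValueError(f"Unregistered issue types for task {task!r}: {unknown}")
    return list(issue_types)


print(resolve_issue_types("classification"))
print(resolve_issue_types("classification", ["issue_c"]))
```

In this sketch a registered issue type that is not a default, such as `issue_c`, still runs when requested explicitly.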