factory

The factory module provides a factory class for constructing concrete issue managers and a decorator for registering new issue managers.

Use the register() decorator to add a new subclass of IssueManager to the registry. Each IssueManager detects one particular type of issue in a dataset.

Note

The factory class uses the REGISTRY variable to keep track of registered issue managers. The factory class itself is an implementation detail of Datalab, which provides a simplified API for constructing concrete issue managers. Datalab is the intended entry point for users, and its documentation describes in detail how to use the API.

Warning

Neither the REGISTRY variable nor the factory class should be used directly by users.

Data:

REGISTRY

Registry of issue managers that can be constructed from a string and used in the Datalab class.

Functions:

register(cls)

Registers an issue manager class in the registry.

cleanlab.datalab.internal.issue_manager_factory.REGISTRY: Dict[str, Type[IssueManager]]

Registry of issue managers that can be constructed from a string and used in the Datalab class.

Currently, the following issue managers are registered by default:

Warning

This variable should not be used directly by users.
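
To illustrate what "constructed from a string" means, here is a minimal standalone sketch of a string-keyed registry. It does not import cleanlab. The names `_DEMO_REGISTRY`, `DemoManager`, `DemoLabelManager`, `DemoOutlierManager`, and `build_manager` are hypothetical, and cleanlab's internal factory may be organized differently.

```python
from typing import Dict, Type


class DemoManager:
    """Hypothetical base class standing in for an issue manager."""

    issue_name: str = ""


class DemoLabelManager(DemoManager):
    issue_name = "label"


class DemoOutlierManager(DemoManager):
    issue_name = "outlier"


# Hypothetical registry: maps an issue name to the class that handles it.
_DEMO_REGISTRY: Dict[str, Type[DemoManager]] = {
    "label": DemoLabelManager,
    "outlier": DemoOutlierManager,
}


def build_manager(name: str) -> DemoManager:
    """Look up a class by its string key and construct an instance."""
    try:
        manager_cls = _DEMO_REGISTRY[name]
    except KeyError:
        raise ValueError(f"Unknown issue type: {name!r}") from None
    return manager_cls()


manager = build_manager("outlier")
print(type(manager).__name__)  # DemoOutlierManager
```

In this sketch the caller only needs to know the string key; the mapping from key to class stays in one place, which is why such a registry can remain an internal detail behind a higher-level API.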

cleanlab.datalab.internal.issue_manager_factory.register(cls)

Registers an issue manager class in the registry.

Parameters:

cls (Type[IssueManager]) – A subclass of IssueManager.

Return type:

Type[IssueManager]

Returns:

cls – The same class that was passed in.

Example

When defining a new subclass of IssueManager, you can register it like so:

from cleanlab.datalab.internal.issue_manager import IssueManager
from cleanlab.datalab.internal.issue_manager_factory import register

@register
class MyIssueManager(IssueManager):
    issue_name: str = "my_issue"
    def find_issues(self, **kwargs):
        # Some logic to find issues
        pass

Alternatively, register it with a plain function call:

from cleanlab.datalab.internal.issue_manager import IssueManager
from cleanlab.datalab.internal.issue_manager_factory import register

class MyIssueManager(IssueManager):
    issue_name: str = "my_issue"
    def find_issues(self, **kwargs):
        # Some logic to find issues
        pass

register(MyIssueManager)