outlier
Helper functions used internally for outlier detection tasks.
Functions:
transform_distances_to_scores(distances, k, t) – Returns an outlier score for each example based on its average distance to its k nearest neighbors.
- cleanlab.internal.outlier.transform_distances_to_scores(distances, k, t)
Returns an outlier score for each example based on its average distance to its k nearest neighbors.
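The transformation of a distance, d (here, the average distance to the k nearest neighbors), to a score, o, is based on the following formula:
$$o = \exp(-d \cdot t)$$
where t scales the distance to a score in the range [0,1]. An average distance of 0 gives a score of 1, and larger distances give scores closer to 0.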
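The formula can be checked with a few lines of NumPy. The following is a minimal sketch, not the library's implementation: the helper name `distances_to_scores_sketch` is hypothetical, and the sketch assumes the score is the exponential of the negated, scaled average distance, which matches the example output documented on this page.

```python
import numpy as np


def distances_to_scores_sketch(distances: np.ndarray, k: int, t: float) -> np.ndarray:
    """Hypothetical re-implementation of the documented formula, for illustration only."""
    # Average distance to the k nearest neighbors (first k columns of each row).
    avg_knn_distances = distances[:, :k].mean(axis=1)
    # o = exp(-d * t): a zero distance maps to 1, large distances approach 0.
    return np.exp(-avg_knn_distances * t)


distances = np.array([[0.0, 0.1, 0.25],
                      [0.15, 0.2, 0.3]])
scores = distances_to_scores_sketch(distances, k=2, t=1)
print(scores)
```

With k=2 the row averages are 0.05 and 0.175, so the scores are exp(-0.05) and exp(-0.175).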
- Parameters:
distances (np.ndarray) – An array of distances of shape (N, num_neighbors), where N is the number of examples. Each row contains the distances to that example’s num_neighbors nearest neighbors. It is assumed that each row is sorted in ascending order.
k (int) – Number of neighbors used to compute the average distance for each example. The second dimension of distances is assumed to be k or greater. Because the first k columns are selected by slicing, a smaller second dimension does not raise an indexing error.
t (int) – Controls the transformation of distances between examples into similarity scores that lie in [0,1].
- Return type:
ndarray
- Returns:
ood_features_scores (np.ndarray) – An array of outlier scores of shape (N,) for N examples.
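To illustrate the expected input, here is a hedged sketch of how a row-sorted distances array could be built from raw feature vectors with plain NumPy, and how the k and t parameters behave. The variable names and the random features are hypothetical. Excluding each example's distance to itself is an assumption of this sketch; the documentation above does not say whether the self-distance belongs in the array.

```python
import numpy as np

rng = np.random.default_rng(0)
features = rng.normal(size=(5, 3))  # hypothetical feature vectors for 5 examples

# Pairwise Euclidean distances between all examples, shape (5, 5).
diffs = features[:, None, :] - features[None, :, :]
pairwise = np.sqrt((diffs ** 2).sum(axis=-1))

# Assumption: drop each example's zero distance to itself before sorting.
np.fill_diagonal(pairwise, np.inf)

# Sort each row in ascending order and keep the nearest neighbors.
num_neighbors = 3
distances = np.sort(pairwise, axis=1)[:, :num_neighbors]  # shape (5, 3)

# Slicing with k larger than the number of columns does not raise;
# it simply returns every available column.
k = 10
avg_all = distances[:, :k].mean(axis=1)

# A larger t pushes scores toward 0 for the same distances.
scores_t1 = np.exp(-avg_all * 1)
scores_t10 = np.exp(-avg_all * 10)
```

In practice a nearest-neighbor index would usually replace the brute-force pairwise computation, which needs memory quadratic in the number of examples.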
Examples
>>> import numpy as np
>>> from cleanlab.internal.outlier import transform_distances_to_scores
>>> distances = np.array([[0.0, 0.1, 0.25],
...                       [0.15, 0.2, 0.3]])
>>> transform_distances_to_scores(distances, k=2, t=1)
array([0.95122942, 0.83945702])