summary
Methods to display images and their label issues in a semantic segmentation dataset, as well as summarize the overall types of issues identified.
Functions:
- display_issues: Display semantic segmentation label issues, showing images with problematic pixels highlighted.
- common_label_issues: Display how often each pair of labels is swapped in the dataset.
- filter_by_class: Return label issues involving a particular class.
- cleanlab.segmentation.summary.display_issues(issues, *, labels=None, pred_probs=None, class_names=None, exclude=None, top=None)
Display semantic segmentation label issues, showing images with problematic pixels highlighted.
It can also show the given and predicted masks for each image identified as having a label issue.
- Parameters:
issues (ndarray) – Boolean mask for the entire dataset, where True represents a pixel label issue and False represents a pixel that is accurately labeled. Same format as output by segmentation.filter.find_label_issues or segmentation.rank.issues_from_scores.
labels (Optional[ndarray]) – Optional discrete array of noisy labels for a semantic segmentation dataset, of shape (N, H, W), where each pixel must be an integer in 0, 1, …, K-1. If labels is provided, this function also displays the given label of each pixel identified as having an issue. Refer to the documentation for this argument in find_label_issues for more information.
pred_probs (Optional[ndarray]) – Optional array of shape (N, K, H, W) of model-predicted class probabilities. If pred_probs is provided, this function also displays the predicted label of each pixel identified as having an issue. Refer to the documentation for this argument in find_label_issues for more information.
Tip: If your labels are one-hot encoded, you can convert them with np.argmax(labels_one_hot, axis=1), assuming that labels_one_hot has shape (N, K, H, W), before passing them to this function.
class_names (Optional[List[str]]) – Optional list of strings, where each string is the name of a class in the semantic segmentation problem. The order of the names should correspond to the numerical order of the classes, and the list length should equal the number of unique classes present in the labels. If provided, this function generates a legend showing the color assigned to each class in the colormap.
Example: If there are three classes in your labels, represented by 0, 1, 2, then class_names might look like this:
class_names = ['background', 'person', 'dog']
top (Optional[int]) – Optional maximum number of issues to display. If not provided, a good default is used.
exclude (Optional[List[int]]) – Optional list of label classes to ignore in the errors; each element must be in 0, 1, …, K-1.
- Return type:
None
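The input shapes described above can be illustrated with a small sketch. It builds toy arrays with NumPy only and applies the one-hot conversion from the tip. The random data, the variable names, and the stand-in issues mask (pixels where the most likely predicted class differs from the given label) are assumptions made for illustration; in practice the mask would come from segmentation.filter.find_label_issues or segmentation.rank.issues_from_scores.

```python
import numpy as np

# Toy dataset (all values made up): N=2 images, K=3 classes, H=W=4 pixels.
N, K, H, W = 2, 3, 4, 4
rng = np.random.default_rng(0)

# Build one-hot labels of shape (N, K, H, W) from random integer labels.
true_int_labels = rng.integers(0, K, size=(N, H, W))
labels_one_hot = np.moveaxis(np.eye(K, dtype=int)[true_int_labels], -1, 1)

# The tip above: collapse the class axis to get integer labels of shape (N, H, W).
labels = np.argmax(labels_one_hot, axis=1)

# Predicted probabilities of shape (N, K, H, W), normalized over the class axis.
logits = rng.normal(size=(N, K, H, W))
pred_probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Stand-in issues mask (assumption, not cleanlab's method):
# flag pixels whose most likely predicted class differs from the given label.
issues = pred_probs.argmax(axis=1) != labels

# One name per class, in numerical class order.
class_names = ["background", "person", "dog"]

# With cleanlab installed, these arrays could then be passed as:
# from cleanlab.segmentation.summary import display_issues
# display_issues(issues, labels=labels, pred_probs=pred_probs,
#                class_names=class_names, top=2)
```

The call is left commented out so the sketch runs without cleanlab or a plotting backend.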
- cleanlab.segmentation.summary.common_label_issues(issues, labels, pred_probs, *, class_names=None, exclude=None, top=None, verbose=True)
Display how often each pair of labels is swapped in the dataset.
These may correspond to pixels that are ambiguous or systematically misunderstood by the data annotators.
N - Number of images in the dataset
K - Number of classes in the dataset
H - Height of each image
W - Width of each image
- Parameters:
issues (ndarray) – Boolean mask for the entire dataset, where True represents a pixel label issue and False represents a pixel that is accurately labeled. Same format as output by segmentation.filter.find_label_issues or segmentation.rank.issues_from_scores.
labels (ndarray) – A discrete array of noisy labels for a semantic segmentation dataset, of shape (N, H, W), where each pixel must be an integer in 0, 1, …, K-1. Refer to the documentation for this argument in find_label_issues for more information.
pred_probs (ndarray) – An array of shape (N, K, H, W) of model-predicted class probabilities. Refer to the documentation for this argument in find_label_issues for more information.
Tip: If your labels are one-hot encoded, you can convert them with np.argmax(labels_one_hot, axis=1), assuming that labels_one_hot has shape (N, K, H, W), before passing them to this function.
class_names (Optional[List[str]]) – Optional length-K list of class names, such that class_names[i] is the string name of the class corresponding to labels with value i. If class_names is provided, these string names are displayed for predicted and given labels; otherwise the integer class indices are displayed.
exclude (Optional[List[int]]) – Optional list of label classes to ignore in the errors; each element must be in 0, 1, …, K-1.
top (Optional[int]) – Optional maximum number of label swaps to print information for. If not provided, a good default is used.
verbose (bool) – Set to False to suppress all print statements.
- Return type:
DataFrame
- Returns:
issues_df – DataFrame with columns ['given_label', 'predicted_label', 'num_label_issues'], where each row contains information about a particular given/predicted label swap. Rows are ordered by the number of label issues inferred to exhibit this type of label swap.
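The tally described in the return value can be approximated with a short NumPy sketch. It is not cleanlab's implementation. The helper name count_label_swaps is hypothetical, and taking the predicted label as the argmax of pred_probs over the class axis is an assumption. The helper returns plain tuples rather than a DataFrame, to keep the example dependency-free.

```python
import numpy as np

def count_label_swaps(issues, labels, pred_probs):
    """Hypothetical helper: tally (given, predicted) class pairs over flagged pixels.

    Returns a list of (given_label, predicted_label, num_label_issues) tuples,
    ordered from most to least frequent.
    """
    predicted = pred_probs.argmax(axis=1)  # (N, H, W), assumed predicted label
    pairs = np.stack([labels[issues], predicted[issues]], axis=1)  # (n_flagged, 2)
    if pairs.shape[0] == 0:
        return []
    unique_pairs, counts = np.unique(pairs, axis=0, return_counts=True)
    order = np.argsort(-counts, kind="stable")  # most frequent swap first
    return [
        (int(given), int(pred), int(count))
        for (given, pred), count in zip(unique_pairs[order], counts[order])
    ]

# Toy data: N=1 image, K=2 classes, H=2, W=3 (values made up for illustration).
labels = np.array([[[0, 0, 1],
                    [1, 1, 0]]])
predicted = np.array([[[1, 1, 1],
                       [0, 1, 0]]])
pred_probs = np.moveaxis(np.eye(2)[predicted], -1, 1)  # shape (N, K, H, W)
issues = predicted != labels  # stand-in mask, not cleanlab's output

swaps = count_label_swaps(issues, labels, pred_probs)
print(swaps)  # [(0, 1, 2), (1, 0, 1)]
```

Here two flagged pixels are labeled 0 but predicted 1, and one is labeled 1 but predicted 0, so the 0-to-1 swap is listed first.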
- cleanlab.segmentation.summary.filter_by_class(class_index, issues, labels, pred_probs)
Return label issues involving a particular class. Note that this includes errors where the given label is the class of interest and the predicted label is any other class.
- Parameters:
class_index (int) – The specific class you are interested in.
issues (ndarray) – Boolean mask for the entire dataset, where True represents a pixel label issue and False represents a pixel that is accurately labeled. Same format as output by segmentation.filter.find_label_issues or segmentation.rank.issues_from_scores.
labels (ndarray) – A discrete array of noisy labels for a semantic segmentation dataset, of shape (N, H, W), where each pixel must be an integer in 0, 1, …, K-1. Refer to the documentation for this argument in find_label_issues for further details.
pred_probs (ndarray) – An array of shape (N, K, H, W) of model-predicted class probabilities. Refer to the documentation for this argument in find_label_issues for further details.
- Return type:
ndarray
- Returns:
issues_subset – Boolean mask for the subset of the dataset, where True represents a pixel label issue and False represents a pixel that is accurately labeled for the labeled class. The returned mask shows all instances that involve the particular class of interest.
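As a rough illustration of the filtering described above, the sketch below restricts an issues mask to pixels that involve one class. The helper name mask_for_class is hypothetical. Treating "involve" as "either the given label or the argmax-predicted label equals class_index" is an assumption, not a statement about how the library implements filter_by_class.

```python
import numpy as np

def mask_for_class(class_index, issues, labels, pred_probs):
    """Hypothetical sketch: keep flagged pixels whose given or predicted label is class_index."""
    predicted = pred_probs.argmax(axis=1)  # (N, H, W), assumed predicted label
    involves_class = (labels == class_index) | (predicted == class_index)
    return issues & involves_class

# Toy data: N=1 image, K=3 classes, H=W=2 (values made up for illustration).
labels = np.array([[[0, 1],
                    [2, 2]]])
predicted = np.array([[[1, 1],
                       [0, 2]]])
pred_probs = np.moveaxis(np.eye(3)[predicted], -1, 1)  # shape (N, K, H, W)
issues = np.array([[[True, False],
                    [True, False]]])  # stand-in mask, not cleanlab's output

subset_class_2 = mask_for_class(2, issues, labels, pred_probs)
print(subset_class_2)  # only the bottom-left pixel (given label 2) remains flagged
```

The result keeps the shape (N, H, W) of the input mask, so it can be used anywhere the original issues mask is accepted.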