Similarities and Differences Between Out-of-Distribution (OOD) and Other Neighboring Problems
Most existing machine learning models are trained under the closed-world assumption, where the test data are assumed to be drawn i.i.d. from the same distribution as the training data, known as the in-distribution (ID). However, when models are deployed in an open-world scenario, test samples can be out-of-distribution (OOD) and therefore should be handled with caution.
There are two general types of distribution shift: covariate shift (e.g., OOD samples from a different domain) and semantic shift (e.g., OOD samples drawn from novel classes). Formally, let X be the input (sensory) space and Y be the label (semantic) space; a data distribution is then defined as a joint distribution P(X, Y) over X × Y. Distribution shift can occur in the marginal distribution P(X) alone, or in both P(Y) and P(X); note that a shift in P(Y) naturally induces a shift in P(X). Examples of covariate shift on P(X) include adversarial examples, domain shift, and style changes. Importantly, covariate shift is more commonly used to evaluate model generalization and robustness, since the label space Y remains the same at test time.
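To make the two shift types concrete, the following minimal numpy sketch constructs an ID set, a covariate-shifted set with the same label space, and a semantically shifted set from an unseen class. The Gaussian toy data are an illustrative assumption, not an example from the survey.

```python
import numpy as np

rng = np.random.default_rng(0)

# In-distribution (ID): two classes, Y = {0, 1}, as toy class-conditional Gaussians.
x_id = np.concatenate([rng.normal(0.0, 1.0, (100, 2)),   # class 0
                       rng.normal(4.0, 1.0, (100, 2))])  # class 1
y_id = np.array([0] * 100 + [1] * 100)

# Covariate shift: P(X) changes (here, a global offset applied to every input),
# but the label space Y = {0, 1} stays the same -- a generalization problem.
x_covariate = x_id + 2.0
y_covariate = y_id

# Semantic shift: samples come from a new class that is absent from Y,
# so any prediction in {0, 1} would be meaningless -- a detection problem.
x_semantic = rng.normal(-5.0, 1.0, (100, 2))
```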
In contrast, the detection of semantic distribution shift (e.g., due to the occurrence of new classes) is the focal point of OOD detection, where the label space Y can differ between ID and OOD data and hence the model should not make any prediction on OOD inputs. In addition to OOD detection, several problems adopt the "open-world" assumption and share the goal of identifying OOD examples. These include outlier detection (OD), anomaly detection (AD), novelty detection (ND), and open set recognition (OSR).
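One widely used baseline for this predict-or-abstain behavior is thresholding the maximum softmax probability (MSP). The sketch below illustrates the idea with numpy; the threshold value and toy logits are assumptions chosen for demonstration, not values from the survey. In practice the threshold is calibrated on held-out ID data, e.g., so that a fixed fraction (such as 95%) of ID samples is retained.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def predict_or_abstain(logits, threshold=0.9):
    """MSP baseline: predict the argmax class if the model is confident enough,
    otherwise abstain and flag the input as OOD. The threshold here is a
    hypothetical value for illustration; it is normally tuned on ID validation data."""
    probs = softmax(logits)
    msp = probs.max(axis=-1)
    preds = probs.argmax(axis=-1)
    return np.where(msp >= threshold, preds, -1)  # -1 means "abstain: likely OOD"

# Confident ID-like logits vs. near-uniform logits on a semantically shifted input.
logits = np.array([[8.0, 0.5, 0.2],    # confident -> predict class 0
                   [1.1, 1.0, 0.9]])   # near-uniform -> abstain (-1)
print(predict_or_abstain(logits))      # expected: [ 0 -1]
```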
The umbrella term generalized OOD detection encapsulates five related sub-topics: anomaly detection (AD), novelty detection (ND), open set recognition (OSR), out-of-distribution (OOD) detection, and outlier detection (OD). These sub-topics are similar in that they all define a certain in-distribution and share the common goal of detecting out-of-distribution samples under the open-world assumption. However, subtle differences exist among the sub-topics in terms of the specific definitions and properties of ID and OOD data, which are often overlooked by the research community.
Definition Anomaly detection (AD):
aims to detect any anomalous samples that deviate from a predefined notion of normality during testing. The deviation can happen due to either covariate shift or semantic shift, leading to two sub-tasks: sensory AD and semantic AD, respectively.
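As a minimal sketch of this definition (not the survey's prescribed method), one can fit an unsupervised detector on normal data only and flag deviations at test time. Here scikit-learn's IsolationForest serves as one common choice; the data and hyperparameters are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# "Normality" is defined entirely by the training data; no anomaly labels are used.
x_normal = rng.normal(0.0, 1.0, (500, 2))
detector = IsolationForest(random_state=0).fit(x_normal)

# Test samples that deviate from the predefined normality should be flagged (-1).
x_test = np.vstack([rng.normal(0.0, 1.0, (3, 2)),   # normal-looking
                    rng.normal(6.0, 1.0, (3, 2))])  # anomalous
print(detector.predict(x_test))  # expected: mostly [ 1  1  1 -1 -1 -1]
```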
Definition Novelty detection (ND):
aims to detect any test samples that do not fall into any training category. Based on the number of training classes, ND has two settings: 1) one-class novelty detection (one-class ND), where only one class exists in the training set; and 2) multi-class novelty detection (multi-class ND), where multiple classes exist in the training set. The goal of multi-class ND is only to distinguish novel samples from ID, and both one-class and multi-class ND are formulated as binary classification problems.
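The one-class ND setting can be sketched with a classic one-class classifier such as scikit-learn's OneClassSVM; the toy data and the nu hyperparameter below are illustrative assumptions, not values from the survey.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# One-class ND: the training set contains a single known category.
x_train = rng.normal(0.0, 1.0, (500, 2))
clf = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(x_train)

# Binary decision at test time: known class (+1) vs. novel (-1).
x_known = rng.normal(0.0, 1.0, (3, 2))   # drawn from the training category
x_novel = rng.normal(5.0, 1.0, (3, 2))   # a category unseen during training
print(clf.predict(x_known))  # expected: mostly +1
print(clf.predict(x_novel))  # expected: mostly -1
```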
Reference: Yang et al., "Generalized Out-of-Distribution Detection: A Survey"