Outlier Detection Clause Samples

Outlier Detection. The ME will screen all acceptance cores for outliers using a statistically valid procedure. If an outlier is detected, replace that core by taking an additional core at the same offset and within 5 feet of the original station. The following procedure applies only for a sample size of 5.
1. The ME will arrange the 5 core results in ascending order, in which X1 represents the smallest value and X5 represents the largest value.
2. If X5 is suspected of being an outlier, the ME will calculate: R = (X5 - X4) / (X5 - X1)
3. If X1 is suspected of being an outlier, the ME will calculate: R = (X2 - X1) / (X5 - X1)
4. If R > 0.642, the value is judged to be statistically significant and the core is excluded.
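The four-step procedure above can be sketched in code. This is an illustrative sketch only, not part of the clause text: the function name is hypothetical, and it assumes the ratio form of the test (a Dixon-style ratio with denominator X5 - X1, consistent with the dimensionless 0.642 threshold).

```python
def screen_outlier(cores, threshold=0.642):
    """Return the suspect core value if it is judged an outlier, else None."""
    if len(cores) != 5:
        raise ValueError("This procedure applies only to a sample size of 5.")
    x = sorted(cores)                # x[0] = X1 (smallest) ... x[4] = X5 (largest)
    spread = x[4] - x[0]             # X5 - X1
    if spread == 0:
        return None                  # all values identical; nothing to test
    r_high = (x[4] - x[3]) / spread  # X5 suspected: R = (X5 - X4) / (X5 - X1)
    r_low = (x[1] - x[0]) / spread   # X1 suspected: R = (X2 - X1) / (X5 - X1)
    if r_high > threshold and r_high >= r_low:
        return x[4]
    if r_low > threshold:
        return x[0]
    return None

# Example: the high core 96.0 is well separated from the other four,
# so R = (96.0 - 92.8) / (96.0 - 92.1) ≈ 0.82 > 0.642 and it is excluded.
print(screen_outlier([92.1, 92.4, 92.6, 92.8, 96.0]))
```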
Outlier Detection. Outlier detection (OD) is defined in [6] as the process of eliminating samples from the database that do not comply with the general behavior of the data model. Such samples, called outliers, differ significantly from the remaining records in the data set, and their existence can be caused either by measurement error or by inherent data variability. Outliers can be detected via non-algorithmic methods (for example, as seen in [14]); more recently, however, algorithmic methods have become the preferred means of detection. While OD methods can be classified following a taxonomy similar to the one used for FS methods, in the literature OD is first separated into supervised and unsupervised methods. Supervised OD treats outlier detection as a classification process: it requires a previous labeling of those instances that are known to be outliers, and then focuses on training a classifier to recognize further outliers. Examples of supervised methods, which focus on detecting anomalous, unexpected, or fraudulent behavior, include [15–19]. The most relevant difference between supervised OD and a plain supervised classification problem stems from the intrinsic class imbalance of the data in the former case. When referring to generic outlier detection, unsupervised OD, which is more common than supervised OD, is further separated into density-based methods (see, e.g., [20] and the many variants of the LOF algorithm) and distance-based methods (see, e.g., [21, 22]). In both cases, these methods treat the OD problem as a sort of clustering problem. In our terminology, all these methods can be classified as filters (unsupervised and supervised, univariate and multivariate), as the process of detecting outliers is not linked to any successive learning task or algorithm. Univariate, unsupervised filters for OD are also the most commonly available methods.
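To make the distance-based branch of the taxonomy concrete, here is a minimal sketch in the spirit of the methods cited as [21, 22]: a point is flagged as an outlier when fewer than min_neighbors other points lie within distance eps of it. The function name and both parameters are assumptions for illustration, not the algorithm of any specific reference.

```python
import math

def distance_based_outliers(points, eps, min_neighbors):
    """Flag points with fewer than min_neighbors other points within eps."""
    outliers = []
    for i, p in enumerate(points):
        # Count how many other points fall inside the eps-radius ball around p.
        neighbors = sum(
            1
            for j, q in enumerate(points)
            if i != j and math.dist(p, q) <= eps
        )
        if neighbors < min_neighbors:
            outliers.append(p)
    return outliers

# The isolated point (5, 5) has no neighbor within radius 1 and is flagged.
print(distance_based_outliers(
    [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1), (5, 5)], eps=1.0, min_neighbors=2))
```

Density-based methods such as LOF refine this idea by comparing each point's local density to that of its neighbors rather than using a single global radius.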
For example, the open-source suite Weka offers the filter InterquartileRange, which allows for the detection of outliers and extreme values by looking at the values of any attribute, under a normal distribution hypothesis. Embedded models for OD in regression have also been attempted: in [23], for example, multiple outliers are detected during the learning process, and in [24, 25] a genetic programming-based approach to outlier detection is presented. A comprehensive comparison among several OD methods can be found in [26]. Outlier detection methods can also be classified as filters, wrappers, and e...
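An interquartile-range screen of the kind the Weka filter performs can be sketched as follows. This is an approximation for illustration, not Weka's exact implementation: like Weka's defaults, it separates "outliers" (beyond 1.5 x IQR from the quartiles) from "extreme values" (beyond 3 x IQR), but the quartile computation and function name are assumptions.

```python
import statistics

def iqr_flags(values, outlier_factor=1.5, extreme_factor=3.0):
    """Flag each value beyond the IQR fences as 'outlier' or 'extreme'."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # sample quartiles
    iqr = q3 - q1
    flags = {}
    for v in values:
        if v < q1 - extreme_factor * iqr or v > q3 + extreme_factor * iqr:
            flags[v] = "extreme"
        elif v < q1 - outlier_factor * iqr or v > q3 + outlier_factor * iqr:
            flags[v] = "outlier"
    return flags

# The value 100 lies far beyond Q3 + 3*IQR and is flagged as extreme.
print(iqr_flags([10, 11, 12, 11, 10, 12, 11, 100]))
```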