Instance Selection

Instance Selection. Instance selection (IS) is defined in [5] as the process of selecting or searching for a representative portion of the data that can fulfill a knowledge extraction task, such as classification or regression, as if the whole data set were used. IS reduces an original data set to a manageable volume, which in turn reduces the computational resources needed to complete the learning process. As with FS and OD [29], IS methods can be divided into filter, wrapper, and embedded methods. Filter methods use information measures or rules to evaluate instance subsets, such as weakness [30], clustering [31], weighting [32], relevance [33], and so on. Wrapper methods use a classifier to evaluate candidate instance subsets. Most IS wrapper methods have been proposed based on the k-NN classifier [34], and this type of instance selection is called prototype selection [35]. Prototype selection methods attempt to select a representative subset of samples from the training set, while prototype generation methods [36] generate a small set of artificial prototypes to replace the original training set. These methods are usually based on misclassification [37–39], although there are others based on associate sets [40] or on support vector machines [41]. As in FS, IS can be considered a search problem, and forward, backward, and mixed search strategies can be applied [42], in addition to metaheuristics such as evolutionary algorithms. Finally, IS methods can also be categorized according to the selection strategy [43] into condensation algorithms, which remove the instances far away from the decision surface; edition algorithms, which remove noisy instances to improve classification accuracy; and hybrid algorithms, which remove both central and border instances, so that a reduced set is obtained by merging border and core instances [44].
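The edition strategy above can be illustrated with Wilson's Edited Nearest Neighbours, a classic edition algorithm of the kind the taxonomy covers. The following minimal sketch (pure Python; the function names `knn_label` and `enn` and the toy data are ours, not from the cited works) discards every instance whose leave-one-out k-NN prediction disagrees with its own label:

```python
from collections import Counter
import math

def knn_label(x, data, labels, k):
    # majority label among the k nearest neighbours of x in data
    nearest = sorted(range(len(data)),
                     key=lambda i: math.dist(x, data[i]))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

def enn(data, labels, k=3):
    # Edited Nearest Neighbours: keep an instance only if its
    # leave-one-out k-NN prediction matches its own label
    keep = []
    for i, (x, y) in enumerate(zip(data, labels)):
        rest = [p for j, p in enumerate(data) if j != i]
        rest_y = [l for j, l in enumerate(labels) if j != i]
        if knn_label(x, rest, rest_y, k) == y:
            keep.append(i)
    return keep

# toy example: two clusters, with one mislabelled point (index 6)
# sitting inside cluster 'a'
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (0.5, 0.5)]
y = ['a', 'a', 'a', 'b', 'b', 'b', 'b']
print(enn(X, y, k=3))  # the noisy point (index 6) is removed
```

The O(n²) distance scan is for clarity only; practical edition algorithms use spatial indexes or approximate neighbours on large data sets.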
Instance selection is connected not only to classification tasks but also to regression tasks [45–47]. Overviews of IS methods can be found in [48–50]. As with FS and OD, IS tasks can also be combined freely with lag (spatial) selection, when necessary, as well as with nonlinear selection.
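The condensation strategy from the taxonomy above can likewise be sketched with Hart's Condensed Nearest Neighbour, a classic condensation algorithm: starting from a single stored instance, it absorbs every instance the current store misclassifies until a full pass adds nothing, keeping mainly border points. The helper names and toy data below are ours, for illustration only:

```python
import math

def nn_label(x, store, labels):
    # 1-NN prediction using the current condensed store
    i = min(range(len(store)), key=lambda j: math.dist(x, store[j]))
    return labels[i]

def cnn(data, labels):
    # Condensed Nearest Neighbour: grow the store with every
    # misclassified instance until a full pass changes nothing
    keep = [0]
    changed = True
    while changed:
        changed = False
        for i, (x, y) in enumerate(zip(data, labels)):
            if i in keep:
                continue
            store = [data[j] for j in keep]
            store_y = [labels[j] for j in keep]
            if nn_label(x, store, store_y) != y:
                keep.append(i)
                changed = True
    return sorted(keep)

# toy example: two well-separated clusters
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = ['a', 'a', 'a', 'b', 'b', 'b']
print(cnn(X, y))  # → [0, 3]
```

On this toy data one instance per cluster suffices to classify the rest correctly with 1-NN, which is exactly the data reduction effect the condensation family aims for; note that the result depends on the scan order of the instances.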