Algorithms For Outlier Selection
Identifying and removing outliers is challenging with simple statistical methods for most machine learning datasets given the large number of input variables.
It is therefore necessary to look for robust methods that do not require a priori knowledge of the time series and do not depend on the number or nature of the outliers. Smearing means that one outlier makes a non-outlying observation appear to be an outlier; masking means that one outlier prevents another from being detected. This article addresses some problems in outlier detection and variable selection in linear regression models.
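Masking can be made concrete with a small sketch. The data, the thresholds (3.0 for the classical z-score, 3.5 for the modified z-score), and the function names below are illustrative assumptions, not taken from the article. The classical z-score uses the mean and standard deviation, both of which an extreme value inflates, so a second, milder outlier can slip through. A score built on the median and the median absolute deviation (MAD) is much less affected.

```python
from statistics import mean, median, stdev

def zscore_outliers(values, threshold=3.0):
    """Flag values whose classical z-score exceeds the threshold."""
    m, s = mean(values), stdev(values)
    return [x for x in values if abs(x - m) / s > threshold]

def modified_zscore_outliers(values, threshold=3.5):
    """Flag values using the median and MAD instead of mean and stdev."""
    med = median(values)
    mad = median([abs(x - med) for x in values])
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # for normally distributed data (a conventional constant).
    return [x for x in values if 0.6745 * abs(x - med) / mad > threshold]

inliers = [9, 10, 11, 10] * 5            # 20 well-behaved observations

one_outlier = inliers + [60]
two_outliers = inliers + [60, 1000]

# With a single outlier the z-score finds it.
found_alone = zscore_outliers(one_outlier)

# Adding 1000 inflates the mean and stdev so much that 60 is masked.
found_masked = zscore_outliers(two_outliers)

# The median/MAD version still reports both.
found_robust = modified_zscore_outliers(two_outliers)

print(found_alone, found_masked, found_robust)
```

The robust variant is only one possible remedy. It assumes the MAD is non-zero, which fails when more than half of the values are identical.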
Related work: various techniques have been proposed for outlier detection, and most of this work has relied on statistics. The presence of outliers in a classification or regression dataset can result in a poor fit and lower predictive modeling performance. Two recurring problems in outlier detection are smearing and masking.
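The effect of a single outlier on a regression fit can be illustrated with ordinary least squares on a toy dataset. The data and the helper name `fit_line` are assumptions made for this sketch. The clean points lie exactly on y = 2x + 1. Replacing one response with an extreme value pulls the fitted line away from every other point, so the uncontaminated observations acquire large residuals. That is also how smearing can arise in residual-based detection.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

xs = list(range(10))
clean_ys = [2 * x + 1 for x in xs]
dirty_ys = clean_ys[:-1] + [100]     # last observation replaced by an outlier

clean_slope, clean_intercept = fit_line(xs, clean_ys)
dirty_slope, dirty_intercept = fit_line(xs, dirty_ys)

# Residuals of the nine uncontaminated points under the contaminated fit.
clean_point_residuals = [
    y - (dirty_slope * x + dirty_intercept)
    for x, y in zip(xs[:-1], dirty_ys[:-1])
]
worst_clean_residual = max(abs(r) for r in clean_point_residuals)

print(f"clean fit:        slope={clean_slope:.2f}, intercept={clean_intercept:.2f}")
print(f"contaminated fit: slope={dirty_slope:.2f}, intercept={dirty_intercept:.2f}")
print(f"largest residual among clean points: {worst_clean_residual:.2f}")
```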
The algorithm is based on the idea that outliers are data points that are few and different. Researchers are still investigating the use of genetic algorithms (GAs) to identify outliers. Instead, automatic outlier detection methods can be used in the modeling pipeline and compared.
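The "few and different" idea can be sketched with random partitioning: a point that is unlike the rest tends to be separated from the others after only a few random splits. The text does not name the algorithm, so treating it as an isolation-style method is an assumption. The code below is a simplified illustration with hypothetical function names, not a full isolation forest: it has no subsampling and no normalized anomaly score.

```python
import random

def isolation_depth(point, data, rng, depth=0, max_depth=12):
    """Count the random splits needed to separate `point` from `data`."""
    if len(data) <= 1 or depth >= max_depth:
        return depth
    dim = rng.randrange(len(point))              # pick a random feature
    lo = min(row[dim] for row in data)
    hi = max(row[dim] for row in data)
    if lo == hi:                                 # cannot split on this feature
        return depth
    split = rng.uniform(lo, hi)                  # pick a random split value
    # Keep only the rows that fall on the same side of the split as `point`.
    if point[dim] < split:
        same_side = [row for row in data if row[dim] < split]
    else:
        same_side = [row for row in data if row[dim] >= split]
    return isolation_depth(point, same_side, rng, depth + 1, max_depth)

def average_depth(point, data, n_trees=100, seed=0):
    """Average isolation depth of `point` over many random partitionings."""
    rng = random.Random(seed)
    return sum(isolation_depth(point, data, rng) for _ in range(n_trees)) / n_trees

# Toy data: a tight two-dimensional cluster plus one distant point.
data_rng = random.Random(42)
inliers = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(100)]
outlier = (8.0, 8.0)
data = inliers + [outlier]

outlier_depth = average_depth(outlier, data)
inlier_depths = [average_depth(p, data) for p in inliers]
median_inlier_depth = sorted(inlier_depths)[len(inlier_depths) // 2]

print(f"outlier average depth: {outlier_depth:.2f}")
print(f"median inlier average depth: {median_inlier_depth:.2f}")
```

A shorter average depth indicates a point that is easier to isolate and therefore more likely to be an outlier. In practice a library implementation would normally be used inside the modeling pipeline rather than a hand-written version like this one.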
Optimization techniques are the basis for machine learning algorithms, and selecting an improper algorithm may produce misleading results.