Outlier Detection with Parametric and Non-Parametric Methods
Dealing with outliers is like searching for a needle in a haystack
by Jacob Joseph.
An outlier is an observation or point that is distant from other observations or points. But how would you quantify the distance of an observation from the others to qualify it as an outlier? Outliers are also described as observations whose probability of occurring is low. But, again, what constitutes low? Both parametric and non-parametric methods are used to identify outliers. Parametric methods assume some underlying distribution, such as the normal distribution, whereas non-parametric approaches have no such requirement. Additionally, you could do a univariate analysis, studying a single variable at a time, or a multivariate analysis, studying more than one variable at the same time, to identify outliers. The question arises: which approach and which analysis is the right one? Unfortunately, there is no single right…
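To make the parametric versus non-parametric distinction concrete, here is a minimal univariate sketch in Python. It is an illustration under stated assumptions, not the method prescribed by this article: the parametric side is a z-score rule (assuming roughly normal data, with a cutoff of 3 standard deviations), the non-parametric side is Tukey's 1.5 × IQR fences, and the data, function names, and thresholds are all made up for the example.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Parametric sketch: assumes roughly normal data and flags points
    whose distance from the mean exceeds `threshold` standard deviations."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation
    return [v for v in values if abs(v - mean) / std > threshold]


def iqr_outliers(values, k=1.5):
    """Non-parametric sketch: no distributional assumption; flags points
    lying more than k * IQR below Q1 or above Q3 (Tukey's fences)."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical sample: twenty ordinary readings and one suspicious value.
values = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11,
          12, 11, 10, 13, 12, 11, 12, 10, 13, 11, 95]

print(zscore_outliers(values))  # [95]
print(iqr_outliers(values))     # [95]
```

On this small sample both rules agree, but that is not guaranteed in general: the cutoffs (3 and 1.5) are conventions rather than laws, and an extreme value inflates the mean and standard deviation that the z-score rule itself relies on, so the two approaches can disagree on other data.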