diff --git a/Outlier Detection/Readme.md b/Outlier Detection/Readme.md index 8b13789179..0752934033 100644 --- a/Outlier Detection/Readme.md +++ b/Outlier Detection/Readme.md @@ -1 +1,36 @@ +### Outliers in Data + + + +An outlier is a data point that significantly deviates from the majority of the data, either being much higher or lower. Outliers can result from measurement or execution errors and can significantly impact machine learning algorithms. Analyzing outliers is known as outlier analysis or outlier mining. + + + + +### Importance of Outlier Detection in Machine Learning + + + + +1. **Biased Models**: Outliers can skew a model towards these extreme values, leading to poor generalization. + +2. **Reduced Accuracy**: Outliers introduce noise, complicating the detection of true data patterns and reducing model accuracy. + +3. **Increased Variance**: Outliers increase model sensitivity to data changes, reducing stability and reliability. + +4. **Reduced Interpretability**: Outliers obscure model insights, making predictions less trustworthy and impeding performance improvements. + + + + +### Techniques for Outlier Detection + + + + +1. **Percentile Method**: Identifies outliers based on their position relative to a certain percentile range. + +2. **Z-Score Method**: Detects outliers by calculating how many standard deviations a data point is from the mean. + +3. **IQR Method**: Uses the interquartile range to identify outliers as data points lying outside 1.5 times the IQR above the third quartile or below the first quartile.