Outlier

An observation or value that deviates noticeably from the general pattern of the dataset.

Outliers require special attention in both cleaning and modeling workflows. Not every outlier is an error: some represent rare but real events, while others reflect data entry, measurement, or integration mistakes. For that reason, detecting an outlier should not lead automatically to deletion. Business context, distributional shape, and model purpose must be evaluated together before deciding whether to keep, correct, transform, or remove a value. Handled poorly, outliers distort summary statistics such as the mean and variance, and they can pull model fits (least-squares regression, for example) toward a handful of extreme points.
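As a minimal sketch of "detect, then review" rather than "detect, then delete", the example below flags values outside the 1.5 × IQR fences (Tukey's rule) and compares the mean and median. The rule, the `flag_outliers_iqr` helper, the `k` parameter, and the `orders` data are all assumptions chosen for illustration; other detection rules may suit a given distribution better.

```python
import statistics


def flag_outliers_iqr(values, k=1.5):
    """Return one boolean per value: True if it lies outside the IQR fences.

    Hypothetical helper for illustration; k=1.5 is the conventional
    multiplier, not a universal threshold.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v < lower or v > upper for v in values]


# Made-up daily order counts. The value 98 could be a real spike
# (a promotion) or a data entry mistake; the code cannot tell which.
orders = [12, 13, 12, 14, 13, 15, 14, 13, 98]

flags = flag_outliers_iqr(orders)
flagged = [v for v, is_out in zip(orders, flags) if is_out]

# Flag for review instead of dropping the rows.
print("flagged for review:", flagged)

# The mean is pulled toward the extreme value; the median is not.
print("mean:  ", round(statistics.mean(orders), 2))
print("median:", statistics.median(orders))
```

The function only marks candidates. Whether a flagged value is kept, corrected, or removed remains a decision that depends on the business context and the purpose of the model.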