Undersampling

An approach that reduces the number of majority-class examples to produce a more balanced class distribution.

By shrinking the majority class, undersampling keeps training from being dominated by majority-class examples, so errors on the minority class carry more weight. It can also lower computational cost and, in some cases, produce cleaner decision boundaries. The main risk is information loss: discarding majority-class examples may also discard patterns the model needs. For that reason, undersampling is ideally done with structure-aware, information-preserving strategies rather than purely at random. It is best seen as a fast but delicate balancing tool.
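
As a rough illustration of the difference between random and structure-aware undersampling, the sketch below contrasts two approaches using NumPy only. The function names `random_undersample` and `remove_tomek_links`, the toy data, and the choice of Tomek-link removal as the structure-aware example are all assumptions made here for illustration; the text above does not prescribe a specific method.

```python
import numpy as np

def random_undersample(X, y, majority_label, n_keep, seed=0):
    """Keep n_keep randomly chosen majority rows plus every other row."""
    rng = np.random.default_rng(seed)
    maj_idx = np.flatnonzero(y == majority_label)
    other_idx = np.flatnonzero(y != majority_label)
    kept_maj = rng.choice(maj_idx, size=n_keep, replace=False)
    keep = np.sort(np.concatenate([kept_maj, other_idx]))
    return X[keep], y[keep]

def remove_tomek_links(X, y, majority_label):
    """Drop majority rows that are mutual nearest neighbours of a row
    from another class (a Tomek link), i.e. points sitting on the boundary."""
    # Pairwise squared Euclidean distances; O(n^2) memory, so small data only.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist, np.inf)      # a point is not its own neighbour
    nn = dist.argmin(axis=1)            # index of each row's nearest neighbour
    mutual = nn[nn] == np.arange(len(y))
    is_link = mutual & (y != y[nn])
    drop = is_link & (y == majority_label)
    return X[~drop], y[~drop]

# Hypothetical toy data: six majority points (label 0), two minority (label 1).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
              [5.0, 5.0], [2.9, 3.0],      # last majority point hugs the minority
              [3.0, 3.0], [3.5, 3.5]])
y = np.array([0, 0, 0, 0, 0, 0, 1, 1])

X_rand, y_rand = random_undersample(X, y, majority_label=0, n_keep=2)
X_tomek, y_tomek = remove_tomek_links(X, y, majority_label=0)

print("random:", np.bincount(y_rand))   # class counts after random undersampling
print("tomek: ", np.bincount(y_tomek))  # class counts after Tomek-link removal
```

The random version reaches an exact target count but has no say in which majority points are lost. The Tomek-link version removes only majority points that sit right next to a minority point, so it preserves the bulk of the majority class but typically does not balance the classes by itself; in practice the two ideas are often combined.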