Stopword Filtering

A classical preprocessing technique that removes high-frequency words assumed to carry little semantic content.

Stopword filtering is often used to reduce dimensionality in sparse representations such as bag-of-words and TF-IDF, and in classical topic modeling. It is not automatically beneficial in modern NLP, however, because conjunctions, negations, and other function words can be semantically crucial in some tasks. Stopword usage should therefore be task-driven: blind filtering can cause serious information loss in domains such as sentiment analysis and legal text.
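The risk of blind filtering can be illustrated with a minimal Python sketch. The stopword list, the `filter_stopwords` helper, and the sample sentence are all assumptions made for illustration. They are not taken from any particular library, and real stopword lists are much longer.

```python
# Hypothetical, deliberately tiny stopword list (an assumption for illustration).
GENERIC_STOPWORDS = {"the", "is", "a", "this", "not", "and", "of", "was"}


def filter_stopwords(text, stopwords):
    """Lowercase the text, split on whitespace, and drop tokens in `stopwords`."""
    return [tok for tok in text.lower().split() if tok not in stopwords]


review = "This movie is not good"

# Blind filtering: the negation is removed along with the other frequent words.
blind = filter_stopwords(review, GENERIC_STOPWORDS)

# Task-driven filtering: for sentiment analysis, keep "not" by excluding it
# from the stopword set before filtering.
sentiment_stopwords = GENERIC_STOPWORDS - {"not"}
task_driven = filter_stopwords(review, sentiment_stopwords)

print(blind)        # ['movie', 'good']
print(task_driven)  # ['movie', 'not', 'good']
```

With the generic list, the negative review collapses to the same tokens as a positive one. Keeping the negation preserves the signal a sentiment model would need.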