Target Encoding

An advanced feature engineering technique that converts categorical levels into numerical representations using target-related summary statistics.

Target encoding is often used to avoid the dimensional explosion that one-hot encoding creates for high-cardinality categorical variables. Each category is replaced by a numerical value derived from a summary statistic of the target, most commonly the mean target value observed for that category. This can produce a strong signal, but if a row's encoding is computed from data that includes its own target value, the feature leaks the target and validation scores become overly optimistic. For that reason, safe encoding usually relies on cross-validation-based strategies, in which each row is encoded using statistics computed from folds that exclude it. When done properly, it can improve both predictive performance and efficiency compared with one-hot encoding.
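The cross-validation idea can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names, the interleaved fold assignment, the additive smoothing toward the global mean, and the toy data are all assumptions made for the example.

```python
from collections import defaultdict


def fit_target_means(categories, targets, smoothing=1.0):
    """Return (smoothed mean per category, global mean) for the given rows."""
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    # Shrink each category mean toward the global mean (assumed smoothing
    # scheme); categories with few rows are pulled toward it the most.
    means = {
        cat: (sums[cat] + smoothing * global_mean) / (counts[cat] + smoothing)
        for cat in counts
    }
    return means, global_mean


def out_of_fold_target_encode(categories, targets, n_folds=5, smoothing=1.0):
    """Encode each row using statistics from the folds that exclude it."""
    n = len(categories)
    if not 2 <= n_folds <= n:
        raise ValueError("n_folds must be between 2 and the number of rows")
    encoded = [0.0] * n
    for fold in range(n_folds):
        # Interleaved folds (an assumption): row i belongs to fold i % n_folds.
        train_idx = [i for i in range(n) if i % n_folds != fold]
        held_idx = [i for i in range(n) if i % n_folds == fold]
        means, global_mean = fit_target_means(
            [categories[i] for i in train_idx],
            [targets[i] for i in train_idx],
            smoothing,
        )
        for i in held_idx:
            # Categories unseen in the other folds fall back to the global mean.
            encoded[i] = means.get(categories[i], global_mean)
    return encoded


# Toy data, invented for illustration only.
cities = ["A", "A", "A", "B", "B", "C"]
y = [1, 0, 1, 0, 0, 1]
encoded = out_of_fold_target_encode(cities, y, n_folds=3)
```

No row's encoded value depends on its own target, which is what prevents the leakage described above. For unseen data such as a test set, a common approach is to fit the category means once on the full training set and apply them directly.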