Technical Glossary: Data Science and Data Management
Canonicalization
The process of converting different representations of the same information into a single standard (canonical) form.
Canonicalization is used in data cleaning to enforce consistent representation. Common examples include converting dates to a single format, normalizing phone numbers, and unifying address abbreviations. Without this step, the same information can be stored as different values and be treated as separate entities by analytics systems. Canonicalization is especially valuable in record matching, key generation, and data integration workflows. Clean data often begins with a standardized representation.
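As a rough illustration of the three examples above, the following Python sketch maps several date, phone, and address spellings to one form each. The function names, the list of accepted date formats, the default country code "1", and the abbreviation table are all assumptions made for this example, not part of any standard. Real pipelines need locale-aware rules (for instance, "St" can mean "Saint" as well as "Street").

```python
import re
from datetime import datetime

# Assumed input formats for this sketch; extend to match the actual data.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d %b %Y", "%B %d, %Y")

# Assumed abbreviation table; deliberately tiny and English-only.
ABBREVIATIONS = {"st": "street", "ave": "avenue", "rd": "road"}


def canonical_date(text):
    """Return the date as ISO 8601 (YYYY-MM-DD), or None if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # try the next candidate format
    return None


def canonical_phone(text, default_country="1"):
    """Keep digits only and prefix a country code when 10 digits are given."""
    digits = re.sub(r"\D", "", text)
    if len(digits) == 10:  # assumption: 10 digits means no country code
        digits = default_country + digits
    return "+" + digits


def canonical_address(text):
    """Lowercase, drop periods and commas, and expand known abbreviations."""
    words = re.sub(r"[.,]", " ", text.lower()).split()
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


if __name__ == "__main__":
    print(canonical_date("03/07/2024"), canonical_date("March 7, 2024"))
    print(canonical_phone("(555) 123-4567"), canonical_phone("+1 555.123.4567"))
    print(canonical_address("12 Main St."), canonical_address("12 MAIN STREET"))
```

Because each pair of inputs maps to the same output string, the canonical values can be compared directly or used as join keys when matching records across sources.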
