Data Lake

A storage layer that holds structured and unstructured data at scale in raw or lightly processed form.

A data lake stores incoming data from multiple sources without imposing a strict schema upfront, which makes it more flexible than storage that requires data to be modeled before it is loaded. It is particularly well suited to large volumes of raw data, logs, media, sensor streams, and document collections. However, a data lake is more than cheap storage: if it is poorly managed, it can quickly turn into a data swamp, where data exists but nobody can find or trust it. For that reason, discoverability, metadata, and quality controls must be integral parts of the architecture.
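
To make the role of metadata concrete, here is a minimal sketch of ingestion that stores an object unchanged and records a catalog entry alongside it. It assumes a local directory as a stand-in for object storage and a JSON Lines file as a stand-in for a real catalog service; the names `ingest_raw`, `find_by_source`, `catalog.jsonl`, and the `raw/<source>/<date>` layout are hypothetical and chosen only for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def ingest_raw(lake_root: Path, source: str, name: str, payload: bytes) -> dict:
    """Store the payload unchanged and append a metadata record to the catalog."""
    now = datetime.now(timezone.utc)
    # Hypothetical layout: partition by source and ingestion date.
    target_dir = lake_root / "raw" / source / now.strftime("%Y-%m-%d")
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / name
    target.write_bytes(payload)  # no schema is imposed at write time

    record = {
        "path": target.relative_to(lake_root).as_posix(),
        "source": source,
        "ingested_at": now.isoformat(),
        "size_bytes": len(payload),
        "sha256": hashlib.sha256(payload).hexdigest(),
    }
    # A very basic quality control: flag empty objects instead of hiding them.
    record["quality_flags"] = ["empty_payload"] if not payload else []

    # Assumed catalog format: one JSON record per line.
    catalog = lake_root / "catalog.jsonl"
    with catalog.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record


def find_by_source(lake_root: Path, source: str) -> list[dict]:
    """Look objects up through the catalog rather than by scanning directories."""
    catalog = lake_root / "catalog.jsonl"
    if not catalog.exists():
        return []
    with catalog.open(encoding="utf-8") as fh:
        records = [json.loads(line) for line in fh if line.strip()]
    return [r for r in records if r["source"] == source]
```

The design point is that every write produces a searchable record with its origin, time, size, and checksum. Without such a record, the object is still stored but becomes hard to discover or verify later, which is how a swamp forms.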