Idempotency

The property of an operation that produces the same stable, non-duplicated result no matter how many times it is run on the same input.

Idempotency is essential for reliable reruns in data pipelines. When a job fails and is triggered again, the rerun must not create duplicate records or corrupt existing data. This is especially critical in distributed systems, batch backfills, and stream redelivery, where the same input can be processed more than once. Non-idempotent data flows often cause silent but destructive data quality problems, because duplicated rows typically do not raise an error.
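
As a rough illustration of the rerun scenario, the sketch below loads the same batch twice into an in-memory SQLite database. The table name `orders`, the key `order_id`, and the function `load_batch` are hypothetical and chosen only for this example. It assumes each record carries a stable unique key, and it uses SQLite's `INSERT OR REPLACE` as one possible way to make a load idempotent; other stores offer comparable upsert or merge operations.

```python
import sqlite3


def load_batch(conn, rows):
    """Load rows keyed on order_id (hypothetical schema).

    INSERT OR REPLACE overwrites an existing row with the same key,
    so running the load again does not add duplicates.
    """
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "INSERT OR REPLACE INTO orders (order_id, amount) VALUES (?, ?)",
            rows,
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id TEXT PRIMARY KEY, amount REAL)")

batch = [("o-1", 10.0), ("o-2", 25.5)]

load_batch(conn, batch)  # first run
load_batch(conn, batch)  # simulated rerun after a failure

row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
print(row_count, total)
```

After both runs the table still holds one row per key. A plain `INSERT` into a table without a unique key would instead append the batch a second time, which is the kind of silent duplication described above.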