One of the core tenets of distributed systems is that the network is not reliable. We all know this, and we have all experienced it, yet we tend to overlook it when we build software.

What does it actually mean?

Clients have poor network connectivity - so requests time out
A data packet was lost during transmission - so it must be re-transmitted
Servers are not able to connect to the database - so errors bubble up to the client
A consumer took too long to reply - so the broker decided to rebalance the consumer group
A message was not ack'ed in time - so it is re-delivered to a different consumer

Dealing with these kinds of failure scenarios must be top of mind whenever we build software that runs in a distributed fashion - and nowadays, that is the essence of the web.

To make sure our systems are resilient to failure we build guardrails:

if a request timed out - we retry the same request;
if a message was not ack'ed in a reasonable amount of time - the message is re-delivered;
and so on...

So, what does idempotence have to do with any of that, you might ask?

Simple - an operation is idempotent if performing it several times has the same effect as performing it once, so we can safely retry the action we wanted to take.

Some actions are inherently idempotent - e.g. deletion, the abs() function - while others have to be made idempotent to ensure the consistency of our system.
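
To make the distinction concrete, here is a minimal Python sketch. The `Account` class, its `set_balance` and `deposit` methods, and the `send_with_lost_reply` helper are hypothetical names made up for this illustration. The helper simulates a request that the server applies successfully but whose reply is lost, so the client sends it again.

```python
class Account:
    """Hypothetical in-memory account, used only for illustration."""

    def __init__(self, balance: int = 0) -> None:
        self.balance = balance

    def set_balance(self, amount: int) -> None:
        # Idempotent: repeating the call leaves the same final state.
        self.balance = amount

    def deposit(self, amount: int) -> None:
        # Not idempotent: every repetition changes the state again.
        self.balance += amount


def send_with_lost_reply(action, attempts: int = 2) -> None:
    """Simulate a client that retries because the reply never arrived.

    The action succeeds on every attempt; only the acknowledgement is
    lost, so the action ends up running `attempts` times.
    """
    for _ in range(attempts):
        action()


safe = Account(balance=100)
send_with_lost_reply(lambda: safe.set_balance(150))
print(safe.balance)  # 150: the retry changed nothing

unsafe = Account(balance=100)
send_with_lost_reply(lambda: unsafe.deposit(50))
print(unsafe.balance)  # 200: the retry deposited the money twice

print(abs(abs(-3)) == abs(-3))  # True: abs() is inherently idempotent
```

The retried `deposit` is the kind of action that has to be made idempotent before it is safe to retry.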

Examples

PUT and DELETE Requests

In RESTful APIs, the HTTP methods PUT and DELETE are designed to be idempotent. A PUT request updates a resource to a specified state, and multiple identical PUT requests result in the same state.

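Similarly, a DELETE request removes a resource, and repeating the request has no additional effect on the server's state, even if a later response differs (for example, 404 Not Found once the resource is already gone).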

Importance: Ensures that repeated requests due to network retries do not cause unintended changes or errors.

Implementation Notes:

Use resource identifiers in your URLs (e.g., /users/{id} rather than /users/create)
For PUT requests, ensure your update logic replaces the entire resource rather than applying incremental changes
Include version information or ETags to detect concurrent modifications
Return appropriate status codes (e.g., 200 OK for successful PUT, 204 No Content for DELETE)
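
The notes above can be sketched without any web framework. Everything here is an assumption made for illustration: the in-memory `users` dictionary stands in for a database, and `etag_for`, `put_user` and `delete_user` are hypothetical handler functions that return only a status code (plus an ETag for PUT). Returning 204 for a DELETE of a resource that is already gone is one possible design choice, not the only valid one.

```python
import hashlib
import json
from typing import Optional

# Hypothetical in-memory store standing in for a database: id -> resource.
users: dict[str, dict] = {}


def etag_for(resource: dict) -> str:
    """Derive a version tag from the resource content."""
    body = json.dumps(resource, sort_keys=True).encode("utf-8")
    return hashlib.sha256(body).hexdigest()


def put_user(
    user_id: str, body: dict, if_match: Optional[str] = None
) -> tuple[int, Optional[str]]:
    """Handle PUT /users/{id} by replacing the whole resource."""
    current = users.get(user_id)
    # Conditional update: reject a stale ETag, unless the stored resource
    # already equals the requested state (the request is probably a retry).
    if if_match is not None and current != body:
        if current is None or etag_for(current) != if_match:
            return 412, None  # Precondition Failed
    created = current is None
    users[user_id] = dict(body)  # full replacement, no incremental patching
    return (201 if created else 200), etag_for(body)


def delete_user(user_id: str) -> int:
    """Handle DELETE /users/{id}; deleting an absent user is not an error here."""
    users.pop(user_id, None)
    return 204  # No Content


status, tag = put_user("42", {"name": "Ada"})
retry_status, retry_tag = put_user("42", {"name": "Ada"})
print(status, retry_status, tag == retry_tag)  # 201 200 True
print(delete_user("42"), delete_user("42"))  # 204 204
```

Because PUT stores the complete resource, a retry ends in the same state as the first attempt, and the unchanged ETag confirms it.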

Configuration Management

Tools like Terraform and Ansible are designed to be idempotent. Applying the same configuration script multiple times should not change the state of the infrastructure after the first application.

Importance: Allows safe reapplication of configurations to ensure systems are in the desired state without causing disruptions.

Implementation Notes:

Use declarative configurations that specify the desired end state rather than imperative commands
Implement proper dependency tracking to ensure resources are created in the correct order
Always use conditional checks before taking actions (e.g., "create only if not exists")
Design your modules to safely handle partial failures and retries
Test your configurations by applying them multiple times to verify they don't cause unexpected changes
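
The "check, then act only if needed" idea can be illustrated with a small Python sketch. This is not how Terraform or Ansible are implemented. The `ensure_line` function and the `app.conf` file name are hypothetical, and the returned flag only mimics the spirit of a "changed" report.

```python
import tempfile
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Make sure `line` is present in the file at `path`.

    Returns True if the file had to be changed and False if it was
    already in the desired state.
    """
    text = path.read_text() if path.exists() else ""
    if line in text.splitlines():
        return False  # desired state already reached: do nothing
    path.parent.mkdir(parents=True, exist_ok=True)
    # Avoid gluing the new line onto an unterminated last line.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    path.write_text(text + separator + line + "\n")
    return True


with tempfile.TemporaryDirectory() as tmp:
    config = Path(tmp) / "app.conf"
    first = ensure_line(config, "max_connections=100")
    second = ensure_line(config, "max_connections=100")
    content = config.read_text()

print(first, second)  # True False
print(repr(content))  # 'max_connections=100\n'
```

Running the function a second time is exactly the "apply it multiple times" test from the notes: the second run reports no change and the file content stays the same.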

Message delivery

In distributed messaging systems, idempotent message processing means that even if a message is delivered more than once, its effects are applied only once.

Importance: Avoids duplicate processing which can lead to inconsistent data states or actions being triggered multiple times.

Implementation Notes:

Include a unique message ID with each message to identify duplicates
Implement a deduplication store (e.g., database table, Redis cache) to track processed message IDs
Use database transactions to atomically process messages and record their IDs
Design message handlers to be safe for repeated execution (e.g., set absolute values rather than incrementing counters directly)
Consider time-based expiration for your deduplication records based on your system's retry policies
For critical systems, implement a reconciliation process to detect and fix inconsistencies
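
A minimal sketch of the deduplication store, using Python's built-in `sqlite3` module. The table names `processed_messages` and `balances`, the `handle_message` function and the message shape are assumptions made for this example. The handler records the message ID and applies the change in a single transaction, so a redelivered message trips the primary-key constraint and is skipped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed_messages (message_id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE balances (account TEXT PRIMARY KEY, amount INTEGER NOT NULL)"
)
conn.execute("INSERT INTO balances VALUES ('alice', 0)")
conn.commit()


def handle_message(
    conn: sqlite3.Connection, message_id: str, account: str, delta: int
) -> bool:
    """Apply the message once; return False if it was a duplicate."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            # Recording the ID first makes a duplicate fail before any
            # business change is applied.
            conn.execute(
                "INSERT INTO processed_messages (message_id) VALUES (?)",
                (message_id,),
            )
            conn.execute(
                "UPDATE balances SET amount = amount + ? WHERE account = ?",
                (delta, account),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # message ID already recorded: skip the redelivery


print(handle_message(conn, "msg-1", "alice", 50))  # True
print(handle_message(conn, "msg-1", "alice", 50))  # False (redelivery)
balance = conn.execute(
    "SELECT amount FROM balances WHERE account = 'alice'"
).fetchone()[0]
print(balance)  # 50
```

The increment is only safe here because it is guarded by the ID check inside the same transaction. Expiring old rows from `processed_messages` is left out of this sketch.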

Conclusion

Understanding and implementing idempotency is crucial for building robust and reliable distributed systems. As we've seen, it's a concept that spans various aspects of software development, from API design to infrastructure management and message processing.

We encourage you to delve deeper into the principles of idempotency and consider the potential pitfalls of neglecting it. Think about the systems you build or interact with:

What happens if an operation is unintentionally repeated?
Could it lead to data corruption, incorrect financial transactions, or other undesirable side effects?
How can you design your components to gracefully handle retries and ensure consistent outcomes?

By proactively addressing idempotency, you can significantly enhance the resilience and predictability of your applications, ultimately leading to a better user experience and more stable systems.

Additional Resources

To further explore the concept of idempotency and its practical applications, consider the following resources:

Designing Data-Intensive Applications by Martin Kleppmann - An excellent book covering idempotency in the context of distributed systems
Idempotent Receiver Pattern - A pattern for handling message processing idempotently
Stripe API: Idempotent Requests - How Stripe implements idempotency keys in their payment API
AWS: Implementing reliable AWS Lambda retry handling - A practical guide on implementing idempotent Lambda functions
Idempotency with Kafka Transactions - How to implement exactly-once semantics in Kafka
