CAP Theorem - Debunking Myths

The CAP theorem is a widely recognized idea in the field of distributed systems. It ties together three key concepts: Consistency, Availability, and Partition Tolerance. While most of us are familiar with its definition, the devil is in the details. In this discussion, we'll clear up common myths and misunderstandings. We'll start by explaining the CAP theorem in detail, and then explore various scenarios that challenge its common interpretation.

The CAP theorem, also known as Brewer's theorem, states that any distributed data store can provide only two of the following three guarantees:

  • Consistency: 
    • For every read request, the system should provide the most recent write or an error.
    • Note that this consistency is different from the consistency (the C) in ACID database transactions.
  • Availability:  
    • For every request, the system should provide a response, even if it’s not the latest data.
    • In other words, every non-failing (healthy) node in the distributed system returns a valid response (which might not be the most recent data) for any request, without exception.
  • Partition Tolerance:
    • It consists of two parts: "Partition" and "Tolerance".
    • Partition refers to a situation where the network is split due to a network failure, a software glitch, or a broken cable (hardware failures). The nodes (computers) in the network can no longer communicate with each other.
      • For example, in a cluster of 4 nodes, the network could be split into two groups. The nodes within each group can communicate, but they can't communicate with the nodes in the other group.
    • Partition Tolerance means that the system can still function even when such a network split happens.
      • More specifically, the system keeps running even if messages between nodes are lost or delayed by the network.
Based on the above definition, one can build three types of distributed systems: CA (consistent and available, but not tolerant of partitions), CP (consistent and tolerant of network partitions, but not available), and AP (available and tolerant of network partitions, but not consistent). 

To understand why a system can't have Consistency, Availability, and Partition Tolerance all at once, let's imagine a system that tries to achieve all three. Suppose we have a 4-node cluster, and a network partition occurs, splitting the cluster into two separate groups. The current value of a field (K1) is V1 on all four nodes.
  • A new write request arrives at the first group, updating K1 to a new value (V2). The nodes in the first group can still communicate with each other, so they all update K1 to V2.
  • However, due to the network partition, the second group is cut off from the first. Any read requests coming to the second group will still see the older value of K1 (V1).
  • This creates a problem with Consistency, as the system now has two different values for the same field. It’s also not eventually consistent because the two groups can't communicate to resolve the difference due to the network partition.
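To make this concrete, here is a minimal Python sketch that reproduces the divergence described above. The Node class and the group split are purely illustrative, not a real data store:

```python
# A minimal sketch (not a real distributed store) of the 4-node scenario:
# a partition splits the cluster into two groups, a write lands on group 1,
# and reads from group 2 keep returning the stale value.

class Node:
    def __init__(self, name):
        self.name = name
        self.data = {"K1": "V1"}  # every node starts with K1 = V1

    def write(self, key, value):
        self.data[key] = value

    def read(self, key):
        return self.data[key]

def replicate(group, key, value):
    """Propagate a write to every node reachable in the same group."""
    for node in group:
        node.write(key, value)

nodes = [Node(f"n{i}") for i in range(1, 5)]
group1, group2 = nodes[:2], nodes[2:]   # network partition: two isolated groups

replicate(group1, "K1", "V2")           # the write reaches only group 1

print([n.read("K1") for n in group1])   # ['V2', 'V2'] -- new value
print([n.read("K1") for n in group2])   # ['V1', 'V1'] -- stale value
```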
Therefore, the system cannot always achieve all three properties (Consistency, Availability, and Partition Tolerance) at the same time. The CAP theorem becomes relevant specifically in the presence of a network partition. In such cases, the system has to make a choice:
  1. The system can cancel the operation, which reduces availability but maintains consistency. For example, in the previous scenario, the system could decide that only one group of nodes remains active (accepting requests) and return errors for requests to the other group. This keeps the data consistent but reduces the system's availability because some requests are denied.
  2. The system can continue with the operation, maintaining availability but risking inconsistency, as different parts of the system might have different data values during the partition.
Now that we have sufficient background on the definition of the CAP theorem, let's probe its boundaries in a question-and-answer format:
  1. Does CAP mean you can only have 2 out of the 3 properties (Consistency, Availability, Partition Tolerance)?
    • This is a common misunderstanding. CAP doesn’t mean you can only choose two and abandon the third entirely. It means you have to choose between Consistency and Availability in the event of a network partition.
  2. If a system is partition-tolerant, does that mean partitions will never cause issues?
    • Being partition-tolerant means the system continues to operate during partitions, but it doesn’t eliminate the challenges of ensuring consistency or availability.
  3. Can you give an example of a system that is NOT partition-tolerant?
    • A system that is not partition tolerant cannot continue functioning properly if there is a disruption in communication between parts of the system (a network partition). 
    • An example could be a traditional centralized database running on a single server. This type of system is not partition tolerant because it relies entirely on a single node or a network without the ability to handle communication failures between distributed components.
    • Another example could be a system with multiple nodes (servers) that use synchronous replication to maintain strong consistency across all nodes. In this system:
      • Every write must be acknowledged by all nodes before the request is considered complete.
      • If even one node fails to acknowledge the update, the entire system waits, or the request fails. Such a system halts during a network partition and thus is not partition-tolerant; the sketch below illustrates this.
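Here is a hedged sketch of such a synchronously replicated system. The Replica class and synchronous_write helper are hypothetical, but they show how a single unreachable node makes every write fail:

```python
# A sketch of the synchronous-replication system described above:
# a write succeeds only if *every* replica acknowledges it, so a single
# node cut off by a partition makes the whole system reject writes.

class Replica:
    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.data = {}

    def ack_write(self, key, value):
        if not self.reachable:          # simulates a node isolated by a partition
            raise ConnectionError(f"{self.name} is unreachable")
        self.data[key] = value
        return True

def synchronous_write(replicas, key, value):
    # All-or-nothing: every replica must acknowledge before we return success.
    # (A real system would also have to roll back or block the partial write.)
    for r in replicas:
        r.ack_write(key, value)         # raises if any replica is partitioned away
    return "committed"

replicas = [Replica("n1"), Replica("n2"), Replica("n3", reachable=False)]
try:
    synchronous_write(replicas, "K1", "V2")
except ConnectionError as e:
    print(f"write failed, system is unavailable: {e}")
```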
  4. How do we make the above multi-node system partition tolerant?
    • To make a multi-node system partition tolerant, the system must be able to continue operating even when some nodes are isolated due to network issues. In the context of the CAP theorem, achieving partition tolerance typically involves trading off either consistency or availability.
    • These are some of the common strategies to make the system partition tolerant:
      1. Instead of strong consistency, the system can adopt an eventual consistency model. This allows each node to make independent decisions during a partition, and the system will reconcile the data when the partition is resolved.
        • The system can handle network partitions and continue operating independently in each partitioned group.
        • Each node can continue serving both read and write requests during a partition, ensuring the system remains available.
        • Clients may read outdated or stale data until the partition is resolved and the nodes synchronize.
      2. Using asynchronous replication, the primary node can immediately accept and process a write request without waiting for acknowledgment from other nodes. The write is propagated to the other nodes in the background when the network becomes available again (see the first sketch after this list).
        • The system operates even when nodes can’t communicate with each other.
        • The system can continue accepting writes (preferred Availability) and serving requests, even if some nodes are isolated due to a partition.
        • The system might have inconsistent data during the partition.
      3. Another approach is a quorum-based system, which requires responses from a majority (a quorum) of nodes before a write or read operation succeeds (see the second sketch after this list).
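To illustrate strategy 2, here is a minimal sketch of a primary that acknowledges writes immediately and drains a replication backlog once the partition heals. The Primary class and its pending queue are invented for illustration:

```python
# A sketch of asynchronous replication: the primary acknowledges the write
# immediately and queues replication for later, so writes keep succeeding
# during a partition at the cost of temporarily stale replicas.

from collections import deque

class Primary:
    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas
        self.pending = deque()          # writes not yet pushed to replicas

    def write(self, key, value):
        self.data[key] = value          # acknowledged immediately (Availability)
        self.pending.append((key, value))
        return "ok"

    def replicate_in_background(self):
        # Runs when the network is healthy again; drains the backlog.
        while self.pending:
            key, value = self.pending.popleft()
            for r in self.replicas:
                r[key] = value

replica = {}
primary = Primary([replica])

primary.write("K1", "V2")               # succeeds even while replicas are unreachable
print(replica.get("K1"))                # None -- replica is stale during the partition
primary.replicate_in_background()       # partition heals, backlog drains
print(replica.get("K1"))                # 'V2' -- eventually consistent
```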
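And a sketch of strategy 3. The quorum_write helper below is hypothetical, but it captures the core rule that an operation succeeds only when a majority of nodes is reachable, so the minority side of a partition must reject requests (the same trade-off discussed in question 11 below):

```python
# A sketch of a quorum-based write: it succeeds only if a majority of the
# cluster responds. The minority side of a partition cannot assemble a
# quorum, so it refuses requests (Consistency preserved, Availability lost).

def quorum_write(reachable_nodes, cluster_size, key, value):
    majority = cluster_size // 2 + 1
    if len(reachable_nodes) < majority:
        raise RuntimeError(
            f"only {len(reachable_nodes)}/{cluster_size} nodes reachable; "
            f"need {majority} for a quorum"
        )
    for node in reachable_nodes:
        node[key] = value
    return "committed"

cluster_size = 5
majority_side = [{}, {}, {}]            # 3 nodes: has a quorum, keeps serving
minority_side = [{}, {}]                # 2 nodes: no quorum, must reject writes

print(quorum_write(majority_side, cluster_size, "K1", "V2"))   # 'committed'
try:
    quorum_write(minority_side, cluster_size, "K1", "V2")
except RuntimeError as e:
    print(f"rejected: {e}")
```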
  5. Is Partition Tolerance optional?
    • No, partition tolerance is a necessity in distributed systems. This means that the system must handle network partitions, which are common in real-world scenarios. The question becomes whether the system should prioritize consistency or availability during a partition.
  6. If Partition Tolerance is not optional, CA systems should also be “not tolerant of network partitions”. What is a CA system then?
    • Yes, that's right: CA systems are not tolerant of network partitions. In practice, this means they lose availability when a partition occurs, so CP and CA are essentially identical. In reality, there are only two types of systems: CP/CA and AP. That is, when a partition happens, the system gives up either availability or consistency. Having three letters in CAP and saying you can pick any two does nothing but confuse (at least, it confused me).
  7. If a system is 99.999% available, is it effectively "highly available" according to CAP?
    • CAP’s definition of availability is strict: every request to a non-failing node must get a response. So even a system with 99.999% uptime might not qualify as "highly available" in CAP terms if it fails to respond during partitions. CAP’s availability is all-or-nothing in the face of a network partition.
  8. The definition of Availability is "Every request received by a non-failing node in the system must result in a response". Does that mean that even if the system responds with an error like Timeout Error or Service Unavailable, the system can be called Available?
    • No, that is not the case; such a system is considered not Available. Wikipedia says, and I quote: "When choosing consistency over availability, the system will return an error or a time out if particular information cannot be guaranteed to be up to date due to network partitioning". This means that during a network partition, the system's availability is compromised by responding with an error.
  9. Can a system be eventually consistent and still satisfy the CAP theorem’s consistency requirement?
    • No, the CAP theorem’s "Consistency" refers to strong consistency, meaning all nodes see the same data at the same time. Eventual consistency doesn’t meet CAP’s strict consistency condition but is often sufficient for practical purposes, causing confusion about whether it still "breaks" the CAP theorem.
  10. Does relaxing consistency always result in eventual consistency?
    • Not necessarily. Relaxing consistency might lead to eventual consistency, but it could also result in inconsistent data that never converges without extra mechanisms like conflict-resolution strategies, automatic synchronization, versioning, or timestamping (a small sketch follows below).
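As a concrete example of one such mechanism, here is a minimal sketch of last-write-wins conflict resolution using timestamps. The last_write_wins helper is illustrative; real systems may use vector clocks or more elaborate versioning:

```python
# A sketch of last-write-wins (LWW) reconciliation: without some rule like
# this, replicas that diverged during a partition would never agree on a
# single value.

def last_write_wins(*versions):
    """Each version is a (timestamp, value) pair recorded by one replica."""
    return max(versions, key=lambda v: v[0])

# Two replicas accepted conflicting writes while partitioned:
replica_a = (1700000005, "V2")          # written at t=...05
replica_b = (1700000009, "V3")          # written at t=...09 (later)

timestamp, value = last_write_wins(replica_a, replica_b)
print(value)                            # 'V3' -- both replicas converge on this
```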
  11. Is it possible to achieve strong consistency during a network partition with a majority quorum?
    • Systems using quorums (like Cassandra) try to balance consistency and availability by reading and writing from a majority of nodes, but they still face trade-offs during network partitions.
    • Using a majority quorum approach (like in Paxos or Raft), you might think consistency can be maintained during partitions. However, this comes at the cost of availability—nodes that don’t have a majority cannot serve requests. So, this method doesn’t violate CAP but reinforces the trade-off.
  12. Can CAP be applied to microservices, or is it just for distributed databases?
    • CAP is most commonly applied to distributed databases, but its principles also apply to microservices architectures: any set of services that communicate over a network can experience partitions and must choose between consistency and availability when they do.
  13. What happens if a system ignores partition tolerance—does CAP still apply?
    • If a system ignores partition tolerance, the CAP theorem still technically applies, but the system is at risk of failure or significant issues when a network partition (communication failure) occurs. Ignoring partition tolerance may work in environments with highly reliable networks, but real-world distributed systems are prone to partition events. 
    • When Can Partition Tolerance Be Ignored?
      • If a system operates entirely within one data center or on a single node, the risk of network partitions is extremely low. Communication between components is reliable and fast. Here, you can prioritize Consistency and Availability without worrying about network partitions. Example: Traditional relational databases (like MySQL or PostgreSQL) running on a single server don’t need to worry about partition tolerance because there’s no network partition in play.
      • In environments where the network is exceptionally reliable and monitored, the chance of partitions is near zero. These could be systems in highly controlled settings, like military or financial institutions with dedicated, high-reliability infrastructure. Google Spanner also leverages this: Google controls its entire network and can thus ensure redundancy of hardware and network paths.
      • Some applications are tightly coupled and deployed in environments where communication is direct and within the same system (e.g., on the same machine or network switch). For such systems, partitions are practically impossible, allowing partition tolerance to be ignored. Example: A local file system or memory-based data cache (like Redis on a single server) doesn't have to account for partition tolerance, since all operations occur in a single environment.
