
Event Driven Architecture - SAGA Pattern (Part-1 : Choreography Model)

The Saga pattern is a distributed transactional pattern used in microservices architecture to maintain data consistency across multiple services. It helps manage long-running transactions involving multiple services by breaking them down into smaller, more manageable work units.

The Database per Service pattern is widely used in microservice architectures: each service maintains its own dedicated database. Some business transactions, however, span multiple services, so we need a mechanism that keeps data consistent across them. Take, for instance, placing an online order, which involves verifying inventory and reserving the item until payment completes. Since services such as Orders, Inventory, and Payment operate on separate databases, the application cannot simply use a single local ACID transaction.

The Two-Phase Commit (2PC) protocol is one option for coordinating a transaction across services. However, it has several challenges:

  • Blocking: participants that have voted to commit hold their locks until the coordinator announces the outcome, so a participant that fails to respond during the prepare phase stalls all the others.
  • Single point of failure: if the coordinator fails, the entire transaction may block or fail.
  • Scalability: it does not scale well to a large number of participants.
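
To make these concerns concrete, here is a minimal in-process sketch of the two phases. The `Participant` class, its `prepare`/`commit`/`rollback` methods, and the `two_phase_commit` function are hypothetical names chosen for illustration, not a real library API; an actual coordinator would also need timeouts, retries, and a durable log.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical participant; `can_commit` simulates its vote in the prepare phase."""
    name: str
    can_commit: bool = True
    state: str = "idle"

    def prepare(self) -> bool:
        # Phase 1: vote yes or no. After a "yes" vote the participant must
        # hold its locks until the coordinator announces the outcome.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: list[Participant]) -> bool:
    # Phase 1 (prepare): the coordinator collects a vote from every participant.
    votes = [p.prepare() for p in participants]

    # Phase 2: commit only if every vote was "yes"; otherwise roll everyone back.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


services = [
    Participant("orders"),
    Participant("inventory"),
    Participant("payment", can_commit=False),  # simulated "no" vote
]
committed = two_phase_commit(services)
print(committed, [p.state for p in services])
```

Every participant waits on the single `two_phase_commit` call, which is why one slow participant or a crashed coordinator holds up the whole transaction.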

The Saga pattern manages such a transaction as a sequence of local transactions, one in each participating microservice:
  • Each microservice has its own database, enabling it to manage local transactions atomically with strong consistency.
  • The Saga pattern groups these local transactions and executes them sequentially. Each local transaction updates its own database and then publishes an event that triggers the next local transaction.
  • If one of the steps fails, the saga runs compensating transactions: a series of actions that undo the changes already committed by the preceding microservices, thereby restoring data consistency.
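
The bullets above can be sketched in a few lines. This is a simplified, in-process illustration only: `SagaStep` and `run_saga` are hypothetical names, and the steps are called directly instead of being triggered by events, so it shows the compensation logic and not a distributed implementation.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]        # the local transaction
    compensation: Callable[[], None]  # undoes the local transaction


def run_saga(steps: list[SagaStep]) -> bool:
    completed: list[SagaStep] = []
    for step in steps:
        try:
            step.action()
            completed.append(step)
        except Exception:
            # A step failed: compensate the completed steps in reverse order.
            for done in reversed(completed):
                done.compensation()
            return False
    return True


log: list[str] = []


def fail_payment() -> None:
    # Simulated failure of the last local transaction.
    raise RuntimeError("payment declined")


steps = [
    SagaStep("create_order",
             lambda: log.append("order created"),
             lambda: log.append("order cancelled")),
    SagaStep("reserve_stock",
             lambda: log.append("stock reserved"),
             lambda: log.append("stock released")),
    SagaStep("charge_payment",
             fail_payment,
             lambda: log.append("payment refunded")),
]

succeeded = run_saga(steps)
print(succeeded)
print(log)
```

Because the payment step fails before it completes, only the two earlier steps are compensated, in reverse order: the stock is released first, then the order is cancelled.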

There are two common Saga implementation models: Choreography and Orchestration. Each has its own challenges and its own technologies for coordinating the workflow.

1. Choreography Saga Model:

The Choreography Model coordinates sagas using the publish-subscribe principle, with no central coordinator. Each microservice executes its own local transaction and publishes events to a message broker, and those events in turn trigger local transactions in other microservices. The detailed workflow is outlined below:
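
As a rough illustration of the publish-subscribe principle, the sketch below uses a tiny in-memory broker. `InMemoryBroker` is a hypothetical stand-in for a real message broker, and the event fields (`order_id`, `item`, `quantity`) are assumed for the example; delivery here is synchronous, whereas a real broker delivers asynchronously.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class InMemoryBroker:
    """Hypothetical stand-in for a real message broker."""

    def __init__(self) -> None:
        # topic name -> list of subscribed handlers
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        # Deliver the event to every handler subscribed to this topic.
        for handler in list(self._subscribers[topic]):
            handler(event)


broker = InMemoryBroker()
received: List[dict] = []

# The Inventory Service would register its handler here; we just record events.
broker.subscribe("Inventory_Request", received.append)

# The Order Service publishes without knowing who is listening.
broker.publish("Inventory_Request", {"order_id": 1, "item": "book", "quantity": 2})
print(received)
```

The publisher only knows the topic name, not the subscribers, which is what lets each service react to events independently.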

Happy flow when a user places an order:

  1. The user sends an Order Request.
  2. The Order Service creates an order history entry in its Orders database, setting the order status to "Pending".
  3. Order Service then publishes a message/event onto the Inventory_Request topic, signaling the next service in the chain.
  4. The Inventory Service then receives the event from the Inventory_Request topic along with relevant metadata.
  5. The Inventory Service checks whether the requested stock is available. If it is, the service updates its Inventory table to reserve the items until the payment succeeds.
  6. Subsequently, it publishes another event onto the Payment_Request topic for the subsequent service to process.
  7. The Payment Service then receives the event from the Payment_Request topic along with relevant metadata. 
  8. Upon receipt of the event from the Payment_Request topic, the Payment Service executes its payment processing actions and updates its Payment database accordingly.
  9. If the payment is successful, the Payment Service publishes a success event onto the Payment_Request_Success topic.
  10. The Inventory Service, which subscribes to the Payment_Request_Success topic, updates its Inventory table in response to the success event (for example, by finalizing the reservation made in Step 5).
  11. Inventory Service then publishes a success message onto the Inventory_Request_Success topic.
  12. Finally, the Order Service, upon detecting the success message on the Inventory_Request_Success topic, updates the order status from "Pending" to "Success", thus marking the order as successfully processed.
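
The twelve steps above can be traced with a small in-memory simulation. Everything here is an assumption made for illustration: the dictionaries stand in for each service's database, `publish` stands in for the message broker, the handler names and event fields are hypothetical, and "finalizing the reservation" in Step 10 is one possible reading of the inventory update. Only the topic names come from the walkthrough.

```python
from collections import defaultdict

subscribers = defaultdict(list)  # topic -> handlers (stand-in for a broker)
orders = {}                      # Orders database: order_id -> status
stock = {"book": 5}              # Inventory database: item -> available units
reserved = {}                    # Inventory database: order_id -> (item, quantity)
payments = {}                    # Payment database: order_id -> status


def publish(topic, event):
    for handler in subscribers[topic]:
        handler(event)


def place_order(order_id, item, quantity):
    # Steps 1-3: record the order as "Pending", then notify the Inventory Service.
    orders[order_id] = "Pending"
    publish("Inventory_Request",
            {"order_id": order_id, "item": item, "quantity": quantity})


def reserve_stock(event):
    # Steps 4-6: check availability, reserve the stock, then request payment.
    # The out-of-stock path is not part of the walkthrough and is omitted here.
    if stock[event["item"]] >= event["quantity"]:
        stock[event["item"]] -= event["quantity"]
        reserved[event["order_id"]] = (event["item"], event["quantity"])
        publish("Payment_Request", event)


def process_payment(event):
    # Steps 7-9: record the payment and announce success.
    payments[event["order_id"]] = "Success"
    publish("Payment_Request_Success", event)


def confirm_reservation(event):
    # Steps 10-11: finalize the reservation (assumed behavior), then notify.
    reserved.pop(event["order_id"])
    publish("Inventory_Request_Success", event)


def complete_order(event):
    # Step 12: mark the order as successfully processed.
    orders[event["order_id"]] = "Success"


subscribers["Inventory_Request"].append(reserve_stock)
subscribers["Payment_Request"].append(process_payment)
subscribers["Payment_Request_Success"].append(confirm_reservation)
subscribers["Inventory_Request_Success"].append(complete_order)

place_order(1, "book", 2)
print(orders, stock, payments)
```

No function calls another service directly; each one only reacts to a topic and publishes to the next, which is the defining trait of choreography.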

Failure flow when a payment error occurs:

Let's see how the Choreography Saga model rolls back by running compensating transactions that undo the changes made by previous microservices and restore data consistency. Say the Payment Service encounters an error that causes the payment to fail. The Inventory Service and the Order Service must then perform compensating actions.

  1. The user sends an Order Request. (Note: the first 7 steps are the same as in the happy flow.)
  2. The Order Service creates an order history entry in its Orders database, setting the order status to "Pending".
  3. Order Service then publishes a message/event onto the Inventory_Request topic, signaling the next service in the chain.
  4. The Inventory Service then receives the event from the Inventory_Request topic along with relevant metadata.
  5. The Inventory Service checks whether the requested stock is available. If it is, the service updates its Inventory table to reserve the items until the payment succeeds.
  6. Subsequently, it publishes another event onto the Payment_Request topic for the subsequent service to process.
  7. The Payment Service then receives the event from the Payment_Request topic along with relevant metadata. 
  8. Suppose an error occurs while processing the payment. The Payment Service records the failed payment transaction in its Payment database.
  9. The Payment Service then publishes a failure event onto the Payment_Request_Failed topic.
  10. The Inventory Service receives this event from the Payment_Request_Failed topic and performs its compensating action: it releases the stock reserved in Step 5, returning the inventory count to its previous state.
  11. After executing the compensating action, the Inventory Service publishes a failure message onto the Inventory_Request_Failed topic.
  12. The Order Service, upon detecting the failed message on the Inventory_Request_Failed topic, updates the order status from "Pending" to "Failed", thereby marking the order as unsuccessful.
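
The failure flow can be traced with the same kind of in-memory simulation. As before, this is a sketch under stated assumptions: the dictionaries stand in for each service's database, `publish` stands in for the message broker, the handler names and event fields are hypothetical, and the payment failure is hard-coded to simulate the error. Only the topic names come from the walkthrough.

```python
from collections import defaultdict

subscribers = defaultdict(list)  # topic -> handlers (stand-in for a broker)
orders = {}                      # Orders database: order_id -> status
stock = {"book": 5}              # Inventory database: item -> available units
reserved = {}                    # Inventory database: order_id -> (item, quantity)
payments = {}                    # Payment database: order_id -> status


def publish(topic, event):
    for handler in subscribers[topic]:
        handler(event)


def place_order(order_id, item, quantity):
    # Steps 1-3: record the order as "Pending", then notify the Inventory Service.
    orders[order_id] = "Pending"
    publish("Inventory_Request",
            {"order_id": order_id, "item": item, "quantity": quantity})


def reserve_stock(event):
    # Steps 4-6: check availability, reserve the stock, then request payment.
    if stock[event["item"]] >= event["quantity"]:
        stock[event["item"]] -= event["quantity"]
        reserved[event["order_id"]] = (event["item"], event["quantity"])
        publish("Payment_Request", event)


def process_payment(event):
    # Steps 7-9: the payment fails (hard-coded here); record it and announce it.
    payments[event["order_id"]] = "Failed"
    publish("Payment_Request_Failed", event)


def release_stock(event):
    # Steps 10-11: compensating action, undoing the reservation from Step 5.
    item, quantity = reserved.pop(event["order_id"])
    stock[item] += quantity
    publish("Inventory_Request_Failed", event)


def fail_order(event):
    # Step 12: compensating action, marking the order as unsuccessful.
    orders[event["order_id"]] = "Failed"


subscribers["Inventory_Request"].append(reserve_stock)
subscribers["Payment_Request"].append(process_payment)
subscribers["Payment_Request_Failed"].append(release_stock)
subscribers["Inventory_Request_Failed"].append(fail_order)

place_order(1, "book", 2)
print(orders, stock, payments)
```

After the run, the stock is back at its original count and the order is marked "Failed": the compensating actions travel back through the services in the reverse order of the original steps.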

In the next part, we'll delve into the Orchestration Saga Model and explore the scenarios where each model is best suited.
