Posts

Showing posts from 2024

CAP Theorem - Debunking Myths

The CAP theorem is a widely recognized idea in the field of distributed systems. It concerns three properties: Consistency, Availability, and Partition Tolerance. While most of us are familiar with its definition, the devil lies in the details. In this discussion, we'll clarify common myths and misunderstandings. We'll start by explaining the CAP theorem in detail, and then explore various scenarios that may challenge the common interpretation of it. The CAP theorem, also known as Brewer's theorem, states that any distributed data store can provide only two of the following three guarantees: Consistency: For every read request, the system should provide the most recent write or an error. Note that this consistency is different from the consistency in the ACID properties. Availability: For every request, the system should provide a response, even if it’s not the latest data. In other words, all non-failing (healthy) nodes in the distributed system return a valid ...
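
The Consistency and Availability guarantees described above can be illustrated with a toy model. This is only a minimal sketch, not how any real data store is implemented: the `Replica` class, its `mode` switch, and the `partitioned` flag are all hypothetical names introduced for illustration. During a partition, a replica that favours consistency answers reads with an error, while one that favours availability answers with its local, possibly stale, copy.

```python
class Replica:
    """Toy replica; every name here is hypothetical and for illustration only."""

    def __init__(self, mode):
        # "CP" favours consistency, "AP" favours availability during a partition.
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.value = None
        self.partitioned = False  # True when this replica cannot reach its peers

    def apply_replicated_write(self, value):
        # Writes from peers only arrive while the network is healthy.
        if not self.partitioned:
            self.value = value

    def read(self):
        if self.partitioned and self.mode == "CP":
            # Cannot confirm this is the most recent write, so return an error.
            raise RuntimeError("unavailable during partition")
        # AP mode (or a healthy network): answer with the local copy.
        return self.value


cp, ap = Replica("CP"), Replica("AP")
for replica in (cp, ap):
    replica.apply_replicated_write("v1")
    replica.partitioned = True
    replica.apply_replicated_write("v2")  # lost: the replica is cut off

print(ap.read())  # the AP replica still answers, but with the older value
```

Calling `cp.read()` at this point raises `RuntimeError`, which is the "or an error" half of the consistency definition.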

Event Driven Architecture - SAGA Pattern (Part-1 : Choreography Model)

The Saga pattern is a distributed transaction pattern used in microservices architecture to maintain data consistency across multiple services. It helps manage long-running transactions involving multiple services by breaking them down into smaller, more manageable units of work. The Database per Service pattern is well known in the microservices world. Under this paradigm, each service maintains its own dedicated database. Some business transactions, however, span multiple services, so we need a mechanism to implement transactions that span services. Take, for instance, placing an online order, which involves actions like verifying inventory and reserving the item until payment completes. Since services such as Orders, Inventory, and Payment operate on separate databases, the application cannot simply use a local ACID transaction. The Two-Phase Commit (2PC) protocol is one option for ensuring transactions across services. However, it has se...

How to Choose Right Database - Part - 2 (SQL v/s No-SQL)

In the first part, we selected a database based on specific use cases. In this part, we will discuss SQL and NoSQL databases and the factors that influence the choice between them. SQL Database: SQL (Structured Query Language) databases are relational database management systems (RDBMS) that use SQL for defining, querying, and manipulating data. They organize data into tables with rows and columns, enforcing a predefined schema. SQL databases are characterized by their adherence to ACID (Atomicity, Consistency, Isolation, Durability) properties, making them suitable for applications requiring strict data integrity and complex transactions. No-SQL Database: NoSQL (Not Only SQL) databases are a broad category of database management systems that store and retrieve data modeled in formats other than the tabular relations used in relational databases. NoSQL databases offer flexi...

HTTP Series - Part-2: HTTP/2.0

Continuing our journey through the HTTP series, the previous part explored the progression of HTTP leading up to HTTP/1.1. In this segment, we will begin by examining certain limitations of HTTP/1.1, and then delve into the intricacies of HTTP/2.0, which aims to overcome these limitations. Limitations of HTTP/1.1 that led to HTTP/2.0: Head-of-Line Blocking: In HTTP/1.1, multiple HTTP requests are transmitted over a single TCP connection. However, the caveat is that they are sent sequentially. In other words, the next HTTP request cannot be sent until the response to the current HTTP request is received. If one of the resources is delayed, subsequent resources are also blocked, even if they are independent and could be fetched more quickly. Limited Multiplexing: The sequential execution of HTTP requests over a TCP connection introduces latency, especially when fetching numerous resources. To mitigate this, modern browsers employ a workaround, allowing a maximum of 6 ...
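
The effect of head-of-line blocking described above can be shown with a little arithmetic. This is a rough sketch under stated assumptions: the response times are made-up numbers, both function names are hypothetical, and the second function is an idealised model in which requests never wait on each other, not a faithful simulation of any particular protocol.

```python
from itertools import accumulate


def sequential_finish_times(durations):
    """Each request is sent only after the previous response has arrived."""
    return list(accumulate(durations))


def independent_finish_times(durations):
    """Idealised model: every request starts at t=0 and is never blocked."""
    return list(durations)


# Hypothetical response times in milliseconds; the first resource is slow.
durations = [500, 20, 30]

print(sequential_finish_times(durations))   # [500, 520, 550]
print(independent_finish_times(durations))  # [500, 20, 30]
```

In the sequential case, the two fast resources finish only after the slow one, even though they need just 20 ms and 30 ms on their own.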

HTTP Series - Part-1: (TCP, HTTP and HTTP/1.1)

Hypertext Transfer Protocol (HTTP) enables the transfer of data over the Internet. HTTP is an application-layer protocol that facilitates the transmission of hypermedia documents, such as HTML. It was designed for communication between web browsers and servers but can also be used for other purposes. I have discussed the different modes of communication between a client and a server in detail. They all use HTTP under the hood. HTTP is a “stateless” protocol, which means each request is executed independently, without any knowledge of the requests that were executed before it. It uses the underlying transport protocol TCP (Transmission Control Protocol) to establish and manage connections between a client and a server. Transmission Control Protocol (TCP): TCP is a connection-oriented transport layer protocol. It provides a full-duplex and reliable exchange of messages between different devices over a network. Some of the main features of TCP are: Reliability: TCP ens...

How to Choose Right Database in System Design Interviews - Part - 1

Choosing the right database is one of the most important decisions in a system design interview. The choice affects both the correctness and the scalability of the design. Many factors can influence the decision, but the most important ones are: Structure of the Data: Different databases support different data models, such as relational, document, key-value, and graph. The choice of database depends on which data model best fits the structure of your data. For example, a traditional relational database like MySQL or PostgreSQL might be appropriate if the data is highly structured and relational. A NoSQL database like Cassandra or MongoDB might be more suitable if the data is semi-structured or unstructured. Query Pattern: Understanding the types of queries and operations performed on the data helps select a database that can efficiently handle those operations. Are they read-heavy or write-heavy? Do they involve complex joins, aggregations, or full-text searches?...