Considerations for System Design to Solve Scalability, Consistency, Partition-Tolerance, and Availability in Distributed and Non-Distributed Environments
As a software engineer, you are regularly challenged to design a system that stays available, scales well with increasing load, and responds with low latency. The principles that apply differ from application to application. System design is an open-ended problem with many possible solutions rather than a single correct one. Even though no solution is optimal for every case, there are guidelines for tackling each component of system design. This article focuses on the non-functional components of system design: Consistency, Availability, Partition-Tolerance, and Scalability. It then looks at how these components work in distributed and non-distributed environments.
CAP stands for Consistency, Availability, and Partition-Tolerance. The CAP theorem is one of the most important guidelines for designing distributed systems. It states that a distributed system cannot guarantee all three properties at the same time. If you choose Consistency and Availability, for example, you lose Partition-Tolerance. Below we explore each component of the CAP theorem in detail.
The idea behind consistency is that all clients should see the same data at all times. For example, suppose you are designing a billing system for a shop, where the system produces an invoice for each customer purchase. Let's say a person buys 3 items and the system produces an invoice for the shopkeeper and the customer. The invoice should be the same for both. But let's say the shopkeeper made a mistake while entering the price of one item, so the invoice that was created is incorrect. If the shopkeeper edits that invoice after the customer has left the store, the customer's copy will differ from the shopkeeper's copy. This might create problems for both of them when they file taxes or go through an audit. So, to correct the mistake, the shopkeeper has to create another invoice that records the difference (the arrears) and give it to the customer when the customer returns. The example shows that transactions are final: if there is a mistake, you create a new transaction to maintain consistency. Most SQL relational databases are consistent. They also provide the other ACID properties: atomicity, isolation, and durability.
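The rule that a mistake is fixed with a new transaction rather than by editing the old one can be sketched as an append-only ledger. This is only an illustration: the `Invoice` and `Ledger` names, and the choice to store amounts as integer cents, are assumptions and not part of any real billing system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Invoice:
    """An immutable record: once issued, it is never edited."""
    invoice_id: int
    description: str
    amount_cents: int


@dataclass
class Ledger:
    """Append-only list of invoices (hypothetical, for illustration)."""
    entries: List[Invoice] = field(default_factory=list)

    def issue(self, description: str, amount_cents: int) -> Invoice:
        invoice = Invoice(len(self.entries) + 1, description, amount_cents)
        self.entries.append(invoice)
        return invoice

    def correct(self, original: Invoice, right_amount_cents: int) -> Invoice:
        # Leave the original untouched and record only the difference.
        difference = right_amount_cents - original.amount_cents
        return self.issue(f"Correction for invoice {original.invoice_id}", difference)

    def total_cents(self) -> int:
        return sum(entry.amount_cents for entry in self.entries)


ledger = Ledger()
first = ledger.issue("3 items", 12000)   # one item was priced wrongly
ledger.correct(first, 10000)             # the right total was 100.00
print(ledger.total_cents())              # 10000
```

Both the shopkeeper and the customer keep the original invoice and the correction, so their records always agree.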
How consistency is achieved
Consistency is typically achieved by locking data while it is being written. If the system is writing a new row in a table or updating an existing row, it locks that row so that no client can read it until the write is finished. The lock stays in place until the whole transaction completes. This way every client gets exactly the same data when it queries. Consistent systems usually store normalized data, which means we can define relations between tables and perform join operations.
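A minimal sketch of the row-locking idea, assuming a toy in-memory table with one lock per row. `LockingTable` is a hypothetical class for illustration; real databases use far more elaborate locking and isolation schemes.

```python
import threading
from collections import defaultdict


class LockingTable:
    """A toy in-memory table that locks a row while it is read or written."""

    def __init__(self):
        self._rows = {}
        self._locks = defaultdict(threading.Lock)  # one lock per row key
        self._guard = threading.Lock()             # protects creation of row locks

    def _lock_for(self, key):
        with self._guard:
            return self._locks[key]

    def write(self, key, value):
        with self._lock_for(key):
            self._rows[key] = value

    def read(self, key):
        # Readers wait until any in-progress write on this row has finished.
        with self._lock_for(key):
            return self._rows.get(key)

    def update(self, key, fn):
        # Read-modify-write happens as one step while the row lock is held.
        with self._lock_for(key):
            self._rows[key] = fn(self._rows.get(key))


table = LockingTable()
table.write("invoice-1", 0)


def add_items():
    for _ in range(1000):
        table.update("invoice-1", lambda total: total + 1)


workers = [threading.Thread(target=add_items) for _ in range(4)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

print(table.read("invoice-1"))  # 4000: no update was lost
```

Because each update holds the row lock for the full read-modify-write, concurrent writers cannot overwrite each other's changes.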
Availability means that when a request is made, at least one node responds, no matter what. A highly available system may or may not have ACID properties. Highly available systems use multiple nodes to process or store data. Data is copied across nodes through replication, and is often also split across nodes through sharding. There could be a central controller or metadata manager that keeps track of data changes and indexes to make sure data is distributed evenly across nodes. Examples of highly available systems are distributed caches like Redis and NoSQL databases like MongoDB.
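The idea that at least one node responds can be sketched with a small store that copies every write to all live nodes and reads from the first live one. The `Node` and `ReplicatedStore` classes are assumptions made for illustration and do not reflect how Redis or MongoDB are implemented.

```python
class Node:
    """A storage node that can be marked as down."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.up = True


class ReplicatedStore:
    """Copies each write to every live node; reads from any live node."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def put(self, key, value):
        for node in self.nodes:
            if node.up:
                node.data[key] = value

    def get(self, key):
        for node in self.nodes:
            if node.up:
                return node.data.get(key)
        raise RuntimeError("no node available")


replicated = ReplicatedStore([Node("a"), Node("b"), Node("c")])
replicated.put("session-1", "alice")

replicated.nodes[0].up = False          # the first node fails
print(replicated.get("session-1"))      # alice, served by node "b"
```

The request still succeeds after a node failure because another copy of the data exists.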
In a partition-tolerant system, a request should be fulfilled even if the network between nodes breaks down. In this type of system, data is split and stored on multiple nodes, and each node is responsible for reads and writes to its own partition. Such systems may or may not have ACID properties. A transaction is carried out by multiple nodes separately and the results are then combined, as in map-reduce. An example of this is Cassandra.
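Splitting data so that each node owns one partition can be sketched with simple hash partitioning. The `PartitionedStore` class and the node names are hypothetical; Cassandra itself uses a token ring rather than the plain modulo shown here.

```python
import hashlib


class PartitionedStore:
    """Assigns every key to exactly one node based on a hash of the key."""

    def __init__(self, node_names):
        self.node_names = list(node_names)
        self.nodes = {name: {} for name in self.node_names}

    def owner(self, key):
        # A stable hash, so the same key always maps to the same node.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.node_names)
        return self.node_names[index]

    def put(self, key, value):
        self.nodes[self.owner(key)][key] = value

    def get(self, key):
        # Only the node that owns the partition is asked.
        return self.nodes[self.owner(key)].get(key)


partitioned = PartitionedStore(["node-a", "node-b", "node-c"])
for invoice_id in range(100):
    partitioned.put(f"invoice-{invoice_id}", invoice_id)

print(partitioned.owner("invoice-42"), partitioned.get("invoice-42"))
```

Each key lives on a single node, so a read or write touches only the node responsible for that partition.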
Why we can't achieve all three
Consistent and Partition-Tolerance Systems
In this type of system, data must always be consistent and accurate, and requests must still be processed correctly when there are network failures between nodes. One example is Postgres. In Postgres, a table can be partitioned into multiple chunks, and in a sharded setup each chunk is stored on a different node. When a request comes in, only the corresponding node responds. Such systems have difficulty supporting high availability. This is because the nodes have to stay synchronized: when the network fails and nodes can no longer synchronize, serving every request would let them drift apart, so the system has to reject or delay some requests to stay consistent, and therefore we lose availability. These systems are used when there are strong relationships in the data and the data is very structured.
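One way a system can favor consistency over availability during a network failure is to accept a write only when a majority of replicas can be reached. The sketch below assumes a hypothetical `quorum_write` helper and `Replica` class; it is not how Postgres works internally.

```python
class Unavailable(Exception):
    """Raised when too few replicas are reachable to accept a write."""


class Replica:
    def __init__(self, name):
        self.name = name
        self.value = None
        self.reachable = True


def quorum_write(replicas, value):
    """Apply the write only if a majority of replicas can be reached."""
    reachable = [r for r in replicas if r.reachable]
    majority = len(replicas) // 2 + 1
    if len(reachable) < majority:
        # Refuse the request instead of letting replicas diverge.
        raise Unavailable(f"only {len(reachable)} of {len(replicas)} replicas reachable")
    for replica in reachable:
        replica.value = value
    return len(reachable)


replicas = [Replica("a"), Replica("b"), Replica("c")]
acks = quorum_write(replicas, "v1")       # all three replicas reachable

replicas[1].reachable = False
replicas[2].reachable = False             # node "a" is cut off from the others
try:
    quorum_write(replicas, "v2")
    rejected = False
except Unavailable:
    rejected = True

print(acks, rejected)                     # 3 True
```

The rejected write is the cost of consistency: every replica still holds "v1", but the client did not get service during the failure.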
Consistent and Availability Systems
As explained before, systems that are consistent and available always need to be in sync so that the replicated data is the same across all nodes. Therefore such a system cannot tolerate network failures. Systems like MongoDB support ACID transactions, including on sharded deployments. This type of system is used when data is not structured but still needs to be consistent.
Partition-Tolerance and Availability Systems
Here let's take the example of Cassandra. In Cassandra, we can partition the data into multiple chunks and store them on different nodes. We also replicate these chunks and store the copies on other nodes. When we write data, the node responsible for that chunk writes it first, and the write is replicated to the other nodes later. If a read request arrives in the meantime, the node that is writing the data won't respond, but the other replicas will. Because those replicas do not yet have the latest write, the client won't get accurate data. That's why such a system lacks consistency: it takes some time for all nodes to receive the latest copy. Such systems are used for secondary purposes like analytics and map-reduce style operations.
In a nutshell, if you are designing a distributed system, you should decide which properties matter most to your application based on the CAP theorem. There are other considerations, such as security, that are not covered in this article but are important in system design.
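The stale read described above can be reproduced with a toy store in which the owning replica applies a write immediately and the other replicas receive it only when `replicate()` runs. The class and method names are assumptions for illustration and are not Cassandra's API.

```python
class EventuallyConsistentStore:
    """Replica 0 owns all writes; other replicas catch up later."""

    def __init__(self, replica_count):
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = []   # writes not yet copied to the other replicas

    def write(self, key, value):
        self.replicas[0][key] = value
        self.pending.append((key, value))

    def read(self, key, replica):
        return self.replicas[replica].get(key)

    def replicate(self):
        # Copy pending writes, in order, to every non-owner replica.
        for key, value in self.pending:
            for replica in self.replicas[1:]:
                replica[key] = value
        self.pending.clear()


ec_store = EventuallyConsistentStore(replica_count=3)
ec_store.write("price", 10)
ec_store.replicate()

ec_store.write("price", 12)                        # not yet replicated
stale = ec_store.read("price", replica=1)          # 10, the old value
ec_store.replicate()
fresh = ec_store.read("price", replica=1)          # 12, after catching up
print(stale, fresh)                                # 10 12
```

The read always gets an answer, but between the write and the replication step the answer is out of date.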
Distributed and Non-Distributed System
When an application has to scale as its data grows, the question is whether to increase the capacity or resources of the single server hosting it (scale vertically) or to add more servers and split the data or traffic using a load balancer (scale horizontally). Both approaches have trade-offs and suit different sets of applications.
Vertical Scaling for Monoliths
A monolith is a single application that carries out all functionality. It is a typical example of a non-distributed system, and vertical scaling fits it better. In such systems, it is difficult to split the functionality into small services. One reason is that you lose consistency; another is that the functions are highly dependent on one another. So it makes sense to run the application on one server and increase the memory, CPU, and storage of that server as load increases. Examples of such systems are ordering or payment services, AI, etc.
Horizontal Scaling for Microservices
Microservices are distributed systems: a single application is split into separate functionalities that are largely independent of each other, so splitting them doesn't affect performance. Each functionality, called a service, can be deployed on its own server. Because one server handles one service, it doesn't require high-end servers. As load increases, more such servers can be deployed and the load can be distributed across them. Microservices are popular for designing distributed caches and social apps.
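Distributing load across servers and adding servers as load grows can be sketched with a round-robin load balancer. `RoundRobinBalancer` and the server names are hypothetical; production load balancers also track server health and current load.

```python
class RoundRobinBalancer:
    """Hands each incoming request to the next server in turn."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._next = 0

    def add_server(self, server):
        # Horizontal scaling: new capacity is added without touching old servers.
        self._servers.append(server)

    def route(self, request):
        server = self._servers[self._next % len(self._servers)]
        self._next += 1
        return server


balancer = RoundRobinBalancer(["app-1", "app-2"])
routed = [balancer.route(f"request-{i}") for i in range(4)]
print(routed)   # ['app-1', 'app-2', 'app-1', 'app-2']

balancer.add_server("app-3")                      # load went up, add a server
after = [balancer.route(f"request-{i}") for i in range(4, 7)]
print(sorted(after))                              # ['app-1', 'app-2', 'app-3']
```

Each server sees only a share of the traffic, which is why modest machines are enough for a single service.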
Trade-off between Monoliths and Microservices