Scaling Concepts
Concept: Vertical Scaling
What is vertical scaling?
Vertical scaling, or scaling up, is the process of increasing the capacity of a single server or resource by adding more power to it. We call this scaling up / vertical scaling because we are adding more power to the same server.
The main dimensions increased during this process are: virtual CPU (vCPU), RAM/memory, block storage, and network bandwidth.
Process we followed:
- Backup Data: First we make sure the existing data will not be lost. In our case there was nothing extra to do, because all the data lives in block storage and the same block storage can be attached to the new virtual machine.
- Plan Downtime Window: Decide when the resize will happen and notify customers in advance.
- Change Instance Type: Pick an instance type that provides more vCPU, memory, and network bandwidth (see the sketch after this list).
- Restart Instance: Stop the instance, change the instance type, and start the server again.
- Run Tests: Run the automation tests and check the server metrics again.
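As an illustration, resizing an instance along these lines could look like the following sketch, assuming an AWS EC2 instance managed with boto3 (the instance ID and target instance type are placeholders, not values from our setup):

```python
import boto3

ec2 = boto3.client("ec2")
INSTANCE_ID = "i-0123456789abcdef0"  # placeholder instance ID
NEW_TYPE = "m5.2xlarge"              # placeholder larger instance type

# Stop the instance before changing its type (this is where the downtime window starts).
ec2.stop_instances(InstanceIds=[INSTANCE_ID])
ec2.get_waiter("instance_stopped").wait(InstanceIds=[INSTANCE_ID])

# Change the instance type while the instance is stopped.
ec2.modify_instance_attribute(InstanceId=INSTANCE_ID, InstanceType={"Value": NEW_TYPE})

# Start the instance again; the attached block storage is reused as-is, so no data is lost.
ec2.start_instances(InstanceIds=[INSTANCE_ID])
ec2.get_waiter("instance_running").wait(InstanceIds=[INSTANCE_ID])
```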
Let's understand the situation
So now, after scaling up, we can process more requests because the number of vCPUs has increased.
- All the metrics in the dashboard are under their threshold limits.
Now let's understand the current issue which we have observed.
The first thing we learned from the vertical scaling process is that it requires downtime, and the second is that if the server is down we cannot process any requests, which means it is a single point of failure.
Analyzing the past 3-month window, only one month had a huge request spike. That means we do not always need the full server capacity, and most of the resources sit idle.
The major drawback of vertical scaling is cost inefficiency: as resources increase, cost increases at an exponential rate.
Hence we need a solution which overcomes these limitations.
Hence here comes the next system design concept named Horizontal Scaling.
Concept: Horizontal Scaling
What is horizontal scaling?
Horizontal scaling, or scaling out, is a method of increasing the capacity or performance of a system by adding more hardware or resources to distribute the load. We call this scaling out / horizontal scaling because we are adding more servers in parallel to the existing servers.
Process we followed:
- We simply start by creating some general purpose instances and deploy the application, along with its database, to all the servers.
- We then add their public IPs to the DNS system (see the sketch after this list).
- Now, as there is more than one server, we need a logging and monitoring stack on every server:
  - Centralized logging: a Fluent Bit agent is installed on all the servers to push logs to CloudWatch.
  - Centralized metrics collection: Prometheus and node exporters are added on all the servers to push metrics to a blob store like S3.
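As a rough sketch of the DNS step, publishing the servers' public IPs could look like this if the zone were hosted on AWS Route 53 (the hosted zone ID, domain, and IP addresses are made up):

```python
import boto3

route53 = boto3.client("route53")

HOSTED_ZONE_ID = "Z0123456789EXAMPLE"          # placeholder hosted zone ID
DOMAIN = "app.example.com"                     # placeholder domain
SERVER_IPS = ["203.0.113.10", "203.0.113.11"]  # placeholder public IPs of the app servers

# Publish all application server IPs under one A record; DNS resolvers rotate
# between the returned addresses (round-robin style distribution).
route53.change_resource_record_sets(
    HostedZoneId=HOSTED_ZONE_ID,
    ChangeBatch={
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": DOMAIN,
                    "Type": "A",
                    "TTL": 60,
                    "ResourceRecords": [{"Value": ip} for ip in SERVER_IPS],
                },
            }
        ]
    },
)
```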
Issues from vertical scaling which are addressed by the horizontal scaling method:
- Downtime
- Single point of failure
- Provides elasticity
- Cost effectiveness
Let's understand the situation
So now we have multiple application servers to handle incoming requests in a distributed manner.
Whenever we need to scale, we just add a new application server and register it in the DNS management dashboard. DNS then routes traffic to the application servers in turn, in a round robin fashion.
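Round robin here simply means requests are handed to the servers in turn; a tiny illustration of the rotation (the server addresses are made up):

```python
from itertools import cycle

# Hypothetical pool of application server addresses returned by DNS.
servers = ["10.0.1.11", "10.0.1.12", "10.0.1.13"]
rotation = cycle(servers)

# Each incoming request is handed to the next server in turn.
for request_id in range(6):
    print(f"request {request_id} -> {next(rotation)}")
```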
Now let's understand the current issue which we have observed.
- The first and most important issue is that every time we create a new application server we need to list its IP in the DNS dashboard, and it takes some time for DNS changes to propagate.
- The second issue is that DNS does not route traffic based on the health of a server.
- We have no flexibility to route traffic to an application server based on custom logic; it simply routes traffic in a round robin fashion.
Hence we need some layer/system which is responsible for balancing the load between servers and removing the limitations discussed above.
Hence here comes the next system design concept named Load Balancer.
Concept: Load Balancer
A load balancer is a networking device or software application designed to distribute incoming network traffic across multiple servers or resources. The primary purpose of a load balancer is to ensure that no single server bears too much traffic load.
Process we followed for using a load balancer in our architecture:
- Health Check Endpoint: each application server runs a reverse proxy (nginx), a backend, and a frontend. In the reverse-proxy/nginx config we add a /health-check API which the load balancer can use to determine the availability of a server (a minimal sketch follows this section).
- Register Instances: register all the instances among which the load balancer will route traffic.
- Configure Load Balancing Algorithm: there are some common algorithms used to select which application server receives the traffic.
- SSL Certificate: in this step we add the SSL certificate and use the HTTPS protocol to encrypt requests.
After adding the load balancer we can make all application servers private, make the load balancer the public-facing entry point, add the IP of the load balancer to DNS, and remove everything else from the DNS dashboard.
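The /health-check endpoint itself can be very small. Here is a minimal sketch assuming a Flask backend (the framework and port are assumptions; the original only fixes the endpoint path):

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/health-check")
def health_check():
    # Return 200 while this instance can serve traffic; the load balancer marks
    # the server unhealthy once this endpoint stops answering with 200.
    return jsonify(status="ok"), 200

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)  # placeholder port
```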
Let's understand the situation
So far everything works well: we have a load balancer which helps us manage the load across different application servers. If we have to scale, we simply spin up a new application server, run the script to install the application, and register the server so the load balancer starts considering this new machine as well.
Now let's understand the current issue which we have observed.
As each application server contains its own database, a user's request can be fulfilled by any server.
That means if a request from user A is fulfilled by server A, all the data related to that user is stored on server A. But the load balancer can route the next request to any server, so if a request from user A is now fulfilled by server B, it does not return any of the data that belongs to user A.
Hence we need to separate the database from the application servers, and all application servers should use the same database for consistent data.
Hence here comes the next system design concept named Stateless Web Tier.
Stateless Web Tier
From the above issue we can learn that the web server (the server hosting the application's frontend and backend) should not store any data/state, and hence the web server should be stateless.
Three-tier architecture
From the above three-tier system we can see that the web tier and the data tier are separate, which promotes scalability without any data inconsistency.
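One common way to keep the web tier stateless is to move per-user state such as sessions into a shared store that every web server can reach. A rough sketch assuming Redis as that store (the hostname and key layout are made up):

```python
import json
import redis

# Shared session store that lives outside the web servers (placeholder host).
session_store = redis.Redis(host="session-redis.internal", port=6379, decode_responses=True)

def save_session(session_id: str, data: dict, ttl_seconds: int = 3600) -> None:
    # Any web server can write the session; none of them keeps it in local memory.
    session_store.setex(f"session:{session_id}", ttl_seconds, json.dumps(data))

def load_session(session_id: str) -> dict | None:
    # Any web server can read the same session, so requests from user A can be
    # served by server A or server B interchangeably.
    raw = session_store.get(f"session:{session_id}")
    return json.loads(raw) if raw else None
```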
Let's understand the situation
So far everything works well: we have separated the application and database layers.
Now let's understand the current issue which we have observed.
- One thing we have observed is that with an increase in traffic we can easily scale by adding more application servers to the pool.
- But what about the database? We are still using a single database to fulfil all the database queries.
Hence we need to scale the database layer as well.
Hence here comes the next system design concept named Database Scaling.
Database Scaling
The current design faces limitations with a single database, lacking failover support, and struggling to meet database requests from all application servers. To address these challenges, various techniques can be employed to scale the database.
Common Scaling Technique: Database Replication
One of the most widely adopted techniques for scaling databases is Database Replication. This approach involves the creation of a master node (write replica) and multiple slave nodes (read replicas).
- Master Node (Write Replica):
  - This serves as the primary database instance where write operations occur.
  - It is responsible for handling data modifications and updates.
- Slave Nodes (Read Replicas):
  - Multiple read replicas are created to distribute the read workload.
  - These replicas mirror the data from the master node.
  - They can be strategically placed to serve specific application servers or geographical regions.
Failure Scenarios:
- Master Node Failure: if the master node goes down:
  - A failover mechanism should be in place to promote one of the read replicas to become the new master.
  - The application servers can then redirect write requests to the new master node.
  - This ensures minimal downtime and maintains write functionality.
- Slave Node Failure: if a read replica (slave node) goes down:
  - Read traffic can be redirected to other available replicas to ensure continued service.
  - Data redundancy in multiple replicas mitigates the impact of a single node failure on read operations.
  - Regular monitoring and automated recovery processes can be implemented to replace or repair the failed replica.
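At the application side, replication usually shows up as simple read/write splitting. A rough sketch assuming PostgreSQL with psycopg2 (the DSNs are placeholders, and in practice a driver, proxy, or ORM often handles this routing):

```python
import random
from contextlib import closing

import psycopg2

# Placeholder DSNs: one master for writes, several read replicas for reads.
MASTER_DSN = "host=db-master dbname=shop user=app"
REPLICA_DSNS = [
    "host=db-replica-1 dbname=shop user=app",
    "host=db-replica-2 dbname=shop user=app",
]

def run_write(query: str, params: tuple = ()) -> None:
    # All data modifications go to the master (write replica).
    with closing(psycopg2.connect(MASTER_DSN)) as conn:
        with conn, conn.cursor() as cur:  # `with conn` commits the transaction
            cur.execute(query, params)

def run_read(query: str, params: tuple = ()) -> list:
    # Read-only queries are spread across the read replicas; a failed replica
    # can simply be dropped from REPLICA_DSNS until it is repaired or replaced.
    with closing(psycopg2.connect(random.choice(REPLICA_DSNS))) as conn:
        with conn.cursor() as cur:
            cur.execute(query, params)
            return cur.fetchall()
```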
Let's understand the situation
So far everything works well: we have successfully scaled both the database and application layers.
Now let's understand the current issue which we have observed: every read still goes all the way to the database, even for data that is requested frequently. Adding a cache in front of the database gives us:
- Performance improvement: reduced latency and better response times, since frequently accessed data is served directly from the cache.
- Scalability: read-heavy operations are offloaded to the cache, which helps handle increased user traffic and data requests and improves overall system scalability.
Hence we need to add a cache tier to our system to reduce unnecessary network calls.
Hence here comes the next system design concept named Caching.
Concept: Cache Tier
A cache is like a quick-access storage that keeps important or often-used information in memory. This helps make future requests for that information faster. When a new web page loads, it usually needs data from a database. If we keep asking the database over and over, it can slow things down. But, with a cache, we can avoid this problem.
Now, think of the cache tier as a layer which reduces the burden on the main database and we can scale this independently.
For caching, we have added 2 main components:
- CDN (Content Delivery Network): A CDN is a distributed network of servers that helps deliver content, like images and scripts, closer to users. It reduces latency and speeds up content delivery by storing copies of data in various locations.
- Cache Node: The Cache Node is a dedicated component responsible for storing frequently accessed data in memory. It acts as a quick-access storage, ensuring faster response times by serving data directly from the cache rather than fetching it from the original source, such as a database.
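The usual way the application talks to the cache node is the cache-aside pattern: check the cache first, and on a miss read from the database and populate the cache. A minimal sketch assuming Redis as the cache (the hostname, key names, and the database call are placeholders):

```python
import json
import redis

cache = redis.Redis(host="cache-node.internal", port=6379, decode_responses=True)

def fetch_product_from_database(product_id: str) -> dict:
    # Stand-in for a real database query.
    return {"id": product_id, "name": "example product"}

def get_product(product_id: str, ttl_seconds: int = 300) -> dict:
    key = f"product:{product_id}"

    # 1. Try the cache first.
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)

    # 2. On a miss, read from the database and populate the cache with a TTL.
    product = fetch_product_from_database(product_id)
    cache.setex(key, ttl_seconds, json.dumps(product))
    return product
```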
Let's understand the situation
So far everything works well: we have successfully scaled both the database and application layers.
Additionally, because our data is cached, fewer network calls are made, lowering costs and improving read performance.
Now let's understand the current issue which we have observed.
- Currently both our web tier and our database tier are scaled according to the increasing demand on the application.
- But there is still only one cache node to fulfil all the read requests.
Hence we need to distribute the data in the cache node among multiple shards.
Hence here comes the next system design concept named Distributed Caching.
Concept: Distributed Cache and Consistent Hashing
Currently we have a large amount of data sitting in a single cache node. To scale, we will use a sharding process: we create multiple shards and distribute the original data across them. To decide which data should go to which shard, we use a technique like consistent hashing (see the image below on consistent hashing).
Process we followed
- First we create a configuration list which contains information about all the server hosts where each shard lives.
- Second we install the cache client on every web server; it is used to communicate with the cache servers.
- Now, using consistent hashing, we determine the shard from which to read or write the data (a minimal sketch appears after the note below).
Note: a cache is a key-value store, like a hash map, so to write or read data we need some key.
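A minimal consistent-hashing sketch is shown below; the shard names and the number of virtual nodes are made up, and in practice the cache client library usually implements this for you:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Maps keys to cache shards so that adding or removing a shard
    only remaps a small fraction of the keys."""

    def __init__(self, shards, virtual_nodes=100):
        self._ring = []  # sorted list of (hash, shard) points on the ring
        for shard in shards:
            for i in range(virtual_nodes):
                bisect.insort(self._ring, (self._hash(f"{shard}#{i}"), shard))

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def get_shard(self, key: str) -> str:
        # Walk clockwise from the key's position to the first shard point on the ring.
        index = bisect.bisect(self._ring, (self._hash(key),)) % len(self._ring)
        return self._ring[index][1]

# Hypothetical shard hosts taken from the configuration list.
ring = ConsistentHashRing(["cache-shard-1", "cache-shard-2", "cache-shard-3"])
print(ring.get_shard("user:42"))  # every web server resolves the same shard for this key
```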
Let's understand the situation
So far everything works well: we have successfully scaled the database, application, and cache layers according to the increase in demand.
Now let's understand the current issue which we have observed.
- Currently, from our monitoring dashboard, we have observed that some APIs face timeouts and their latency is also high.
- Below are the APIs which cause the high latency issue:
  - Sales Report API: this API is responsible for generating a report showing all the sales-related data with detailed information.
  - Image Scaling of Product: this API is responsible for reducing the size of images for appropriate performance. A user can upload multiple images for a product, and our API scales all the images and stores them in the blob store.
- The main reason behind the high latency is that there is a lot of work happening behind these APIs.
Hence we need to convert long-running APIs from synchronous to asynchronous.
Hence here comes the next system design concept named Message Queue.
Concept: Message Queue
A message queue is like a to-do list. Each task you need to do is a message, and the queue is where you keep these messages in order. When you finish one task, you take the next message from the queue and work on that.
Producer is responsible to put message/task in the queue and Consumer is responsible to pick the messages and process.
Process we followed:
- First we integrated a message broker / message queue into our system and run it with our application start-up.
- Adding the message queue flow in the reports API:
  - When a merchant hits this API to request a report, we create a new entry in the database with report_id, merchant_id, status=PENDING and put all this information in the message_queue.
- Adding the message queue flow in the image upload API:
  - When a merchant hits this API to upload images, we create a new entry in the database with merchant_id, image_name, image_binary_data, status=PENDING and put all this information in the message_queue.
- Next we set up the consumer, which also runs with our application start-up; the consumer keeps an eye on / subscribes to the queue.
- The consumer picks a message / task metadata from the queue, processes it, and once done updates the entry in the database with status=DONE (a minimal sketch follows this list).
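A minimal producer/consumer sketch of the report flow is shown below; it uses an in-process queue and a dict as stand-ins for the real message broker and database, so only the shape of the flow is illustrative:

```python
import queue
import threading
import time

report_queue = queue.Queue()  # stand-in for the real message broker
report_status = {}            # stand-in for the database table holding report rows

def request_report(report_id: str, merchant_id: str) -> None:
    # Producer: the API handler records the task as PENDING, enqueues it,
    # and returns immediately instead of generating the report inline.
    report_status[report_id] = "PENDING"
    report_queue.put({"report_id": report_id, "merchant_id": merchant_id})

def report_worker() -> None:
    # Consumer: subscribes to the queue, does the slow work, marks the task DONE.
    while True:
        task = report_queue.get()
        time.sleep(1)  # placeholder for the actual report generation
        report_status[task["report_id"]] = "DONE"
        report_queue.task_done()

threading.Thread(target=report_worker, daemon=True).start()

request_report("rep-1", "merchant-7")  # returns right away
report_queue.join()                    # only for this demo: wait until the worker finishes
print(report_status)                   # {'rep-1': 'DONE'}
```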
Final Thoughts
- In conclusion, system design is an iterative process where there's often no absolute right or wrong.
- The effectiveness of a design solution depends on various factors, including the specific requirements, constraints, and the dynamic nature of the system.
- It's important to note that 'it depends' is a common refrain in system design. Solutions are not one-size-fits-all, and the optimal design choice can vary based on the context, goals, and the unique characteristics of the problem at hand. Remember, the journey of system design is a learning process, and each decision contributes to a deeper understanding of the system's intricacies.
- Throughout the design journey, it's crucial to measure trade-offs carefully, considering factors such as performance, scalability, maintainability, and cost.