
Scaling Concepts

Concept: Vertical Scaling

What is vertical scaling?

Vertical scaling, or scaling up, is the process of increasing the capacity of a single server or resource by adding more power to it. We call this scaling up because we add resources to the same server rather than adding new ones.

The main dimensions increased during this process are: virtual CPU (vCPU), RAM/memory, block storage, and network bandwidth.

Process we followed:

  1. Backup data: we make sure the existing data is not lost. Here we did not need to do anything, because all the data lives in block storage and the same volume can be attached to the new virtual machine.
  2. Plan a downtime window: we decide when to perform the change and notify customers about it.
  3. Change the instance type: we pick an instance type that provides more vCPU, memory, and network bandwidth.
  4. Restart the instance: changing the instance type requires stopping the instance; once the type is changed, we start the server again (a sketch of steps 3 and 4 follows this list).
  5. Run tests: run the automation tests and check the server metrics again.
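
As a rough illustration, here is a minimal Python sketch of steps 3 and 4 using boto3, assuming the server is an AWS EC2 instance; the instance ID and target instance type below are placeholders, not values from our setup.

```python
# Sketch of steps 3-4 (stop, resize, restart), assuming an AWS EC2
# instance. INSTANCE_ID and TARGET_TYPE are illustrative placeholders.
import boto3

ec2 = boto3.client("ec2")
INSTANCE_ID = "i-0123456789abcdef0"   # placeholder instance ID
TARGET_TYPE = "m5.2xlarge"            # placeholder: more vCPU/RAM/bandwidth

# Stop the instance (this is the planned downtime window).
ec2.stop_instances(InstanceIds=[INSTANCE_ID])
ec2.get_waiter("instance_stopped").wait(InstanceIds=[INSTANCE_ID])

# Change the instance type while the instance is stopped.
ec2.modify_instance_attribute(
    InstanceId=INSTANCE_ID,
    InstanceType={"Value": TARGET_TYPE},
)

# Restart; the attached block storage comes back with the instance.
ec2.start_instances(InstanceIds=[INSTANCE_ID])
ec2.get_waiter("instance_running").wait(InstanceIds=[INSTANCE_ID])
```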

Let's understand the situation

  • So now, after scaling up, we can process more requests, as our number of vCPUs has increased.

  • All the metrics on the dashboard are under their threshold limits.

Concept: Horizontal Scaling

What is horizontal scaling?

Horizontal scaling, or scaling out, is a method of increasing the capacity or performance of a system by adding more hardware or resources to distribute the load. We call this scaling out because we add more servers in parallel to the existing ones.

Process we followed:

  1. We start by creating some general-purpose instances and deploying the application, along with its database, to all the servers.
  2. We then add their public IPs to the DNS system.
  3. Now that there is more than one server, we need a logging and monitoring stack on every server:
  • Centralized logging: a Fluent Bit agent is installed on all the servers to push logs to CloudWatch.
  • Centralized metrics collection: Prometheus and node exporters are added to all the servers, with metrics shipped to a blob store such as S3.

Issues with vertical scaling that horizontal scaling addresses:

  1. Downtime (resizing a single server requires a restart)
  2. Single point of failure
  3. Lack of elasticity
  4. Cost (scaling out with smaller instances is typically more cost-effective)

Let's understand the situation

  • We now have multiple application servers handling incoming requests in a distributed manner.

  • Whenever we need to scale, we just add a new application server and register it in the DNS management dashboard.

  • DNS routes traffic to the application servers in round-robin fashion, each server taking its turn (sketched below).
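
To illustrate that behavior, here is a small Python sketch of how round-robin DNS rotates the record set on each lookup; the IPs are placeholders for our servers' public addresses.

```python
# Sketch of round-robin DNS: each lookup returns the A-record set
# rotated by one, so successive clients start with a different server.
# The IPs are illustrative placeholders.
from itertools import cycle

app_server_ips = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]  # A records in DNS
_rotation = cycle(range(len(app_server_ips)))

def resolve() -> list[str]:
    """Return the record set rotated, as a round-robin DNS answer would be."""
    i = next(_rotation)
    return app_server_ips[i:] + app_server_ips[:i]

for _ in range(4):
    print(resolve()[0])  # 10.0.1.10, 10.0.1.11, 10.0.1.12, 10.0.1.10, ...
```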

Concept: Load Balancer

A load balancer is a networking device or software application designed to distribute incoming network traffic across multiple servers or resources. Its primary purpose is to ensure that no single server bears too much of the traffic load.

Process we followed for utilizing load balancer in our architecture:

  1. Health check endpoint: each application server runs a reverse proxy (nginx), a backend, and a frontend. In the reverse proxy (nginx) config we add a /health-check endpoint that the load balancer uses to determine the server's availability.
  2. Register instances: register all the instances among which the load balancer will route traffic.
  3. Configure the load-balancing algorithm: choose one of the common algorithms used to select which application server receives each request (a combined sketch of steps 1–3 follows this list).
  4. SSL certificate: add an SSL certificate and use HTTPS to encrypt requests.
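
Putting steps 1–3 together, here is a minimal Python sketch of the load balancer's selection loop; the instance addresses and port are assumptions for illustration, and round robin stands in for whichever algorithm is configured.

```python
# Sketch of the load balancer's core loop: a fixed set of registered
# instances, a /health-check probe (step 1), and round-robin selection
# (step 3) that skips unhealthy servers. Addresses are placeholders.
import itertools
import urllib.request

REGISTERED_INSTANCES = ["10.0.1.10:8080", "10.0.1.11:8080", "10.0.1.12:8080"]
_rr = itertools.cycle(REGISTERED_INSTANCES)

def is_healthy(instance: str) -> bool:
    """Probe the /health-check endpoint exposed via each server's nginx."""
    try:
        with urllib.request.urlopen(f"http://{instance}/health-check", timeout=2) as resp:
            return resp.status == 200
    except OSError:
        return False

def pick_backend() -> str:
    """Round-robin over registered instances, skipping unhealthy ones."""
    for _ in range(len(REGISTERED_INSTANCES)):
        candidate = next(_rr)
        if is_healthy(candidate):
            return candidate
    raise RuntimeError("no healthy backend available")
```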

After adding the load balancer, we can make all application servers private, keep the load balancer as the only public-facing component, and put the load balancer's IP in DNS, removing the individual server entries from the DNS dashboard.


Let's understand the situation

  • So far everything works well: the load balancer manages the load across the different application servers.

  • If we have to scale, we simply spin up a new application server, run the script that installs the application, and register the server with the load balancer so it considers the new machine as well.

Stateless Web Tier

From the issue above we can learn that a web server, i.e., a server hosting the application (frontend and backend), should not store any data or state; the web tier should be stateless.
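
A minimal sketch of what this means in practice, with a plain dict standing in for a shared store such as the cache tier:

```python
# Sketch of a stateless handler: no per-user state is kept in process
# memory; session data lives in a shared store, so any server behind
# the load balancer can serve any request. The dict is a stand-in for
# a shared store such as the cache tier.
shared_session_store: dict[str, dict] = {}

def handle_request(session_id: str, payload: dict) -> dict:
    session = shared_session_store.get(session_id, {})  # fetch, don't hold
    session.update(payload)
    shared_session_store[session_id] = session          # write back to the store
    return {"ok": True, "session": session}
```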

Three Tier architecture

From the three-tier system above we can see that the web tier and the data tier are separate, which promotes scalability without data inconsistency.


Let's understand the situation

  • So far everything works well: we have separated the application and database layers.

Database Scaling

The current design is limited by its single database: it lacks failover support and struggles to serve database requests from all the application servers. To address these challenges, various techniques can be employed to scale the database.

Common Scaling Technique: Database Replication

One of the most widely adopted techniques for scaling databases is Database Replication. This approach involves the creation of a master node (write replica) and multiple slave nodes (read replicas).

  1. Master Node (Write Replica):

    • This serves as the primary database instance where write operations occur.
    • It is responsible for handling data modifications and updates.
  2. Slave Nodes (Read Replicas):

    • Multiple read replicas are created to distribute the read workload.
    • These replicas mirror the data from the master node.
    • They can be strategically placed to serve specific application servers or geographical regions.

Failure Scenarios (a routing sketch follows this list):

  1. Master Node Failure:

    • If the master node goes down:
      • A failover mechanism should be in place to promote one of the read replicas to become the new master.
      • The application servers can then redirect write requests to the new master node.
      • This ensures minimal downtime and maintains write functionality.
  2. Slave Node Failure:

    • If a read replica (slave node) goes down:
      • Read traffic can be redirected to other available replicas to ensure continued service.
      • Data redundancy in multiple replicas mitigates the impact of a single node failure on read operations.
      • Regular monitoring and automated recovery processes can be implemented to replace or repair the failed replica.
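
To make the routing concrete, here is a minimal Python sketch of a client-side router for this setup: writes go to the master, reads rotate across the replicas, and a replica can be promoted if the master fails. The connection objects are simplified stand-ins, not a real database driver.

```python
# Sketch of client-side routing for a replicated database: writes hit
# the master, reads rotate over the replicas, and a replica can be
# promoted on master failure. Connections are simplified stand-ins
# exposing an execute() method.
import itertools

class ReplicatedDB:
    def __init__(self, master, replicas):
        self.master = master
        self.replicas = list(replicas)
        self._rr = itertools.cycle(self.replicas)

    def write(self, query):
        return self.master.execute(query)   # all writes go to the master

    def read(self, query):
        replica = next(self._rr)            # spread reads across replicas
        return replica.execute(query)

    def promote_replica(self):
        """Failover: promote one read replica to be the new master."""
        self.master = self.replicas.pop(0)  # assumes another replica remains
        self._rr = itertools.cycle(self.replicas)
```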

Let's understand the situation

  • So far everything works well: we have successfully scaled both the database and the application layer.

Concept: Cache Tier

A cache is like a quick-access storage that keeps important or often-used information in memory. This helps make future requests for that information faster. When a new web page loads, it usually needs data from a database. If we keep asking the database over and over, it can slow things down. But, with a cache, we can avoid this problem.

Now, think of the cache tier as a layer that reduces the burden on the main database and can be scaled independently.

For caching, we have added two main components (a cache-aside sketch follows the list):

  1. CDN (Content Delivery Network): A CDN is a distributed network of servers that helps deliver content, like images and scripts, closer to users. It reduces latency and speeds up content delivery by storing copies of data in various locations.
  2. Cache Node: The Cache Node is a dedicated component responsible for storing frequently accessed data in memory. It acts as a quick-access storage, ensuring faster response times by serving data directly from the cache rather than fetching it from the original source, such as a database.
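
A minimal sketch of the cache-aside pattern this enables: check the cache first, fall back to the database on a miss, then populate the cache. The dict and db_query() below are stand-ins for the cache node and the real database read.

```python
# Sketch of the cache-aside pattern: check the cache first, fall back
# to the database on a miss, then populate the cache so subsequent
# reads skip the database. The dict and db_query() are stand-ins.
cache: dict = {}

def db_query(key):
    return f"value-for-{key}"   # placeholder for the real database read

def get(key):
    if key in cache:            # cache hit: no database round trip
        return cache[key]
    value = db_query(key)       # cache miss: read from the database
    cache[key] = value          # populate the cache for next time
    return value
```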

Let's understand the situation

  • So far everything works well: we have successfully scaled both the database and the application layer.

  • Additionally, because our data is cached, fewer network calls are made, lowering costs and improving read performance.

Concept: Distributed Cache and Consistent Hashing

We now have a large amount of data sitting in a single cache node. To scale, we use sharding: we create multiple shards and distribute the original data across them. To decide which data goes to which shard, we use a technique such as consistent hashing.

Process we followed

  1. First, we create a configuration list containing information about all the hosts where a shard is present.
  2. Second, we install a cache client on every web server; it is used to communicate with the cache servers.
  3. Using consistent hashing, we determine the shard from which to read or to which to write the data (see the ring sketch below).

Note: a cache is a key-value store, like a hash map; to write or read data we need a key.
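
A minimal Python sketch of a consistent-hash ring, assuming hypothetical shard hosts cache-1 through cache-3; virtual nodes are used to smooth the key distribution across shards.

```python
# Minimal consistent-hashing sketch: shard hosts are placed on a hash
# ring; a key is stored on the first host clockwise from its hash, so
# adding or removing a host only remaps neighbouring keys.
import bisect
import hashlib

class ConsistentHashRing:
    def __init__(self, hosts, vnodes=100):
        self._ring = []                      # sorted (hash, host) points
        for host in hosts:
            for i in range(vnodes):          # virtual nodes smooth the spread
                self._ring.append((self._hash(f"{host}#{i}"), host))
        self._ring.sort()

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get_shard(self, key: str) -> str:
        """Return the first ring point clockwise from the key's hash."""
        h = self._hash(key)
        i = bisect.bisect(self._ring, (h, ""))
        return self._ring[i % len(self._ring)][1]

ring = ConsistentHashRing(["cache-1", "cache-2", "cache-3"])
print(ring.get_shard("user:42"))  # the shard this key is read from / written to
```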


Let's understand the situation

  • So far everything works well: we have successfully scaled the database, application, and cache layers in line with the increase in demand.

Concept: Message Queue

A message queue is like a to-do list. Each task you need to do is a message, and the queue is where you keep these messages in order. When you finish one task, you take the next message from the queue and work on that.

The producer is responsible for putting messages/tasks in the queue, and the consumer is responsible for picking up the messages and processing them.

Process we followed:

  1. First we integrated a message broker / message queue into our system, started alongside the application.
  2. Adding the message queue flow to the reports API:
  • when a merchant hits this API to request a report, we create a new entry in the database with report_id, merchant_id, status=PENDING and put all this information in the message queue.
  3. Adding the message queue flow to the image upload API:
  • when a merchant hits this API to upload an image, we create a new entry in the database with merchant_id, image_name, image_binary_data, status=PENDING and put all this information in the message queue.
  4. Next we set up the consumer, also started with the application; the consumer subscribes to and keeps an eye on the queue.
  5. The consumer picks a message / task metadata from the queue, processes it, and once done updates the database entry with status=DONE (sketched below).
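
A minimal sketch of the reports flow, with Python's queue.Queue standing in for the real message broker and a dict standing in for the database table:

```python
# Sketch of the report flow: the API (producer) records a PENDING row
# and enqueues the task; a consumer thread picks it up, processes it,
# and marks it DONE. queue.Queue stands in for the real broker.
import queue
import threading

message_queue = queue.Queue()
reports_db = {}  # stand-in table: report_id -> row

def request_report(report_id: str, merchant_id: str):
    """Producer: called when a merchant hits the reports API."""
    reports_db[report_id] = {"merchant_id": merchant_id, "status": "PENDING"}
    message_queue.put({"report_id": report_id, "merchant_id": merchant_id})

def consumer():
    """Consumer: subscribes to the queue and processes tasks."""
    while True:
        task = message_queue.get()   # blocks until a message arrives
        # ... generate the report here ...
        reports_db[task["report_id"]]["status"] = "DONE"
        message_queue.task_done()

threading.Thread(target=consumer, daemon=True).start()
request_report("r-1", "m-42")
message_queue.join()                 # wait until the task is processed
print(reports_db["r-1"]["status"])   # DONE
```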

Final Thoughts

  1. In conclusion, system design is an iterative process where there's often no absolute right or wrong.
  2. The effectiveness of a design solution depends on various factors, including the specific requirements, constraints, and the dynamic nature of the system.
  3. It's important to note that 'it depends' is a common refrain in system design. Solutions are not one-size-fits-all, and the optimal design choice can vary based on the context, goals, and the unique characteristics of the problem at hand. Remember, the journey of system design is a learning process, and each decision contributes to a deeper understanding of the system's intricacies.
  4. Throughout the design journey, it's crucial to measure trade-offs carefully, considering factors such as performance, scalability, maintainability, and cost.