

Mar 2025 · 10 min read

Load balancing is the process of distributing incoming network traffic across multiple backend servers to ensure efficient utilization, prevent overload, and improve system availability.

Load Balancing

Driptanil Datta, Software Developer


References & Disclaimer

This content is adapted from Mastering System Design from Basics to Cracking Interviews (Udemy). It has been curated and organized for educational purposes on this portfolio. No copyright infringement is intended.

Why Load Balancing is Needed

Load balancing is essential for modern distributed systems:

✅ High Availability – Automatically redirects traffic away from a failed server, minimizing downtime.
✅ Scalability – Allows seamless horizontal scaling by adding more servers to the pool.
✅ Better Performance – Distributes requests to the most available or least loaded servers.
✅ Efficiency – Ensures no single server becomes a bottleneck while others sit idle.
✅ SSL/TLS Termination – Offloads resource-heavy encryption and decryption from application servers.

Types of Load Balancers

Based on Layer

Layer 4 (Transport Layer): Operates at the TCP/UDP level, distributing connections based on network-level data (IP addresses and ports) without inspecting the payload. Fast but less flexible.
Layer 7 (Application Layer): Operates at the HTTP/HTTPS level, making routing decisions based on request content (URL, headers, cookies). More flexible, but parsing each request adds overhead.
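The difference between the two layers can be illustrated with a minimal Python sketch. It is not how any particular load balancer is implemented; the Request class, the pool names, and the path prefixes are all hypothetical and exist only for illustration. The Layer 4 function sees only the client's address and port, while the Layer 7 function also reads the request path.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Sequence


@dataclass
class Request:
    # Fields visible at Layer 4 (network/transport level).
    client_ip: str
    client_port: int
    # Fields visible only at Layer 7, after the HTTP request is parsed.
    path: str = "/"
    headers: dict = field(default_factory=dict)


# Hypothetical backend pools, named for illustration only.
API_POOL = ["api-1", "api-2"]
STATIC_POOL = ["static-1"]
DEFAULT_POOL = ["web-1", "web-2"]


def route_l4(req: Request, pool: Sequence[str]) -> str:
    """Pick a backend from the client address and port alone; the payload is never read."""
    key = f"{req.client_ip}:{req.client_port}".encode()
    digest = hashlib.sha256(key).digest()
    return pool[int.from_bytes(digest[:8], "big") % len(pool)]


def route_l7(req: Request) -> str:
    """Choose a pool from the request path, then pick a backend within that pool."""
    if req.path.startswith("/api/"):
        pool = API_POOL
    elif req.path.startswith("/static/"):
        pool = STATIC_POOL
    else:
        pool = DEFAULT_POOL
    return route_l4(req, pool)


print(route_l7(Request("203.0.113.7", 51000, path="/static/logo.png")))  # static-1
```

The Layer 7 path costs more per request because the balancer has to parse HTTP before it can decide, which is the overhead mentioned above.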

Based on Deployment

Hardware Load Balancers: Specialized physical devices (e.g., F5, Citrix NetScaler).
Software Load Balancers: Applications running on standard hardware (e.g., Nginx, HAProxy, Envoy).
Cloud-based Load Balancers: Managed services such as AWS Elastic Load Balancing (ELB) and Google Cloud Load Balancing.

Load Balancing Strategies

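Static Load Balancing

Static strategies follow a fixed rule and do not look at the current state of the servers.

Round Robin: Sends requests to each server in turn, in a fixed repeating order.
Weighted Load Balancing: Assigns different weights to servers based on their capacity, so larger servers receive proportionally more requests.
IP Hashing: Routes requests based on a hash of the client's IP address, so a given client reaches the same server for as long as the pool is unchanged.

Dynamic Load Balancing

Dynamic strategies use live measurements from the servers to decide where each request goes.

Least Connections: Directs traffic to the server with the fewest active connections.
Least Response Time: Sends requests to the server that is currently responding fastest.
Adaptive Load Balancing: Uses real-time monitoring (for example, CPU load or error rates) to make routing decisions.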
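Three of the strategies above (round robin, weighted distribution, and IP hashing) can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions: the server names are made up, the weighted variant simply repeats each server according to its weight rather than interleaving smoothly, and SHA-256 is used only because it gives a hash that stays stable across processes.

```python
import hashlib
import itertools
from typing import Dict, Iterator, Sequence


def round_robin(servers: Sequence[str]) -> Iterator[str]:
    """Yield servers in a fixed repeating order."""
    return itertools.cycle(servers)


def weighted_round_robin(weights: Dict[str, int]) -> Iterator[str]:
    """Yield each server `weight` times per cycle (simple, non-interleaved variant)."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def ip_hash(client_ip: str, servers: Sequence[str]) -> str:
    """Map a client IP to a server with a stable hash (not Python's salted hash())."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


servers = ["s1", "s2", "s3"]  # hypothetical server names

rr = round_robin(servers)
print([next(rr) for _ in range(5)])
# ['s1', 's2', 's3', 's1', 's2']

wrr = weighted_round_robin({"big": 3, "small": 1})
print([next(wrr) for _ in range(8)])
# ['big', 'big', 'big', 'small', 'big', 'big', 'big', 'small']

# The same client IP always maps to the same server while the pool is unchanged.
print(ip_hash("198.51.100.4", servers) == ip_hash("198.51.100.4", servers))  # True
```

Note that the modulo-based hash remaps most clients whenever a server is added or removed; schemes such as consistent hashing exist to reduce that churn.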

Interview Questions & Answers

1. What is load balancing, and why is it important?

Load balancing is the process of distributing incoming network traffic across multiple backend servers to ensure efficient utilization, prevent overload, and improve system availability.

Ensures High Availability: Prevents downtime by redirecting traffic in case of server failure.
Optimizes Resource Utilization: Spreads requests evenly.
Improves Performance: Reduces latency by routing to the best-performing servers.
Enhances Scalability: Supports horizontal scaling.

2. Explain the difference between Layer 4 and Layer 7 load balancing.

Layer 4: Faster, operates at TCP/UDP, uses IP/Port data.
Layer 7: Intelligent, operates at HTTP/HTTPS, uses content/headers.

3. How does a load balancer handle high availability and failover?

Health Checks: Continuously monitoring server health.
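Automatic Failover: Removing a server from rotation once it fails its health checks and redirecting traffic to the remaining healthy servers.
Session Persistence: Keeping a user's requests on the same server (Sticky Sessions). For a session to survive failover, its state must also be stored or replicated outside that server.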
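The interplay of health checks and failover can be sketched as follows. This is a simplified model, not a production design: the HealthCheckedPool class, the fail_threshold parameter, and the simulated probe are assumptions made for illustration, and a real balancer would run its checks on a timer against a real endpoint.

```python
from typing import Callable, Dict, List


class HealthCheckedPool:
    """Round-robin pool that skips servers marked unhealthy."""

    def __init__(self, servers: List[str], probe: Callable[[str], bool],
                 fail_threshold: int = 2) -> None:
        self.servers = list(servers)
        self.probe = probe                    # returns True if the server responds
        self.fail_threshold = fail_threshold  # consecutive failures before removal
        self.failures: Dict[str, int] = {s: 0 for s in servers}
        self._next = 0

    def run_health_checks(self) -> None:
        """Probe every server once; a success resets its failure count."""
        for s in self.servers:
            self.failures[s] = 0 if self.probe(s) else self.failures[s] + 1

    def healthy(self) -> List[str]:
        return [s for s in self.servers if self.failures[s] < self.fail_threshold]

    def pick(self) -> str:
        """Return the next healthy server, or raise if none are left."""
        candidates = self.healthy()
        if not candidates:
            raise RuntimeError("no healthy backends")
        server = candidates[self._next % len(candidates)]
        self._next += 1
        return server


# Simulated probe: "s2" is down. A real probe might open a TCP connection
# or request a health endpoint.
down = {"s2"}
pool = HealthCheckedPool(["s1", "s2", "s3"], probe=lambda s: s not in down)

pool.run_health_checks()
pool.run_health_checks()  # second consecutive failure takes s2 out of rotation
print(pool.healthy())     # ['s1', 's3']

down.clear()              # s2 recovers
pool.run_health_checks()
print(pool.healthy())     # ['s1', 's2', 's3']
```

Requiring more than one failed check before removal trades a slightly slower failover for fewer false alarms caused by a single dropped probe.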

4. Compare Round Robin and Least Connections strategies.

Round Robin: Circular distribution. Best for uniform workloads.
Least Connections: Dynamic distribution. Best for long-running requests.
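A small simulation makes the contrast concrete. The sketch below is a toy model with made-up numbers: two hypothetical servers, one request arriving per tick, and every other request being long-running. It only tracks how much work each server is handed under each strategy.

```python
import itertools
from typing import Dict, List


def least_connections(active: Dict[str, int]) -> str:
    """Return the server with the fewest active connections (ties go to the first listed)."""
    return min(active, key=active.get)


servers = ["s1", "s2"]
# Request durations in arbitrary ticks: every other request is long-running.
durations = [10, 1, 10, 1, 10, 1]


def simulate(strategy: str) -> Dict[str, int]:
    """Assign one request per tick and return the total work each server received."""
    rr = itertools.cycle(servers)
    # Finish times of the connections currently open on each server.
    finishing: Dict[str, List[int]] = {s: [] for s in servers}
    work = {s: 0 for s in servers}
    for tick, duration in enumerate(durations):
        # Drop connections that have already finished.
        for s in servers:
            finishing[s] = [t for t in finishing[s] if t > tick]
        if strategy == "round_robin":
            target = next(rr)
        else:
            target = least_connections({s: len(finishing[s]) for s in servers})
        finishing[target].append(tick + duration)
        work[target] += duration
    return work


print(simulate("round_robin"))        # {'s1': 30, 's2': 3}
print(simulate("least_connections"))  # {'s1': 21, 's2': 12}
```

With this workload, round robin happens to send every long request to s1, while least connections notices that s1 is still busy and shifts some of the long requests to s2.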

5. How does a load balancer improve security?

DDoS Protection: Absorbing traffic spikes across the server pool and filtering out malicious traffic before it reaches the backends.
SSL Termination: Offloading decryption from backend servers.
Rate Limiting: Preventing abuse by limiting requests per IP.
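Per-IP rate limiting can be sketched with a fixed-window counter. This is one of several possible algorithms (token bucket and sliding window are common alternatives), and the FixedWindowRateLimiter class and its parameters are hypothetical names chosen for this example. The current time is passed in explicitly so the behavior is easy to follow.

```python
from typing import Dict, Tuple


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per client IP in each `window_seconds` window."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        # Maps client IP -> (window index, request count in that window).
        self.counters: Dict[str, Tuple[int, int]] = {}

    def allow(self, client_ip: str, now: float) -> bool:
        """Return True if the request is within the limit; `now` is a timestamp in seconds."""
        window = int(now // self.window_seconds)
        seen_window, count = self.counters.get(client_ip, (window, 0))
        if seen_window != window:
            count = 0  # a new window started, so reset the counter
        if count >= self.limit:
            self.counters[client_ip] = (window, count)
            return False
        self.counters[client_ip] = (window, count + 1)
        return True


limiter = FixedWindowRateLimiter(limit=3, window_seconds=60)

print([limiter.allow("203.0.113.9", now=t) for t in (0, 1, 2, 3)])
# [True, True, True, False]

print(limiter.allow("203.0.113.9", now=61))  # True: a new window has started
print(limiter.allow("198.51.100.1", now=3))  # True: a different IP has its own counter
```

A fixed window is simple but allows short bursts around window boundaries, which is one reason other algorithms are often preferred in practice.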

Final Thoughts

If you want speed and simple distribution, go for Layer 4. If you want intelligent routing and deep inspection, Layer 7 is the way to go.