The Complete AWS Elastic Load Balancing (ELB) Tutorial (2025): Types, Architecture, Routing, Scaling & Real-World DevOps Use Cases
Last Updated: 2025 | Reading Time: 18 minutes | By Srikanth Ch, Senior DevOps Engineer
Introduction to AWS Elastic Load Balancing
Picture this scenario. You’ve just deployed your shiny new application on an EC2 instance. Traffic starts flowing, users are happy, and everything looks great. Then suddenly, a viral tweet sends thousands of users your way. Your single instance chokes, response times spike, and your application crashes. That’s the exact moment you wish you had a load balancer sitting in front of your infrastructure.
AWS Elastic Load Balancing (ELB) is the managed service that prevents this nightmare scenario. Think of it as a traffic police officer standing at a busy intersection, directing cars (requests) to whichever lane (server) has the least congestion. Without that officer, you’d have traffic jams, accidents, and frustrated drivers everywhere.
But ELB does much more than just distribute traffic. It monitors the health of your targets, automatically routes around failures, and scales with your application’s demands. In production environments, running without a load balancer is like driving without insurance—everything’s fine until it isn’t.
Real-world scenarios where ELB saves the day:
- Routing traffic only to healthy EC2 instances while unhealthy ones recover
- Acting as the front door for microservices architectures
- Load balancing containerized workloads on ECS and EKS
- Absorbing sudden traffic spikes during flash sales or viral moments
- Enabling zero-downtime deployments through blue/green strategies
If you’re preparing for the AWS Solutions Architect (SAA-C03) or DevOps Engineer Professional certification, understanding ELB inside out is non-negotiable. Let’s dive deep into how it all works.
ELB Architecture Overview
Before we explore the different types of load balancers, you need to understand what’s happening under the hood. A load balancer isn’t just a single component—it’s an orchestrated system of parts working together.
What Does a Load Balancer Actually Do?
At its core, a load balancer accepts incoming traffic (ingress), decides where to send it (routing), verifies that destinations are healthy (health checks), and forwards the request to the appropriate backend (targets). This happens in milliseconds, thousands of times per second.
Core Components of AWS ELB
Load Balancer Nodes
When you create an ELB, AWS provisions load balancer nodes in each Availability Zone you specify. These nodes are the actual infrastructure handling your traffic. More AZs mean better fault tolerance but also more cost.
Listeners
A listener is a process that checks for connection requests. You configure listeners with a protocol and port—for example, HTTPS on port 443. When traffic hits that port, the listener evaluates the rules to determine where to route it.
Target Groups
Target groups are logical collections of targets (EC2 instances, containers, Lambda functions, or IP addresses) that receive traffic from the load balancer. You can have multiple target groups behind a single load balancer, each serving different purposes.
Health Checks
ELB continuously pings your targets to verify they’re healthy. If a target fails health checks, ELB stops sending traffic to it until it recovers. This is your automatic failover mechanism.
Availability Zones
For high availability, you should always deploy your load balancer across multiple AZs. If one AZ goes down, traffic automatically routes to healthy targets in other AZs.
Understanding Authentication vs Authorization
When configuring HTTPS listeners, it’s important to distinguish between authentication (verifying who the user is) and authorization (determining what they’re allowed to do). ALB can handle authentication natively through OIDC integration, but authorization logic typically lives in your application.
The Traffic Flow
Here’s the path every request takes:
Client → DNS Resolution → Load Balancer → Listener Evaluation →
Target Group Selection → Health Check Verification → Target (EC2/ECS/Lambda)
Understanding this flow helps you troubleshoot when things go wrong. If users report errors, you can trace the request path to identify exactly where the failure occurred.
Types of Elastic Load Balancers
AWS offers four types of load balancers, each designed for specific use cases. Choosing the right one is critical for both performance and cost optimization.
Application Load Balancer (ALB) — Layer 7
The Application Load Balancer operates at the application layer (Layer 7 of the OSI model), which means it understands HTTP and HTTPS traffic. This intelligence enables sophisticated routing decisions based on the actual content of requests.
What makes ALB special:
- Path-based routing: Route /api/* requests to your API servers and /images/* to your CDN origin
- Host-based routing: Send api.example.com to one target group and admin.example.com to another
- HTTP header-based routing: Route based on custom headers, cookies, or query strings
- WebSocket support: Maintain persistent connections for real-time applications
- gRPC support: Native support for gRPC protocol used in modern microservices
- Built-in authentication: Integrate with Cognito or any OIDC-compliant identity provider
Ideal use cases for ALB:
- Microservices architectures requiring content-based routing
- Container workloads on ECS or EKS
- Applications using WebSockets or Server-Sent Events
- Multi-tenant SaaS applications with host-based routing
- API gateways needing request inspection
Network Load Balancer (NLB) — Layer 4
The Network Load Balancer operates at the transport layer (Layer 4), handling TCP, UDP, and TLS traffic. It doesn’t inspect packet contents—it just moves them extremely fast.
What makes NLB special:
- Extreme performance: Handles millions of requests per second with ultra-low latency
- Static IP addresses: Each NLB gets a static IP per AZ, crucial for whitelisting
- Elastic IP support: Bring your own IPs for even more control
- Preserves source IP: Target instances see the client’s actual IP address
- TCP/UDP passthrough: No protocol modification or termination
Ideal use cases for NLB:
- Gaming servers requiring UDP support
- Financial trading platforms needing microsecond latency
- IoT backends handling millions of device connections
- Applications requiring static IPs for firewall whitelisting
- TCP-based protocols like SMTP, MQTT, or custom protocols
Gateway Load Balancer (GWLB) — Layer 3
The Gateway Load Balancer is the newest addition, designed specifically for deploying, scaling, and managing third-party virtual appliances. It operates at Layer 3 (network layer) and uses the GENEVE protocol on port 6081 for encapsulation—a detail that frequently appears in advanced networking exams and is critical for firewall rule configurations.
What makes GWLB special:
- Transparent inspection: Traffic flows through appliances without network changes
- Horizontal scaling: Automatically scales your security appliances
- High availability: Monitors appliance health and routes around failures
- Single entry/exit point: Simplifies network architecture
Ideal use cases for GWLB:
- Deploying next-generation firewalls (Palo Alto, Fortinet, Check Point)
- Intrusion Detection/Prevention Systems (IDS/IPS)
- Deep packet inspection appliances
- Network traffic analysis and monitoring
- Compliance-required traffic inspection
Classic Load Balancer (CLB) — Deprecated
The Classic Load Balancer is AWS’s original offering, supporting both Layer 4 and Layer 7. AWS has marked it as deprecated, and you shouldn’t use it for new deployments.
Why you might still encounter CLB:
- Legacy applications that haven’t been migrated
- Older AWS accounts with long-running infrastructure
- Some edge cases where migration isn’t worth the effort
💡 Reflection Prompt: Which ELB type would you choose for a real-time multiplayer game using WebSockets for chat but requiring UDP for game state synchronization? Think about whether you’d need one or multiple load balancers.
ELB Quick Comparison: ALB vs NLB vs GWLB
Before diving into the technical details, here’s a quick reference table to help you choose the right load balancer. Bookmark this one—it’s a lifesaver during architecture reviews and certification exams.
| Feature | ALB (Layer 7) | NLB (Layer 4) | GWLB (Layer 3) |
|---|---|---|---|
| OSI Layer | Application (L7) | Transport (L4) | Network (L3) |
| Protocols | HTTP, HTTPS, gRPC, WebSocket | TCP, UDP, TLS | IP (GENEVE on port 6081) |
| Performance | High | Ultra-high (millions req/sec) | High |
| Latency | Low | Ultra-low (microseconds) | Low |
| Static IP | ❌ No (DNS name only) | ✅ Yes (Static + Elastic IP) | ❌ No |
| Preserves Source IP | Via X-Forwarded-For header | ✅ Yes (native) | ✅ Yes |
| Path-Based Routing | ✅ Yes | ❌ No | ❌ No |
| Host-Based Routing | ✅ Yes | ❌ No | ❌ No |
| Health Check Level | HTTP/HTTPS endpoints | TCP/HTTP/HTTPS | TCP/HTTP/HTTPS |
| SSL/TLS Termination | ✅ Yes | ✅ Yes (TLS listener) | ❌ No |
| WAF Integration | ✅ Yes | ❌ No | ❌ No |
| Authentication (OIDC) | ✅ Yes | ❌ No | ❌ No |
| Primary Use Cases | Microservices, Web Apps, APIs | Gaming, Finance, IoT, TCP/UDP | Firewalls, IDS/IPS, Traffic Inspection |
| Cross-Zone LB Default | Enabled | Disabled | Disabled |
| Pricing Model | Hours + LCUs | Hours + NLCUs | Hours + GLCUs (endpoints billed separately) |
Quick Decision Framework:
- Need content-based routing or authentication? → ALB
- Need static IPs, UDP, or extreme performance? → NLB
- Need to inspect traffic with security appliances? → GWLB
- Running legacy EC2-Classic workloads? → CLB (but plan migration)
Listeners, Rules & Target Groups
This is where ELB’s power really shines. Understanding how listeners, rules, and target groups work together lets you build sophisticated routing architectures.
Listener Protocols
Different load balancer types support different protocols:
ALB Listeners: HTTP (80), HTTPS (443)
NLB Listeners: TCP, UDP, TLS, TCP_UDP
GWLB Listeners: GENEVE (6081)
When you create an HTTPS listener, you attach an SSL/TLS certificate (typically from AWS Certificate Manager). The load balancer terminates TLS, decrypts the traffic, and forwards it to targets—this is called TLS termination or SSL offloading.
Path-Based Routing Example
Imagine you’re running an e-commerce platform. You want different backend services handling different URL paths:
/api/* → API Service Target Group (port 3000)
/checkout/* → Payment Service Target Group (port 8080)
/static/* → Static Assets Target Group (port 80)
/* → Default Web App Target Group (port 80)
Each rule evaluates in priority order. When a request for /api/products arrives, the first matching rule routes it to your API service.
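The priority-ordered rule evaluation described above can be sketched in a few lines of Python. The priorities, patterns, and target group names below are illustrative, and `fnmatch` stands in for ALB's path-pattern matching:

```python
from fnmatch import fnmatch

# Hypothetical listener rules, evaluated in priority order (lowest first),
# mirroring the e-commerce example above. Names and priorities are made up.
RULES = [
    (10, "/api/*", "api-tg"),            # API Service target group
    (20, "/checkout/*", "payments-tg"),  # Payment Service target group
    (30, "/static/*", "static-tg"),      # Static Assets target group
    (50_000, "/*", "default-tg"),        # default rule, always matches last
]

def route(path: str) -> str:
    """Return the target group for the first rule whose pattern matches."""
    for _priority, pattern, target_group in sorted(RULES):
        if fnmatch(path, pattern):
            return target_group
    raise LookupError("no matching rule")  # unreachable with a default rule

print(route("/api/products"))   # api-tg
print(route("/checkout/cart"))  # payments-tg
print(route("/index.html"))     # default-tg
```

The key behavior to internalize: evaluation stops at the first match, so a broad pattern with a low priority number can shadow more specific rules below it.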
Host-Based Routing Example
For multi-tenant applications or when consolidating multiple services behind one ALB:
api.example.com → API Target Group
admin.example.com → Admin Dashboard Target Group
blog.example.com → WordPress Target Group
*.example.com → Default Target Group
This approach reduces costs by eliminating the need for separate load balancers per service.
Weighted Target Routing
This newer feature enables canary deployments directly at the load balancer level:
Target Group v1 (current version): 90% traffic
Target Group v2 (new version): 10% traffic
Gradually shift traffic from v1 to v2, monitoring for errors. If something goes wrong, quickly route all traffic back to v1.
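A toy simulation of that 90/10 split (target group names are made up; ALB performs a weighted per-request choice in much this spirit, though the real algorithm is internal to AWS):

```python
import random

# Illustrative canary split between two hypothetical target groups.
WEIGHTS = {"v1-tg": 90, "v2-tg": 10}

def pick_target_group(rng: random.Random) -> str:
    """Choose a target group with probability proportional to its weight."""
    groups, weights = zip(*WEIGHTS.items())
    return rng.choices(groups, weights=weights, k=1)[0]

rng = random.Random(42)  # seeded only so the demo is reproducible
sample = [pick_target_group(rng) for _ in range(10_000)]
v2_share = sample.count("v2-tg") / len(sample)
print(f"v2 share: {v2_share:.1%}")  # ~10%
```

To "shift" the canary forward, you would update the weights (say 70/30, then 50/50) while watching error-rate metrics on the v2 target group.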
Target Group Types
Target groups can contain different types of targets:
Instance targets: Traffic routes to EC2 instances by instance ID. Health checks hit the instance directly.
IP targets: Traffic routes to specific IP addresses. Useful for on-premises servers, containers with awsvpc networking, or cross-VPC routing.
Lambda targets: ALB invokes Lambda functions directly. Great for serverless backends without API Gateway overhead.
Health Check Configuration
Health checks are your early warning system. Configure them thoughtfully:
Protocol: Usually HTTP or HTTPS for ALB, TCP for NLB
Path: For HTTP checks, the endpoint to hit (e.g., /health or /api/status)
Healthy threshold: How many consecutive successes before marking healthy (typically 2-3)
Unhealthy threshold: How many failures before marking unhealthy (typically 2-3)
Timeout: How long to wait for a response (typically 5-10 seconds)
Interval: Time between checks (typically 30 seconds, minimum 5 seconds)
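Here is a minimal, illustrative model of how the healthy/unhealthy thresholds interact: consecutive results in one direction flip the target's status, and any contrary result resets the count.

```python
from dataclasses import dataclass

@dataclass
class TargetHealth:
    """Toy model of ELB health-check threshold counting (not AWS code)."""
    healthy_threshold: int = 3
    unhealthy_threshold: int = 2
    healthy: bool = True
    _streak: int = 0  # consecutive results opposing the current status

    def record(self, check_passed: bool) -> bool:
        """Record one health-check result; return the current healthy status."""
        if check_passed == self.healthy:
            self._streak = 0  # status confirmed; reset the counter
        else:
            self._streak += 1
            needed = self.unhealthy_threshold if self.healthy else self.healthy_threshold
            if self._streak >= needed:
                self.healthy = not self.healthy
                self._streak = 0
        return self.healthy

t = TargetHealth()
t.record(False)         # 1st failure: still healthy
print(t.record(False))  # 2nd consecutive failure: False (now unhealthy)
t.record(True); t.record(True)
print(t.record(True))   # 3 consecutive passes: True (healthy again)
```

Notice the asymmetry you can configure: with these numbers a target is pulled out of rotation after 2 failures but must prove itself with 3 passes before traffic returns.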
💡 Reflection Prompt: How would you design health checks differently for a production-facing API versus a background job worker? Consider what “healthy” means for each.
ELB Security Best Practices
Security in load balancing isn’t optional—it’s foundational. Your load balancer is often the entry point to your entire infrastructure.
HTTPS Everywhere
Never expose HTTP endpoints to the internet in production. Always terminate TLS at your load balancer:
- Create or import certificates in AWS Certificate Manager (ACM)
- Attach certificates to your HTTPS listeners
- Redirect HTTP (port 80) to HTTPS (port 443) using listener rules
- Enforce TLS 1.2 minimum—disable TLS 1.0 and 1.1
🔒 Security Tip: Always enable TLS 1.2 or higher. Disable outdated ciphers like RC4 and 3DES. Use the ELBSecurityPolicy-TLS-1-2-2017-01 policy or newer.
Security Groups for ALB
ALBs use security groups to control inbound traffic. A typical configuration:
Inbound rules:
- HTTPS (443) from 0.0.0.0/0 (public internet)
- HTTP (80) from 0.0.0.0/0 (for redirect rule only)
Backend security groups should:
- Allow traffic only from the ALB’s security group
- Never allow direct internet access to backend instances
NLB Security Considerations
Historically, NLBs didn't use security groups—they were transparent, so security groups on your target instances had to allow traffic from client IP addresses directly, since NLB preserves source IPs. Since 2023, AWS supports attaching security groups to NLBs at creation time, which simplifies this considerably.
For older or security-group-less NLBs that are private, allowing client IPs directly is fine. For public ones, you'll need to allow 0.0.0.0/0 on the target port, which might feel uncomfortable. Compensate with a security group on the NLB itself where possible, or with NACLs (Network ACLs) at the subnet level.
AWS WAF Integration
AWS Web Application Firewall (WAF) integrates directly with ALB to protect against common attacks:
- SQL injection
- Cross-site scripting (XSS)
- Known malicious IP addresses
- Geographic restrictions
- Rate limiting
Deploy WAF rules based on your application’s risk profile. Start with AWS Managed Rules and customize from there.
Logging for Security and Compliance
ALB Access Logs: Detailed request-level logging to S3, capturing client IP, request path, response codes, and latency. Enable these for security forensics and troubleshooting.
NLB Flow Logs: VPC Flow Logs capture network-level information. Less detailed than ALB logs but useful for network analysis.
Enable logging from day one—you can’t retroactively capture logs from before they were enabled.
ELB Monitoring, Logging & Troubleshooting
When things break at 3 AM, you need to find the problem fast. Proper monitoring setup is your flashlight in the dark.
CloudWatch Metrics That Matter
For ALB:
- RequestCount: Total requests processed
- TargetResponseTime: How long targets take to respond (watch for spikes)
- HTTPCode_ELB_5XX: Errors generated by the load balancer itself
- HTTPCode_Target_5XX: Errors returned by your targets
- HealthyHostCount / UnhealthyHostCount: Target health status
- ActiveConnectionCount: Current concurrent connections
For NLB:
- ProcessedBytes: Data throughput
- ActiveFlowCount: Current TCP/UDP flows
- NewFlowCount: New connections per second
- TCP_Client_Reset_Count: Connection resets from clients
- TCP_Target_Reset_Count: Connection resets from targets
Deciphering 4xx vs 5xx Errors
4xx errors (client errors):
- 400: Bad request (malformed syntax)
- 401/403: Authentication/Authorization issues
- 404: Target returned not found
- 460: Client closed connection before load balancer could respond
- 463: X-Forwarded-For header with too many IPs
5xx errors (server errors):
- 500: Target returned internal error
- 502: Bad gateway (target sent invalid response)
- 503: Service unavailable (no healthy targets or target group not configured)
- 504: Gateway timeout (target didn’t respond in time)
When troubleshooting 504 errors, check target response times. If targets are legitimately slow, increasing the load balancer's idle timeout buys headroom—but treat that as a stopgap and investigate why targets exceed it.
Using Access Logs Effectively
Access logs answer the question “what exactly happened?” Enable them to S3:
bucket-name/prefix/AWSLogs/aws-account-id/elasticloadbalancing/region/yyyy/mm/dd/
Each log entry includes timestamp, client IP, target IP, request processing time, target processing time, response code, and more. Use Amazon Athena to query logs at scale.
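If you'd rather script a quick check than set up Athena, entries can be parsed with `shlex`, since quoted fields contain spaces. The field list below covers only the first dozen-plus documented columns, and the sample line is synthetic:

```python
import shlex

# Field names for the leading columns of an ALB access-log entry,
# following the documented format (the real format has more trailing fields).
FIELDS = [
    "type", "time", "elb", "client_port", "target_port",
    "request_processing_time", "target_processing_time",
    "response_processing_time", "elb_status_code", "target_status_code",
    "received_bytes", "sent_bytes", "request", "user_agent",
]

def parse_entry(line: str) -> dict:
    """Split a log line on spaces, honoring double-quoted fields."""
    return dict(zip(FIELDS, shlex.split(line)))

# Synthetic sample line for illustration only.
sample = (
    'https 2025-01-15T10:00:02.345Z app/my-alb/50dc6c495c0c9188 '
    '203.0.113.10:54321 10.0.1.5:80 0.001 0.012 0.000 200 200 120 512 '
    '"GET https://example.com:443/api/products HTTP/1.1" "curl/8.0"'
)
entry = parse_entry(sample)
print(entry["elb_status_code"], entry["request"])
```

A loop over a downloaded log file plus a counter on `elb_status_code` is often enough to spot an error spike before you reach for heavier tooling.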
Request Tracing with X-Amzn-Trace-Id
ALB automatically adds the X-Amzn-Trace-Id header to requests. Use this to correlate logs across your entire stack—from the load balancer through your application to downstream services.
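The trace id follows the X-Ray-style `Root=1-<epoch hex>-<96-bit random hex>` shape. This sketch generates one locally purely to illustrate the format—ALB creates the real ones for you:

```python
import os
import re
import time

def make_trace_id(now=None):
    """Build a trace id in the Root=1-<8 hex epoch secs>-<24 hex random>
    shape used by ALB/X-Ray; for format illustration only."""
    epoch = int(now if now is not None else time.time())
    return f"Root=1-{epoch:08x}-{os.urandom(12).hex()}"

trace = make_trace_id()
print(trace)
assert re.fullmatch(r"Root=1-[0-9a-f]{8}-[0-9a-f]{24}", trace)
```

The practical takeaway: log this header verbatim in every service the request touches, and a single grep reconstructs the request's full path through your stack.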
CloudTrail for ELB Events
CloudTrail captures API calls made to ELB. Track who created, modified, or deleted load balancers, target groups, and listeners. Essential for security audits and change management.
💡 Reflection Prompt: You’re seeing intermittent 504 errors during peak traffic. The surge queue length metric is increasing. What’s likely happening, and how would you fix it?
ELB with EC2, ECS, EKS, and Lambda
Load balancers don’t exist in isolation—they integrate with your compute layer. Here’s how different services work together.
Auto Scaling Groups + ALB
The classic pattern. When you attach an Auto Scaling Group to an ALB target group:
- New instances automatically register with the target group
- Terminated instances automatically deregister
- Health checks from ALB can trigger instance replacement
- Scaling policies respond to load balancer metrics
Configure your ASG to use ELB health checks instead of just EC2 health checks. An instance might be running but your application could be broken—ALB health checks catch this.
ECS Services with ALB/NLB
For containerized workloads, ECS integrates beautifully with load balancers:
- Dynamic port mapping: Containers can run on any port; ECS registers them automatically
- Service discovery: Combined with Cloud Map for internal service-to-service communication
- Blue/green deployments: Use CodeDeploy with ALB for zero-downtime deployments
When creating an ECS service, specify the target group and container port. ECS handles the rest.
EKS Ingress with ALB
In Kubernetes land, the AWS Load Balancer Controller creates and manages ALBs for your Ingress resources:
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    # Newer AWS Load Balancer Controller versions prefer spec.ingressClassName: alb
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
spec:
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-service
                port:
                  number: 80
```
The controller watches for Ingress objects and creates corresponding ALB resources.
Lambda Behind ALB
ALB can invoke Lambda functions directly, bypassing API Gateway:
- Lower cost for simple use cases
- No API Gateway features (throttling, caching, request validation)
- Great for serverless microservices behind a shared load balancer
Register your Lambda function as a target, and ALB converts HTTP requests to Lambda events.
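A minimal local sketch of that contract: ALB hands your function an event with `httpMethod`, `path`, `headers`, and `body`, and expects a `statusCode`/`body` response. The event below is hand-built for testing; field names follow the documented ALB event shape, but the ARN and values are placeholders.

```python
import json

def handler(event, context=None):
    """Toy Lambda handler for ALB-shaped events (illustrative, not production)."""
    if event["path"] == "/health":
        return {
            "statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"status": "ok"}),
            "isBase64Encoded": False,
        }
    return {"statusCode": 404, "body": "not found", "isBase64Encoded": False}

# Hand-built sample event mimicking what ALB delivers.
sample_event = {
    "requestContext": {"elb": {"targetGroupArn": "arn:aws:elasticloadbalancing:placeholder"}},
    "httpMethod": "GET",
    "path": "/health",
    "headers": {},
    "body": "",
    "isBase64Encoded": False,
}
print(handler(sample_event)["statusCode"])  # 200
```

Being able to invoke the handler locally with a fabricated event like this makes unit testing ALB-fronted Lambdas straightforward.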
Cross-Zone Load Balancing
By default, each load balancer node distributes traffic only to targets in its own Availability Zone. With cross-zone load balancing enabled, traffic distributes evenly across all registered targets in all AZs.
When to enable: When you have uneven target distribution across AZs
When to disable: For data locality requirements or cost optimization (NLB charges for cross-zone data transfer)
ALB has cross-zone enabled by default. NLB has it disabled by default (to avoid cross-AZ data transfer charges).
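A quick calculation makes the cross-zone trade-off concrete. This toy model assumes DNS spreads clients evenly across one load balancer node per AZ (a simplification of real ELB behavior):

```python
def per_target_share(az_targets: dict, cross_zone: bool) -> dict:
    """Per-target traffic fraction, keyed by AZ (every target in an AZ
    gets the same share). Assumes even client distribution across AZ nodes."""
    total_targets = sum(az_targets.values())
    if cross_zone:
        # Every node spreads its traffic over all registered targets.
        return {az: 1 / total_targets for az in az_targets}
    # Each AZ's node gets an equal slice and splits it among its own targets.
    az_slice = 1 / len(az_targets)
    return {az: az_slice / count for az, count in az_targets.items()}

targets = {"az-a": 2, "az-b": 1}  # uneven placement: 2 targets vs 1
print(per_target_share(targets, cross_zone=False))  # az-a: 25% each, az-b: 50%
print(per_target_share(targets, cross_zone=True))   # every target ~33%
```

With cross-zone disabled, the lone AZ-B target absorbs twice the load of its AZ-A peers—exactly the hot-spot scenario that bites small, unevenly placed clusters.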
Advanced ELB Features
Once you’ve mastered the basics, these advanced features let you build sophisticated architectures.
Sticky Sessions (Session Affinity)
When your application stores session state locally (not ideal, but common), sticky sessions ensure a user’s requests always go to the same target:
- Duration-based stickiness: ALB generates a cookie; you set the expiration
- Application-based stickiness: Your app generates the cookie; ALB respects it
Sticky sessions can cause uneven load distribution. If one target gets “stuck” with heavy users, it becomes overloaded while others sit idle.
⚠️ Deployment Warning: Sticky sessions can break zero-downtime deployment strategies. During blue/green or rolling deployments, users with active sessions remain “stuck” on old instances. Even after new instances are healthy, these users won’t see the updated version until their session cookie expires or they clear cookies. For true zero-downtime deployments, externalize session state to ElastiCache or DynamoDB instead of relying on sticky sessions.
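Duration-based stickiness boils down to "no cookie → balance and set one; cookie → honor it." A toy model of that logic (the AWSALB cookie name matches what ALB uses; the targets and routing are illustrative):

```python
import random

TARGETS = ["i-aaa", "i-bbb", "i-ccc"]  # hypothetical instance IDs

def route_with_stickiness(cookies: dict, rng: random.Random):
    """Return (target, cookies_to_set). A cookie pins the session; its
    absence triggers normal load balancing plus a Set-Cookie."""
    if "AWSALB" in cookies:
        return cookies["AWSALB"], {}      # sticky: same target as before
    target = rng.choice(TARGETS)          # fresh session: balance normally
    return target, {"AWSALB": target}

rng = random.Random(7)
first_target, set_cookies = route_with_stickiness({}, rng)
second_target, _ = route_with_stickiness(set_cookies, rng)
print(first_target == second_target)  # True: the session is pinned
```

This also illustrates the deployment hazard above: as long as the cookie survives, the client keeps returning to the pinned target regardless of what's been deployed elsewhere.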
Connection Draining (Deregistration Delay)
When you remove a target from a target group (during deployments or scale-in), existing connections need to complete gracefully. Deregistration delay gives in-flight requests time to finish.
Default: 300 seconds
Minimum: 0 seconds (immediate deregistration)
Maximum: 3600 seconds
Set this based on your longest expected request duration. Too short and you’ll see dropped connections; too long and deployments take forever.
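A back-of-the-envelope way to pick the value: count how many in-flight requests would be cut off at a given delay. The request durations here are made up:

```python
def dropped_connections(inflight_seconds: list, deregistration_delay: int) -> int:
    """Toy model: how many in-flight requests get cut off if the target
    deregisters now and the LB waits `deregistration_delay` seconds."""
    return sum(1 for remaining in inflight_seconds if remaining > deregistration_delay)

# Seconds remaining per active request (illustrative numbers).
inflight = [0.2, 1.5, 12.0, 25.0, 310.0]
print(dropped_connections(inflight, 0))    # 5: immediate deregistration drops all
print(dropped_connections(inflight, 30))   # 1: only the 310 s request is cut
print(dropped_connections(inflight, 300))  # 1
```

If your long tail is a handful of slow reports, it may be cheaper to let those rare requests fail than to make every deployment wait out a long delay.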
ALB OIDC Authentication
ALB can authenticate users before requests reach your application. Important: OIDC authentication requires an HTTPS listener—it cannot be configured on HTTP listeners. This is a critical implementation detail that catches many engineers off guard.
Here’s how the authentication flow works:
- User hits ALB on HTTPS listener
- ALB redirects to identity provider (Cognito, Okta, Auth0, etc.)
- User authenticates with the IdP
- ALB validates the token and forwards the authenticated request to targets
- ALB includes user claims in X-Amzn-Oidc-* headers
This offloads authentication logic from your application entirely.
WebSockets and gRPC Support
ALB natively supports WebSocket connections for real-time applications. Once the HTTP upgrade completes, ALB maintains the persistent connection.
For gRPC, ALB supports HTTP/2 end-to-end. Set the target group's protocol version to gRPC so health checks and routing understand gRPC status codes.
ALB + CloudFront
Placing CloudFront in front of ALB provides:
- Global edge caching for static content
- DDoS protection via AWS Shield
- Geographic restrictions
- Additional WAF deployment point
- Custom error pages
Use origin request policies to control which headers CloudFront forwards to ALB.
NLB TCP/UDP Passthrough
NLB doesn’t modify packets—it passes them directly to targets. This transparency is crucial for:
- Protocols that embed IP addresses in payloads
- Applications requiring client source IP
- Custom protocols that ALB doesn’t support
GWLB Traffic Mirroring
GWLB can mirror traffic to security appliances without affecting the primary traffic path. Useful for:
- Passive monitoring and analysis
- Compliance logging
- Threat detection without inline inspection
🧠 Quiz: A target fails health checks, but the Auto Scaling group considers all instances healthy. Traffic stops flowing to that target, but ASG doesn’t replace it. Why? (Hint: There are two different types of health checks at play.)
Answer: This happens when the Auto Scaling group is configured to use EC2 status checks instead of ELB health checks. EC2 status checks only verify that the underlying instance is running and passing system/instance status checks—they don’t know anything about your application. Meanwhile, ELB health checks probe your actual application endpoint (like /health).
So when your application crashes but the EC2 instance keeps running, the ELB marks the target as unhealthy and stops sending traffic. But from the ASG’s perspective using EC2 checks, the instance looks perfectly healthy because it’s still “running.”
The fix: Configure your Auto Scaling group to use ELB health checks instead of (or in addition to) EC2 checks. In the ASG settings, set Health Check Type to ELB. Now when ELB marks a target unhealthy, ASG will terminate and replace it automatically.
Common Mistakes to Avoid
Learning from others’ mistakes saves you 3 AM debugging sessions. Here are the pitfalls I’ve seen (and made) repeatedly.
Using ALB for high-throughput UDP workloads
ALB only supports HTTP/HTTPS. For UDP, you need NLB. I’ve seen teams spend days troubleshooting “connectivity issues” before realizing they chose the wrong load balancer type.
Missing or incorrect health check path
If your health check path returns 404, all targets show unhealthy, and you get 503 errors. Always verify your health check endpoint exists and returns 200.
Not enabling cross-zone load balancing on small clusters
With 2 instances in AZ-A and 1 instance in AZ-B, traffic splits 50/50 between AZs. That single instance in AZ-B gets 50% of traffic while the others share the other 50%. Enable cross-zone for even distribution.
Wrong target type: IP vs Instance
If you register an instance by IP address, you can’t also register it by instance ID in the same target group. Pick one approach and stick with it.
Using Classic Load Balancer for new applications
Just don’t. CLB is deprecated. Every feature CLB has, ALB or NLB does better.
Deregistration delay too low
Setting this to 0 for “faster deployments” causes connection drops. Users see errors during deployments. Keep it at least 30 seconds for most applications.
Subnet misconfiguration
Load balancers need properly configured subnets—an ALB requires subnets in at least two AZs, and internet-facing load balancers need public subnets. I’ve seen “load balancer creation failed” errors traced back to subnet routing table misconfigurations.
Ignoring security group rules on targets
Your beautiful ALB setup means nothing if target security groups don’t allow traffic from the ALB. Always verify end-to-end connectivity.
ELB Pricing and Optimization
Understanding ELB pricing prevents budget surprises and helps you optimize costs.
Pricing Components
Load Balancer Hours: You pay for each hour (or partial hour) that a load balancer runs. Running 24/7? That’s 720 hours/month.
Load Balancer Capacity Units (LCUs): This is where ALB pricing gets interesting. You pay based on the dimension consuming the most LCUs:
- New connections per second
- Active connections per minute
- Processed bytes
- Rule evaluations per second
Data Processing: NLB bills usage through its own capacity units (NLCUs), where processed bytes are usually the dominant dimension. ALB folds data processing into its LCU calculations.
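The "max over dimensions" LCU rule is easy to model. The per-LCU capacities below are AWS's published figures at the time of writing—treat them as assumptions and verify against the current pricing page before using this for real estimates:

```python
# Per-LCU capacities for ALB (assumed from published pricing; verify current values).
LCU_CAPACITY = {
    "new_connections_per_sec": 25,
    "active_connections_per_min": 3_000,
    "processed_gb_per_hour": 1.0,
    "rule_evaluations_per_sec": 1_000,  # first 10 processed rules are free
}

def lcus_consumed(usage: dict) -> float:
    """LCUs for the hour = max over dimensions of usage / capacity."""
    return max(usage[dim] / cap for dim, cap in LCU_CAPACITY.items())

usage = {
    "new_connections_per_sec": 50,        # 2.0 LCUs
    "active_connections_per_min": 9_000,  # 3.0 LCUs  <-- dominant dimension
    "processed_gb_per_hour": 2.5,         # 2.5 LCUs
    "rule_evaluations_per_sec": 500,      # 0.5 LCUs
}
print(lcus_consumed(usage))  # 3.0
```

Running this against your own CloudWatch numbers tells you which dimension is driving the bill—the same question the "Monitor LCU metrics" tip below asks you to answer.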
ALB vs NLB Cost Comparison
ALB tends to be more expensive for high-throughput, simple workloads because LCU charges can add up quickly with heavy traffic.
NLB has a simpler pricing model and often costs less for raw throughput. But if you need Layer 7 features, you’re paying for ALB regardless.
Real-World Optimization Example
A team running a real-time chat application initially deployed on ALB. Their LCU charges were dominated by active connections—chat users maintain persistent WebSocket connections.
After analysis, they migrated to NLB with TCP passthrough. Result? 40% cost reduction because NLB handles persistent connections more efficiently for their use case.
Cost Optimization Tips
- Consolidate load balancers: Use path-based and host-based routing to serve multiple services from one ALB
- Right-size your architecture: Don’t over-provision AZs you don’t need
- Use NLB for simple TCP workloads: If you don’t need Layer 7 features, NLB is often cheaper
- Monitor LCU metrics: CloudWatch shows exactly which dimension drives your ALB costs
- Delete unused load balancers: They charge even with zero traffic
Visual Reference: AWS ELB Architecture
Caption: AWS Elastic Load Balancing Architecture — Requests flowing through Load Balancer to Target Groups across Availability Zones
A simplified view of the architecture:
┌─────────────────────────────────────────────────────────────────┐
│ INTERNET │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ ELASTIC LOAD BALANCER │
│ (ALB / NLB / GWLB — Your VPC Entry Point) │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Listener │ │ Listener │ │ Listener │ │
│ │ HTTPS:443 │ │ HTTP:80 │ │ TCP:3306 │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────────┘
│
┌───────────────┼───────────────┐
▼ ▼ ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Target Group │ │ Target Group │ │ Target Group │
│ (API Service) │ │ (Web App) │ │ (Database) │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│ │ │
┌─────────┴─────────┐ │ ┌─────────┴─────────┐
▼ ▼ ▼ ▼ ▼
┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐
│ EC2 │ │ EC2 │ │ ECS │ │ Lambda │
│ AZ-A │ │ AZ-B │ │ Fargate│ │ │
└────────┘ └────────┘ └────────┘ └────────┘
Conclusion: Mastering ELB for Production-Ready Architectures
Elastic Load Balancing is the unsung hero of highly available AWS architectures. It’s not glamorous, it doesn’t generate buzz on Twitter, but it’s the foundation that keeps your applications running when everything else tries to fail.
Here’s what you should take away from this guide:
Choose wisely: ALB for Layer 7 intelligence, NLB for raw performance and static IPs, GWLB for security appliances.
Design for failure: Configure health checks properly, enable cross-zone load balancing, and plan for AZ outages.
Secure by default: TLS everywhere, WAF protection, proper security groups, and comprehensive logging.
Monitor proactively: CloudWatch dashboards, access logs, and alerts before users notice problems.
Optimize continuously: Review LCU usage, consolidate where possible, and right-size for your actual traffic patterns.
Whether you’re building your first production deployment or architecting enterprise-scale systems, ELB skills compound over time. Every application you build reinforces these patterns until they become second nature.
Frequently Asked Questions (FAQs)
What is AWS Elastic Load Balancing?
AWS Elastic Load Balancing (ELB) is a managed service that automatically distributes incoming application traffic across multiple targets—such as EC2 instances, containers, and Lambda functions—in one or more Availability Zones. It improves application fault tolerance, scales with traffic demands, and ensures high availability by routing traffic only to healthy targets.
What are the types of AWS load balancers?
AWS offers four types of load balancers: Application Load Balancer (ALB) for HTTP/HTTPS traffic at Layer 7 with advanced routing capabilities; Network Load Balancer (NLB) for TCP/UDP traffic at Layer 4 with extreme performance; Gateway Load Balancer (GWLB) for deploying and scaling third-party virtual appliances; and Classic Load Balancer (CLB), which is deprecated and not recommended for new deployments.
When should I use ALB vs NLB?
Use ALB when you need Layer 7 features like path-based routing, host-based routing, WebSocket support, or authentication integration. Use NLB when you need extreme performance (millions of requests per second), static IP addresses, UDP support, or when you’re working with non-HTTP protocols. ALB inspects request content; NLB just moves packets fast.
How do health checks work in ELB?
ELB health checks periodically send requests to registered targets to verify they’re operational. You configure the protocol (HTTP, HTTPS, TCP), path for HTTP checks, port, healthy/unhealthy thresholds, timeout, and interval. When a target fails the unhealthy threshold number of consecutive checks, ELB marks it unhealthy and stops routing traffic to it until it passes the healthy threshold number of checks.
Is ELB free in AWS?
No, ELB is not free. You pay for load balancer running hours plus usage-based charges. ALB charges based on Load Balancer Capacity Units (LCUs) measuring new connections, active connections, processed bytes, and rule evaluations. NLB uses the equivalent NLCU model, where processed bytes are usually the dominant dimension. Pricing varies by region, and the AWS Free Tier includes limited ALB/NLB usage for new accounts during the first 12 months.
Can I use multiple target groups with one load balancer?
Yes, you can associate multiple target groups with a single ALB or NLB using listener rules. This enables sophisticated routing patterns where different URL paths or hostnames route to different backend services, reducing infrastructure costs and simplifying management.
What’s the difference between cross-zone and same-zone load balancing?
With same-zone load balancing (cross-zone disabled), each load balancer node distributes traffic only to targets in its own Availability Zone. With cross-zone load balancing enabled, traffic is distributed evenly across all registered targets in all enabled AZs, regardless of which AZ the request entered through. ALB has cross-zone enabled by default; NLB has it disabled by default due to data transfer cost implications.
This guide is part of the AWS DevOps Learning Path on TheDevOpsTooling.com. For more hands-on tutorials, certification prep materials, and real-world DevOps guides, explore our complete course catalog.
Published by Srikanth Ch | Senior DevOps Engineer | TheDevOpsTooling.com
