AWS NACL Tutorial: Deny Rules & VPC Flow Logs (Hands-On Lab)

Introduction

I still remember the 2 AM call that taught me everything I needed to know about Network ACLs. Production traffic was dropping silently, Security Groups looked perfect, and everyone was scratching their heads. Turns out, someone had added a NACL rule with a lower number than expected, and it was blocking traffic before our allow rules ever got evaluated.

That night cost us hours of troubleshooting and a lot of coffee. This lab exists so you never have to learn these lessons the hard way.

In Lab 2.2, you’re going to intentionally break things. You’ll create explicit DENY rules in Network ACLs, watch traffic disappear into a blackhole, and then use VPC Flow Logs to figure out exactly what happened. This is the kind of troubleshooting knowledge that separates engineers who panic during outages from those who calmly diagnose and fix the problem.

Here’s the thing about NACLs that trips up most engineers: they’re stateless. If you’ve been working primarily with Security Groups, your brain is wired for stateful filtering. You allow inbound traffic on port 443, and the response automatically flows back out. NACLs don’t work that way. You need explicit rules in both directions, and forgetting this causes more production incidents than I care to count.

This lab is designed for AWS practitioners who understand the basics of VPC networking and want to develop real troubleshooting skills. Whether you’re preparing for the Solutions Architect exam or building production infrastructure, the concepts here will serve you well.

A quick note on NACLs versus Security Groups: Security Groups are your first line of defense at the instance level. NACLs operate at the subnet boundary. Think of Security Groups as the lock on your apartment door and NACLs as the security desk in the building lobby. Both matter, but they catch different threats at different points.

🔒 Architect Tip: If inbound traffic is blocked by a Network ACL, it never reaches your EC2 instance. Security Groups don’t even get a chance to evaluate it. Always check NACLs first when troubleshooting connectivity issues.

The most common mistake I see beginners make? They configure Security Groups, test their application, everything works, and they never touch NACLs. Then months later, someone adds a NACL rule for a compliance requirement, and suddenly traffic breaks in ways that make no sense if you’re only looking at Security Groups.

Lab Overview

In this hands-on AWS networking lab, you’ll build and intentionally misconfigure Network ACL rules to understand exactly how traffic gets blocked at the subnet level. You’ll enable VPC Flow Logs and learn to read the ACCEPT and REJECT entries that tell you precisely what’s happening to your packets.

By the end of this step-by-step AWS tutorial, you’ll have practical skills in creating DENY rules with proper rule numbering, understanding why stateless filtering requires matching inbound and outbound rules, using VPC Flow Logs to diagnose blocked traffic, and recognizing the difference between NACL blocks and Security Group blocks.

These skills matter in production. During security audits, you’ll need to demonstrate defense-in-depth with both NACLs and Security Groups. When implementing zero-trust architectures, NACLs provide subnet-level isolation that Security Groups alone cannot achieve. And when traffic mysteriously stops flowing, Flow Logs become your best friend for root cause analysis.

I’ve used these exact techniques to diagnose outages where traffic was silently dropped, compliance scenarios requiring explicit deny-all-except policies, and incident response situations where we needed to blackhole malicious traffic immediately.

Prerequisites

Before starting this DevOps hands-on lab, ensure you have an active AWS account with permissions to manage VPC, EC2, and CloudWatch resources. You’ll need an existing VPC with at least one public subnet and one private subnet. Two EC2 instances are required for testing, ideally one in each subnet. A basic understanding of Security Groups is assumed, as we’ll be comparing NACL behavior to what you already know. Finally, you should have AWS CLI configured if you want to follow along with the command-line examples.

If you haven’t set up a VPC yet, check out our foundational lab: VPC Fundamentals.

Step-by-Step Hands-On Lab

Step 1: Navigate to the VPC Console and Locate Network ACLs

Open the AWS Console and navigate to VPC. In the left navigation panel, click on Network ACLs under the Security section.

You’ll see your existing NACLs listed here. Every VPC comes with a default NACL that allows all traffic in both directions. This is by design, but it’s also why many engineers forget NACLs exist until they create custom ones.

Click on the default NACL associated with your VPC. Note the inbound and outbound rules, specifically rule 100 allowing all traffic and the asterisk rule denying everything else. The asterisk rule is the implicit deny that catches anything no numbered rule matched.

A common misconfiguration here is assuming the default NACL is restrictive. It isn’t. If you need restrictive subnet-level filtering, you must create a custom NACL.

Step 2: Identify Subnet-to-NACL Associations

With your NACL selected, click the Subnet associations tab. This shows you which subnets are protected by this NACL.

Here’s something that catches people: a subnet can only be associated with one NACL at a time, but one NACL can protect multiple subnets. If you change the NACL association for a subnet, the old rules stop applying immediately. There’s no overlap or transition period.

Document which subnets are associated with which NACLs. In troubleshooting scenarios, verifying this association is often your first step.

Step 3: Create Explicit Inbound DENY Rules

Now we’re going to intentionally block traffic. Select your NACL and click Edit inbound rules.

Add a new rule with these settings: Rule number 50, Type as Custom TCP, Port range 80, Source 0.0.0.0/0, and Action DENY.

Why rule number 50? Because NACLs evaluate rules in order, lowest number first. If you have an ALLOW rule at 100 for all traffic, putting your DENY at 50 means it gets evaluated first. The DENY wins.

Click Save changes. You’ve just blocked all inbound HTTP traffic (TCP port 80) to any instance in subnets associated with this NACL.
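If you prefer the command line, the same rule can be sketched with the AWS CLI. This is a minimal sketch, not the only way to do it: the NACL ID below is a hypothetical placeholder, and the run helper with its APPLY variable is an illustrative convention of this example rather than an AWS feature. By default the script only prints the command so you can review it before touching a live NACL.

```shell
# Sketch: CLI equivalent of the console rule above (rule 50, inbound TCP/80, DENY).
# NACL_ID is a hypothetical placeholder; replace it with your own NACL ID.
NACL_ID="${NACL_ID:-acl-0123456789abcdef0}"

# Illustrative helper: print the command unless APPLY=1 is set.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

deny_inbound_http() {
  # --protocol 6 is TCP; --ingress makes this an inbound rule.
  run aws ec2 create-network-acl-entry \
    --network-acl-id "$NACL_ID" \
    --ingress \
    --rule-number 50 \
    --protocol 6 \
    --port-range From=80,To=80 \
    --cidr-block 0.0.0.0/0 \
    --rule-action deny
}

deny_inbound_http
```

Run it once as-is to check the printed command, then again with APPLY=1 and your real NACL_ID exported to create the rule.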

The misconfiguration pitfall here is rule number precedence. I’ve seen engineers add DENY rules with numbers higher than existing ALLOW rules, then wonder why traffic still flows. Rule order matters enormously with NACLs.

Step 4: Create Matching Outbound Rules

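Here’s where stateless behavior matters. Add an outbound DENY rule to pair with it: Rule number 50, Type as Custom TCP, Port range 1024-65535, Destination 0.0.0.0/0, and Action DENY.

Wait, why port range 1024-65535? Because NACL port ranges match a packet’s destination port, and HTTP responses aren’t addressed to port 80. The client connects from an ephemeral port, and the server’s reply goes back to that port. With stateless filtering, you need to handle these response ports explicitly. Be aware that this rule is deliberately heavy-handed: it drops the replies for every inbound TCP connection, including your own SSH sessions, and it blocks outbound connections to any destination port above 1023.

This is the single most common NACL mistake, usually in reverse. Engineers allow inbound traffic on port 443, forget to allow outbound ephemeral ports, and break return traffic.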
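To make the two-direction requirement concrete, here is a small offline simulation. It is a simplified model I’m assuming for illustration, not AWS code: each rule is just "number action from-port to-port", matching is on destination port only (protocol and CIDR are ignored), and the evaluate function and rule sets are hypothetical names.

```shell
# evaluate RULES DST_PORT: print the action of the lowest-numbered rule whose
# port range contains DST_PORT, or DENY if none matches (the implicit * rule).
evaluate() {
  printf '%s\n' "$1" | sort -n | awk -v port="$2" '
    NF == 4 && port >= $3 && port <= $4 { print $2; found = 1; exit }
    END { if (!found) print "DENY" }
  '
}

# Hypothetical rule sets for a web server subnet.
INBOUND='100 ALLOW 443 443'
OUTBOUND_BROKEN='100 ALLOW 443 443'
OUTBOUND_FIXED='100 ALLOW 443 443
110 ALLOW 1024 65535'

# The request arrives addressed to port 443; the reply leaves addressed to
# the client ephemeral port (49152 here).
echo "request in, dst 443: $(evaluate "$INBOUND" 443)"
echo "reply out, dst 49152, no ephemeral rule: $(evaluate "$OUTBOUND_BROKEN" 49152)"
echo "reply out, dst 49152, with ephemeral rule: $(evaluate "$OUTBOUND_FIXED" 49152)"
```

The request is allowed in both cases; only the rule set with an outbound ephemeral range lets the reply leave the subnet.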

Step 5: Demonstrate Stateless Behavior Through Testing

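These tests assume the instance’s Security Group allows SSH and HTTP. With both DENY rules in place, start from a machine outside the subnet and request the instance on port 80 (replace <instance-public-ip> with your instance’s address):

curl http://<instance-public-ip>

The connection should hang or time out, because the inbound DENY at rule 50 drops the request at the subnet boundary. Now here’s the interesting part: try to SSH into the instance. That hangs too, even though no rule mentions port 22. The request gets in, but the instance’s replies are addressed to your client’s ephemeral port, and the outbound DENY on 1024-65535 drops them.

Now remove the outbound DENY rule but keep the inbound DENY. SSH works again. From the instance, curl an external endpoint over HTTP and HTTPS:

curl http://example.com

curl https://example.com

Both succeed, including the request to port 80. The inbound DENY only matches packets whose destination port is 80, and the responses to these requests come back addressed to the instance’s ephemeral port, which inbound rule 100 still allows. The external curl to the instance on port 80 keeps failing.

Understanding this difference is crucial for troubleshooting. A timeout looks the same in every case, but which rule causes it depends on the direction of the connection and the destination port of each packet.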

Step 6: Blackhole Traffic Intentionally

Let’s create a complete blackhole. Add inbound and outbound DENY rules for all traffic: Rule number 10, Type as All traffic, Source/Destination 0.0.0.0/0, and Action DENY.

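Every instance in this subnet is now cut off from the rest of the network. No traffic crosses the subnet boundary in either direction, although instances inside the same subnet can still reach each other because NACLs only filter traffic entering or leaving the subnet. This is useful for incident response when you need to quarantine a compromised instance immediately, as long as you accept that its subnet neighbors are cut off too.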

In production, I’ve used this technique to isolate instances suspected of being part of a botnet while forensics teams investigated. It’s faster than terminating the instance and preserves evidence.
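During an incident you rarely want to click through the console, so here is a hedged CLI sketch of the same blackhole. The NACL ID is a hypothetical placeholder, and the run helper with APPLY is this example’s own dry-run convention. Remember that the rules apply to every subnet associated with that NACL.

```shell
# Sketch: deny all traffic in both directions at rule number 10.
# NACL_ID is a hypothetical placeholder; replace it with your own NACL ID.
NACL_ID="${NACL_ID:-acl-0123456789abcdef0}"

# Illustrative helper: print the command unless APPLY=1 is set.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

blackhole() {
  # One entry per direction; protocol -1 means all protocols, so no port range.
  for direction in --ingress --egress; do
    run aws ec2 create-network-acl-entry \
      --network-acl-id "$NACL_ID" \
      "$direction" \
      --rule-number 10 \
      --protocol -1 \
      --cidr-block 0.0.0.0/0 \
      --rule-action deny
  done
}

blackhole
```

To lift the quarantine, delete both entries again; aws ec2 delete-network-acl-entry takes the same --network-acl-id, direction flag, and --rule-number.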

Step 7: Enable VPC Flow Logs

Navigate to your VPC, select it, and click the Flow logs tab. Click Create flow log.

Configure with Filter set to All to capture both accepted and rejected traffic. Destination should be CloudWatch Logs, and you’ll need to create a new log group called something like /vpc/flow-logs/lab-2-2.

The IAM role needs permissions to publish to CloudWatch Logs. AWS provides a sample policy in the documentation.

Set Maximum aggregation interval to 1 minute for faster feedback during this lab. In production, 10 minutes is often sufficient and costs less.
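The console steps above map onto two CLI calls. Treat this as a sketch under stated assumptions: the VPC ID and IAM role ARN are hypothetical placeholders, the role is assumed to already exist with permission to publish to CloudWatch Logs, and the run helper with APPLY is this example’s own dry-run convention.

```shell
# Sketch: create the log group and a 1-minute flow log for the lab VPC.
# VPC_ID and ROLE_ARN are hypothetical placeholders; substitute your own.
VPC_ID="${VPC_ID:-vpc-0123456789abcdef0}"
ROLE_ARN="${ROLE_ARN:-arn:aws:iam::123456789012:role/lab-flow-logs-role}"
LOG_GROUP="/vpc/flow-logs/lab-2-2"

# Illustrative helper: print the command unless APPLY=1 is set.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

create_lab_flow_log() {
  run aws logs create-log-group --log-group-name "$LOG_GROUP"
  # --traffic-type ALL captures ACCEPT and REJECT; 60 seconds is the
  # 1-minute aggregation interval (600 is the 10-minute alternative).
  run aws ec2 create-flow-logs \
    --resource-type VPC \
    --resource-ids "$VPC_ID" \
    --traffic-type ALL \
    --log-destination-type cloud-watch-logs \
    --log-group-name "$LOG_GROUP" \
    --deliver-logs-permission-arn "$ROLE_ARN" \
    --max-aggregation-interval 60
}

create_lab_flow_log
```

Expect a short delay after creation before the first records show up in the log group.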

Step 8: Analyze ACCEPT vs REJECT Entries

Generate some traffic from your EC2 instance, then navigate to CloudWatch Logs and find your log group.

Flow Log entries look like this:

2 123456789012 eni-abc123 10.0.1.50 10.0.2.100 443 49152 6 10 840 1616173200 1616173260 ACCEPT OK

The key fields are the source and destination IPs, ports, and the action (ACCEPT or REJECT). When you see REJECT for traffic you expect to work, you’ve found your smoking gun.

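Parse these entries carefully. In the default format, the source port comes before the destination port, so this record is traffic from port 443 to port 49152. The destination port 49152 is an ephemeral port, indicating this is a response packet. If you’re seeing REJECT on response packets, check the outbound NACL rules on the server’s subnet (or the inbound rules on the client’s subnet) for ephemeral port ranges.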
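Reading positional fields by eye is error-prone, so it can help to label them. The sketch below assumes the default flow log format, whose fields are, in order: version, account-id, interface-id, srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action, log-status. The describe_record function is a hypothetical helper name, and the record is the sample entry from this step.

```shell
# Sample record from this step (default flow log format).
RECORD='2 123456789012 eni-abc123 10.0.1.50 10.0.2.100 443 49152 6 10 840 1616173200 1616173260 ACCEPT OK'

# describe_record LINE: print the fields that matter for troubleshooting,
# pairing each address with its own port.
describe_record() {
  printf '%s\n' "$1" | awk '{
    printf "interface=%s src=%s:%s dst=%s:%s protocol=%s packets=%s bytes=%s action=%s\n",
           $3, $4, $6, $5, $7, $8, $9, $10, $13
  }'
}

describe_record "$RECORD"
# prints: interface=eni-abc123 src=10.0.1.50:443 dst=10.0.2.100:49152 protocol=6 packets=10 bytes=840 action=ACCEPT
```

Labelled this way, the direction is obvious: the packet leaves port 443 on 10.0.1.50 and is addressed to ephemeral port 49152 on 10.0.2.100. If you configured a custom log format, the field positions will differ.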

📍 Technical Note: VPC Flow Logs record traffic at the Elastic Network Interface (ENI) level, and a REJECT entry means the traffic was denied by either a Security Group or a Network ACL. The record doesn’t say which one. To tell them apart, use statefulness: if a request is logged as ACCEPT but its response is logged as REJECT, a NACL is responsible, because a Security Group would have allowed the return traffic automatically.

Real Lab Experiences: Architect Insights

Let me share some war stories that might save you hours of troubleshooting.

The phantom connectivity issue happened when an engineer added a DENY rule at rule number 90 to block a specific IP range. The problem was that the existing ALLOW rule for the same port was at rule number 100. Everything looked correct at first glance, but the DENY was winning because 90 is less than 100. The fix was simple once we understood it: renumber the ALLOW to 80. But finding it took two hours of checking Security Groups that were perfectly fine.

The ephemeral port disaster is one I see at least once a quarter. A security team adds NACL rules to meet compliance requirements. They add explicit ALLOW rules for inbound 443 and outbound 443. Tests pass initially because the first connection works. But within minutes, connections start failing randomly. The issue? Outbound 443 isn’t where responses go. They go to ephemeral ports 1024-65535. The security team didn’t understand stateless filtering.

My advice to junior engineers is to always check NACLs before Security Groups when troubleshooting. For inbound traffic, NACLs are evaluated first. If traffic is blocked at the NACL, it never reaches your instance, and Security Groups are irrelevant. Get in the habit of running aws ec2 describe-network-acls as your first troubleshooting command.

Validation and Testing

From your EC2 instance, run these tests to confirm NACL behavior:

# Test outbound HTTP (fails if outbound port 80 or inbound ephemeral ports are denied)
curl -v --connect-timeout 5 http://example.com

# Test outbound HTTPS (works if outbound port 443 and inbound ephemeral ports are allowed)
curl -v --connect-timeout 5 https://example.com

# Test ICMP (ping) to verify all-traffic blocks
ping -c 4 8.8.8.8

# Test connectivity to another instance in a different subnet
telnet 10.0.2.50 22

When traffic is blocked by NACLs, you’ll typically see connection timeouts rather than connection refused messages. This is because the packets are silently dropped. Security Groups drop packets silently too, so a timeout alone doesn’t tell you which layer is responsible. Connection refused means the packet reached the destination host, which actively rejected it, usually because nothing is listening on that port.
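If you script these checks, curl’s exit status carries the same signal: 28 means the operation timed out and 7 means the connection failed outright (for example, it was refused). The helper names below are hypothetical, and the cause hints are rules of thumb rather than a diagnosis.

```shell
# classify_curl_exit CODE: map a curl exit status to a likely cause.
classify_curl_exit() {
  case "$1" in
    0)  echo "connected" ;;
    7)  echo "connection failed or refused: not a silent drop" ;;
    28) echo "timeout: packets silently dropped, check NACLs and Security Groups" ;;
    *)  echo "other curl error: $1" ;;
  esac
}

# probe URL: run curl quietly and classify the result.
# Defined for use on the instance; not called here because it needs network access.
probe() {
  curl -s -o /dev/null --connect-timeout 5 "$1"
  classify_curl_exit "$?"
}

classify_curl_exit 28
classify_curl_exit 7
```

On the instance you would call something like probe http://example.com and act on the printed hint.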

Troubleshooting Guide

When traffic fails despite open Security Groups, run this command to inspect your NACLs:

aws ec2 describe-network-acls --filters "Name=vpc-id,Values=vpc-xxxxxxxx"

Look at the rule numbers and their order. Remember, lower numbers are evaluated first. Check both inbound and outbound rules for the relevant ports.

If you suspect missing ephemeral port rules, verify that your outbound rules allow ports 1024-65535 for response traffic. Many compliance templates forget this.

For rule number precedence issues, map out your rules from lowest to highest. A DENY at rule 50 will override an ALLOW at rule 100 for matching traffic.
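Mapping rules out by hand gets tedious, so here is a small offline sketch of first-match evaluation. It is a simplified model assumed for illustration: rules are written as "number action from-port to-port", only the destination port is matched, and explain is a hypothetical helper name.

```shell
# explain RULES DST_PORT: list rules in evaluation order, mark the first
# match, and print the verdict (DENY via the implicit * rule if none match).
explain() {
  printf '%s\n' "$1" | sort -n | awk -v port="$2" '
    NF == 4 {
      hit = (!done && port >= $3 && port <= $4)
      printf "%5d  %-5s  %5d-%-5d%s\n", $1, $2, $3, $4, (hit ? "  <- first match" : "")
      if (hit) { done = 1; verdict = $2 }
    }
    END { print "verdict for port " port ": " (done ? verdict : "DENY (implicit *)") }
  '
}

# DENY numbered below the ALLOW: the DENY is evaluated first.
RULES_DENY_FIRST='100 ALLOW 0 65535
50 DENY 80 80'

# DENY numbered above the ALLOW: the ALLOW matches first and the DENY never runs.
RULES_DENY_LATE='100 ALLOW 0 65535
150 DENY 80 80'

explain "$RULES_DENY_FIRST" 80
explain "$RULES_DENY_LATE" 80
```

The first call ends with a DENY verdict and the second with ALLOW, which is exactly the precedence trap: the same DENY rule is effective at 50 and dead code at 150.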

When traffic flows to the wrong NACL, verify subnet associations:

aws ec2 describe-network-acls --query 'NetworkAcls[*].Associations'

Make sure your subnet is associated with the NACL you think it is.

For Flow Log analysis, look for REJECT entries:

aws logs filter-log-events --log-group-name /vpc/flow-logs/lab-2-2 --filter-pattern "REJECT"

Each REJECT entry tells you exactly which traffic was blocked. Match the source IP, destination IP, and ports to your NACL rules (and your Security Group rules, since both can produce a REJECT) to identify the culprit.
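Once you have records exported one per line, a little awk can do the matching for you. The sketch below uses hypothetical sample records in the default format and a hypothetical helper name. It lists every REJECT and flags the ones whose reverse direction was ACCEPTed, which is a heuristic for a stateless filter: a Security Group would have allowed the return traffic automatically, so a NACL is the likely cause.

```shell
# Hypothetical sample records in the default flow log format.
SAMPLE='2 123456789012 eni-abc123 203.0.113.10 10.0.1.50 51000 80 6 3 180 1616173200 1616173260 REJECT OK
2 123456789012 eni-abc123 203.0.113.10 10.0.1.50 51001 443 6 10 840 1616173200 1616173260 ACCEPT OK
2 123456789012 eni-abc123 10.0.1.50 203.0.113.10 443 51001 6 8 620 1616173200 1616173260 REJECT OK'

# summarize_rejects: read records on stdin, print each REJECTed flow, and
# note when the opposite direction of the same flow was ACCEPTed.
summarize_rejects() {
  awk '
    { key = $4 ":" $6 " -> " $5 ":" $7; action[key] = $13 }
    $13 == "REJECT" { order[++n] = key; rev[key] = $5 ":" $7 " -> " $4 ":" $6 }
    END {
      for (i = 1; i <= n; i++) {
        k = order[i]
        note = (action[rev[k]] == "ACCEPT") ? "  (reverse direction ACCEPTed: stateless filter suspected)" : ""
        print k note
      }
    }
  '
}

printf '%s\n' "$SAMPLE" | summarize_rejects
```

In the sample, the rejected request to port 80 prints without a note, while the rejected reply from port 443 is flagged because its request was accepted, pointing at a missing outbound ephemeral port rule.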

AWS Best Practices: Solutions Architect Perspective

From a security standpoint, implement defense-in-depth by using both NACLs and Security Groups. NACLs provide subnet-level protection against entire CIDR ranges, while Security Groups offer instance-level granularity.

In large-scale environments on Amazon Web Services, NACLs are often mandated by compliance teams for regulatory frameworks like PCI-DSS and HIPAA. Having explicit DENY rules documented and auditable satisfies many compliance requirements that Security Groups alone cannot address.

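For reliability, document the purpose of each NACL rule and use consistent rule numbering schemes. NACL entries have no description field, so keep the rationale in tags on the NACL, in your infrastructure-as-code, or in a runbook. I recommend increments of 10 (10, 20, 30) to leave room for inserting rules later without renumbering everything.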

Operational excellence demands that you enable VPC Flow Logs on all production VPCs. The cost is minimal compared to the troubleshooting time saved. Store logs in S3 for long-term retention and CloudWatch for real-time analysis.

Cost optimization for Flow Logs involves using 10-minute aggregation intervals in production and filtering to capture only rejected traffic if you’re watching costs closely. However, capturing all traffic provides better debugging capability.

From a performance standpoint, NACLs add negligible latency. They’re enforced by the VPC networking layer rather than on the instance itself, so they don’t consume instance resources.

When designing for multi-account architectures, consider that NACLs are VPC-specific. If you’re using AWS Transit Gateway or VPC Peering, traffic crossing these connections is still subject to NACLs in each VPC. Plan your rule sets accordingly.

Tag your NACLs with environment, owner, and purpose tags. During an outage, being able to quickly identify which NACL belongs to which team saves precious minutes.

AWS NACL Interview Questions

If you’re preparing for AWS Solutions Architect or DevOps Engineer interviews, these questions come up frequently. I’ve asked many of these myself when hiring.

Q1: What happens if you have a DENY rule at rule number 100 and an ALLOW rule at rule number 200 for the same traffic?

The DENY rule wins. NACLs evaluate rules in ascending order by rule number, and the first matching rule is applied. Since 100 comes before 200, traffic matching both rules will be denied. This is one of the most common misconfigurations in production environments.

Q2: An application works fine with Security Groups but breaks when you add NACLs. What’s the most likely cause?

Missing outbound rules for ephemeral ports. Security Groups are stateful, so return traffic is automatically allowed. NACLs are stateless, meaning you need explicit outbound rules for ports 1024-65535 to allow response traffic back to clients.

Q3: How would you quickly isolate a compromised EC2 instance at the network level?

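Create a dedicated “quarantine” NACL with DENY rules for all traffic at rule number 1 (the lowest number, so it is evaluated first). Associate the compromised instance’s subnet with this NACL, keeping in mind that every other instance in that subnet is cut off too. An instance can’t be moved to a different subnet after launch, so this works best when sensitive workloads already sit in subnets you can afford to isolate. This is faster than modifying Security Groups and provides subnet-level isolation.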

Q4: You see REJECT entries in VPC Flow Logs, but your Security Groups allow the traffic. Where should you look?

NACLs. A REJECT in Flow Logs means the traffic was denied by either a Security Group or a NACL, so once you’ve confirmed that the Security Groups allow it, the NACL is the remaining suspect. A telltale sign is a request logged as ACCEPT whose response is logged as REJECT: Security Groups are stateful and would have let the response through. Check inbound NACLs for request traffic and outbound NACLs for response traffic.

Q5: Can a single subnet be associated with multiple NACLs?

No. A subnet can only be associated with one NACL at a time. However, a single NACL can be associated with multiple subnets. When you change a subnet’s NACL association, the new rules take effect immediately with no transition period.

Q6: What’s the difference between the default NACL and a custom NACL in terms of default behavior?

The default NACL allows all inbound and outbound traffic by default. Custom NACLs deny all traffic by default until you add explicit ALLOW rules. This catches many engineers off guard when they create their first custom NACL and wonder why all traffic stopped.

Q7: How do NACLs behave with VPC Peering connections?

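Traffic crossing VPC Peering connections is subject to NACLs in both VPCs. The outbound NACL rules of the source subnet are evaluated, followed by the inbound NACL rules of the destination subnet. Both must allow the traffic for it to succeed, and because NACLs are stateless, the return traffic needs matching rules in the opposite direction.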

Frequently Asked Questions (FAQs)

What is AWS NACL and how does it work?

AWS Network ACL (NACL) is a stateless firewall that operates at the subnet level within a VPC. Unlike Security Groups that attach to individual EC2 instances, NACLs control traffic entering and leaving entire subnets. Each NACL contains numbered rules that are evaluated in order from lowest to highest, and the first matching rule determines whether traffic is allowed or denied. Because NACLs are stateless, you must create explicit rules for both inbound requests and outbound responses.

What is the difference between Security Groups and NACLs?

Security Groups are stateful firewalls that operate at the instance level, meaning return traffic is automatically allowed regardless of outbound rules. NACLs are stateless firewalls at the subnet level, requiring explicit rules for both directions. Security Groups only support ALLOW rules, while NACLs support both ALLOW and DENY rules. For inbound traffic, NACLs are evaluated before Security Groups, so inbound traffic blocked by a NACL never reaches the Security Group for evaluation.

Why is my EC2 instance not accessible even with correct Security Group rules?

The most common cause is NACL misconfiguration. Check whether the subnet’s NACL has DENY rules with lower rule numbers than your ALLOW rules, as lower numbers are evaluated first. Also verify that outbound NACL rules allow ephemeral ports 1024-65535 for response traffic. Use VPC Flow Logs to confirm that the traffic is being rejected by looking for REJECT entries, then match them against the NACL rules.

How do I troubleshoot NACL issues using VPC Flow Logs?

Enable VPC Flow Logs on your VPC with the filter set to capture all traffic. When troubleshooting, look for REJECT entries in your Flow Logs, which indicate traffic denied by a Security Group or a NACL. Each log entry shows source IP, destination IP, ports, and the accept/reject status. Match rejected traffic against your NACL rules to identify which rule is causing the block. If a request is logged as ACCEPT but its response is logged as REJECT, the NACL is the culprit, because Security Groups allow return traffic automatically.

What are ephemeral ports and why do they matter for NACLs?

Ephemeral ports are temporary ports (typically 1024-65535) that clients use when initiating connections. When your EC2 instance responds to a request, it sends the response to the client’s ephemeral port, not the original destination port. Because NACLs are stateless, you must explicitly allow outbound traffic on ephemeral ports for responses to reach clients. Forgetting this rule is the most common cause of NACL-related connectivity issues.

Can NACLs block traffic between subnets in the same VPC?

Yes. Traffic between subnets within the same VPC passes through NACLs. When an instance in Subnet A communicates with an instance in Subnet B, the traffic must pass Subnet A’s outbound NACL rules and Subnet B’s inbound NACL rules. Both NACLs must allow the traffic for communication to succeed. This makes NACLs useful for creating network segmentation within a single VPC.

How do I create a DENY rule in AWS NACL?

Navigate to VPC Console, select Network ACLs, choose your NACL, and click Edit inbound or outbound rules. Add a new rule with a rule number lower than any existing ALLOW rule for the same traffic. Set the Type to match your traffic (such as Custom TCP), specify the port range and source/destination CIDR, and select DENY as the action. Remember that rule numbers determine evaluation order, with lower numbers evaluated first.

What happens when I change a subnet’s NACL association?

The new NACL rules take effect immediately with no transition period or overlap. All traffic to and from instances in that subnet will instantly be governed by the new NACL’s rules. This makes NACL changes useful for rapid incident response, but it also means mistakes can immediately break production traffic. Always verify NACL rules before changing subnet associations.

Conclusion and Next Steps

You’ve now experienced firsthand what happens when NACL rules go wrong, and more importantly, how to diagnose these issues. You understand why stateless filtering requires explicit rules for both directions, how rule numbering determines evaluation order, and why VPC Flow Logs are indispensable for troubleshooting.

These aren’t theoretical concepts. They’re the skills that will help you diagnose real production issues quickly and confidently. The next time you’re on call and traffic mysteriously stops flowing, you’ll know exactly where to look.

In the next lab, we’ll dive deep into the comparison between Security Groups and NACLs. You’ll learn exactly when to use each one, how they interact, and how to design layered security that leverages both effectively.

