AWS Direct Connect Tutorial (2025): Architecture, VPN Failover, VIFs & BGP Best Practices
A hands-on technical deep dive for Senior DevOps Engineers and Network Architects preparing for AWS ANS-C01 and SAA-C03 certifications.
Introduction: The “Slow VPN” Struggle
Your startup just hit 500 users. The morning standup video call is lagging, and the database sync is timing out every fifteen minutes. The DevOps Slack channel is blowing up with complaints about “the network being slow again.” Sound familiar?
You trace the problem back to your Site-to-Site VPN. It worked beautifully when you had 50 employees and a single RDS instance. Now, with terabytes of data flowing between your on-premises data center and AWS, that encrypted tunnel over the public internet has become your bottleneck.
AWS Direct Connect (DX) is a dedicated, high-bandwidth private connection between your on-premises network and AWS. This guide explains DX architecture, VIFs (Private, Public, Transit), BGP routing, VPN failover, encryption, Direct Connect pricing, and real-world enterprise designs for AWS hybrid connectivity.
Think of it this way: a VPN is like taking the public highway during rush hour. Direct Connect is a private subway tunnel built just for you. Same destination, predictable journey time, and no competition for bandwidth.
TL;DR Summary
For those who need the essentials quickly:
What is AWS Direct Connect? A dedicated fiber connection bypassing the public internet, providing consistent low-latency connectivity between on-premises and AWS.
Three VIF types exist: Private VIF (VPC access via private IPs), Public VIF (AWS public services like S3), and Transit VIF (multi-VPC via Transit Gateway).
BGP is mandatory. Every Direct Connect VIF requires BGP peering — no static routes allowed.
Encryption isn’t automatic. DX is private but unencrypted by default. Use MACsec (Layer 2) or IPsec over DX (Layer 3) for encryption.
Always plan for redundancy. Use Active/Passive (DX + VPN backup) or Active/Active (dual DX) architectures for production workloads.
Pricing has two components: Port-hour fees (the cable being plugged in) plus Data Transfer Out charges (cheaper than internet egress).
AWS Direct Connect vs Site-to-Site VPN
Before diving into Direct Connect architecture, let’s clarify when to use DX versus VPN. This comparison drives most hybrid connectivity decisions.
| Aspect | AWS Direct Connect | Site-to-Site VPN |
|---|---|---|
| Connection Type | Dedicated private fiber | Encrypted tunnel over public internet |
| Bandwidth | 50 Mbps to 100 Gbps | Up to 1.25 Gbps per tunnel (can aggregate) |
| Latency | Consistent, low (typically 1-5ms) | Variable, depends on internet conditions |
| Encryption | Not encrypted by default | Always encrypted (IPsec) |
| Setup Time | 2-8 weeks (physical provisioning) | Minutes to hours |
| Cost Model | Port-hour + DTO | Hourly connection fee + DTO |
| Best For | High-throughput, latency-sensitive workloads | Quick setup, backup connectivity, low bandwidth |
| Redundancy Role | Primary path | Backup path (typically) |
When to choose Direct Connect: Your workloads require consistent bandwidth above 500 Mbps, latency-sensitive applications (databases, real-time analytics), or you’re transferring large datasets regularly (backup, replication, data lake ingestion).
When to choose VPN: You need connectivity within hours, your bandwidth requirements are modest, you’re building a proof-of-concept, or you need encrypted connectivity as a DX backup.
The hybrid answer: Most production architectures use both — Direct Connect as the primary path with VPN as failover.
Key Takeaway: Direct Connect isn’t a replacement for VPN; they serve complementary roles. DX provides the performance, VPN provides the encryption and rapid failover capability.
Architecture: What Actually is Direct Connect?
Before touching the AWS console, let’s understand what physically happens when you provision AWS Direct Connect. This isn’t software-defined networking magic — it’s actual fiber optic cable connecting your infrastructure to AWS.
The Physical Layer
The connection path:
Your Router → Customer Gateway (CGW) → Colocation Facility → AWS Direct Connect Location → AWS Cage → Cross Connect → AWS Network
AWS maintains equipment in over 100 Direct Connect locations worldwide — colocation facilities like Equinix, CoreSite, and Digital Realty. You either colocate your own router in the same facility, or work with a partner who has already done so.
The Cross Connect is the physical cable running from your cage (or your partner’s cage) to the AWS cage within that colocation facility. When you order Direct Connect, AWS provides a Letter of Authorization – Connecting Facility Assignment (LOA-CFA) that you hand to the colocation provider. They physically plug in the cable.
![Direct Connect physical architecture: the on-prem router connects through the Customer Gateway to a colocation facility, then via Cross Connect to the AWS Direct Connect location, terminating at the AWS network.]
Senior Engineer Tip: The LOA-CFA process isn’t instant. Budget 2-4 weeks for the physical cross-connect provisioning. I’ve seen projects delayed because someone assumed Direct Connect was as quick as spinning up a VPN.
Dedicated Connection vs Hosted Connection
This distinction fundamentally changes your AWS Direct Connect architecture:
| Aspect | Dedicated Connection | Hosted Connection |
|---|---|---|
| Port Ownership | You own the physical port | Partner owns the port |
| Capacity Options | 1 Gbps, 10 Gbps, or 100 Gbps | 50 Mbps to 10 Gbps (granular) |
| Hardware | Dedicated physical port on AWS router | Sub-interface on partner’s port |
| Lead Time | 4-8 weeks (physical provisioning) | Days to weeks (partner dependent) |
| Best For | High-throughput, predictable workloads | Variable bandwidth, faster deployment |
| VIF Flexibility | Create multiple VIFs yourself | Partner creates VIFs for you |
| MACsec Support | Yes (10G/100G) | No |
Choose Dedicated when running 24/7 production workloads with consistent high bandwidth, when you need the full 10 Gbps pipe without sharing hardware, or when MACsec encryption is required.
Choose Hosted when you need Direct Connect quickly, bandwidth requirements are under 1 Gbps, or you’re testing DX before committing to dedicated infrastructure.
Key Takeaway: Dedicated gives you control and capacity; Hosted gives you speed and flexibility. Most enterprises start with Hosted to prove the concept, then migrate to Dedicated for production.
Direct Connect Gateway Explained
A Direct Connect Gateway is a globally available logical resource that enables a single Direct Connect connection to reach multiple VPCs across different AWS regions. Without it, you’d need separate DX connections for each region — expensive and operationally complex.
How Direct Connect Gateway Works
The DX Gateway sits between your Private VIF (or Transit VIF) and your VPCs. It acts as a routing hub that can associate with:
- Virtual Private Gateways (VGWs) in up to 20 VPCs across any AWS region
- Transit Gateways for large-scale multi-VPC connectivity
Important limitation: A DX Gateway doesn’t enable transitive routing between VPCs. If VPC-A and VPC-B both connect to your DX Gateway, traffic from VPC-A cannot reach VPC-B through the gateway. For transitive routing, you need a Transit Gateway.
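To make the limitation concrete, here is a minimal Python model of the reachability rule (illustrative only — `reachable` is a hypothetical helper, not an AWS API):

```python
def reachable(src, dst, dxgw_associations):
    """Return True if traffic from src can reach dst via the DX Gateway.
    Models the rule that the gateway forwards on-prem <-> VPC traffic only."""
    # On-premises traffic may enter any associated VPC.
    if src == "on-prem":
        return dst in dxgw_associations
    # Return traffic from an associated VPC may exit to on-prem.
    if src in dxgw_associations and dst == "on-prem":
        return True
    # VPC-to-VPC traffic is never hairpinned through the DX Gateway.
    return False

associations = {"vpc-a", "vpc-b"}
print(reachable("on-prem", "vpc-a", associations))  # True
print(reachable("vpc-a", "vpc-b", associations))    # False: no transitive routing
```

Hairpinning VPC-to-VPC traffic is exactly what a Transit Gateway adds on top of this model.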
![Direct Connect Gateway architecture: a single DX connection reaches the DX Gateway, which branches to a VGW in us-east-1 and a VGW in eu-west-1. Each VGW connects to its respective VPC; arrows show bidirectional traffic flow.]
When to Use Direct Connect Gateway
Use a DX Gateway when you need to reach VPCs in multiple regions from a single physical location, when you want to consolidate connectivity costs by using one DX connection for multiple VPCs, or when your VPCs are in different accounts (DX Gateway supports cross-account associations).
Key Takeaway: Direct Connect Gateway solves multi-region connectivity without multiple physical connections, but doesn’t provide VPC-to-VPC transitive routing. For that, combine it with Transit Gateway.
Direct Connect with Transit Gateway (TGW)
For enterprise-scale AWS hybrid connectivity with dozens or hundreds of VPCs, the combination of Direct Connect and Transit Gateway is the modern standard.
Transit VIF + DX Gateway + Transit Gateway
This architecture pattern connects your on-premises network to a Transit Gateway via:
- Transit VIF on your Direct Connect connection
- Direct Connect Gateway as the intermediary
- Transit Gateway attached to all your VPCs
The Transit Gateway provides the transitive routing that DX Gateway alone cannot. Traffic from on-premises can reach any VPC attached to the TGW, and VPCs can communicate with each other through the TGW.
Architecture flow:
On-Prem Router → Direct Connect → Transit VIF → DX Gateway → Transit Gateway → VPCs
![Transit Gateway integration: on-prem connects via DX to a Transit VIF, then the DX Gateway, then the Transit Gateway as the central hub. The TGW connects to Production, Development, Shared Services, and Security VPCs in a hub-and-spoke pattern.]
Transit VIF Limits
Connection Limit: You can create 1 Transit VIF per Hosted Connection (or 4 per Dedicated Connection).
Association Limit: A single Direct Connect Gateway can associate with up to 6 Transit Gateways (allowing you to span 6 regions or accounts).
MTU: Transit Gateway supports jumbo frames up to 8500 bytes (not the 9001 bytes a Private VIF supports). Configure MTU 8500 on your router for optimal throughput.
1 Physical Pipe → 1 Transit VIF → 1 DX Gateway → 6 TGWs → Many VPCs
Virtual Interfaces (VIFs): The Most Important Concept
The physical connection is just the pipe. Virtual Interfaces (VIFs) determine what flows through that pipe and where it goes. This is where most engineers get confused — and where certification exams love to test you.
VIFs are logical channels multiplexed over your physical connection using 802.1Q VLAN tags. You can run multiple VIFs over one Direct Connect, but each serves a specific purpose.
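The multiplexing works exactly like any 802.1Q trunk: each VIF gets its own VLAN tag inserted into the Ethernet frame. A short sketch of the 4-byte tag itself (the VLAN assignments shown are hypothetical examples, not AWS-assigned values):

```python
import struct

def dot1q_tag(vlan_id, pcp=0):
    """Build the 4-byte 802.1Q tag that separates VIFs on one DX port."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be 1-4094")
    tci = (pcp << 13) | vlan_id             # priority (3 bits) + DEI (0) + VLAN ID (12 bits)
    return struct.pack("!HH", 0x8100, tci)  # TPID 0x8100 marks a tagged frame

# Hypothetical VLAN assignments for three VIFs on one physical connection:
vifs = {"private-vif": 101, "public-vif": 102, "transit-vif": 103}
for name, vlan in vifs.items():
    print(name, dot1q_tag(vlan).hex())
```

The VLAN ID you enter when creating a VIF ends up in exactly this tag, which is why it must match on both ends of the cross-connect.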
![VIF Types Comparison: Three parallel paths showing Private VIF (connects to VGW/DX Gateway, uses private IPs, accesses VPCs), Public VIF (connects to AWS public services, uses public IPs, accesses S3/DynamoDB), and Transit VIF (connects to Transit Gateway via DX Gateway, uses private IPs, accesses multiple VPCs with transitive routing).]
Private VIF
A Private VIF connects your on-premises network to a private VPC via a Virtual Private Gateway or Direct Connect Gateway.
Characteristics:
- Uses private IP addressing (RFC 1918: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16)
- Traffic never touches the public internet
- Requires BGP peering between your router and AWS
- Can connect to a single VGW, or to a DX Gateway for multi-VPC/multi-region access
- Supports jumbo frames (9001 MTU)
Use case: Accessing EC2 instances, RDS databases, internal ALBs — anything with a private IP inside your VPC.
Public VIF
A Public VIF connects your network to AWS public services (S3, DynamoDB, EC2 public endpoints, SQS, SNS) without traversing the public internet.
Why use Direct Connect for public endpoints? Bandwidth and consistency. If you’re transferring terabytes to S3 daily, doing it over DX means predictable throughput without competing with public internet traffic.
Characteristics:
- Uses public IP addressing (you advertise your public prefixes, AWS advertises theirs)
- Traffic stays on AWS backbone, not public internet
- Requires you to own public IP space (or use AWS-provided addresses)
- AWS advertises thousands of prefixes (all public service IPs)
Senior Engineer Tip: Public VIFs receive over 1,000 BGP prefixes from AWS. Verify your router can handle the table size, and consider prefix filtering to accept only routes for services you actually use.
Transit VIF
A Transit VIF connects your Direct Connect to a Transit Gateway via a Direct Connect Gateway. This is the approach for connecting one DX to dozens or hundreds of VPCs.
Characteristics:
- Requires a DX Gateway as intermediary
- Supports transitive routing (on-prem → TGW → VPC-A → VPC-B)
- Up to 6 Transit Gateway associations per DX Gateway
- Essential for enterprise-scale hybrid architectures
| VIF Type | Connects To | IP Addressing | Transitive Routing | Primary Use Case |
|---|---|---|---|---|
| Private VIF | VGW or DX Gateway | Private (RFC1918) | No | Single VPC or small multi-VPC |
| Public VIF | AWS Public Services | Public IPs | N/A | S3/DynamoDB bulk transfers |
| Transit VIF | Transit Gateway (via DX Gateway) | Private | Yes | Enterprise multi-VPC (50+ VPCs) |
Quiz Check: If you need to access an S3 bucket over Direct Connect without deploying a VPC Endpoint, which VIF type do you use? (Answer: Public VIF)
Key Takeaway: Private VIF for VPC resources, Public VIF for AWS services without internet, Transit VIF for enterprise-scale multi-VPC. Choose based on what you’re connecting to and your scale requirements.
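The takeaway can be codified as a small decision helper (illustrative only — `choose_vif` and its 10-VPC threshold are rules of thumb drawn from the table above, not AWS-defined cutoffs):

```python
def choose_vif(target, vpc_count=1):
    """Pick a VIF type based on what you're connecting to and your scale."""
    if target == "aws-public-service":      # S3, DynamoDB, SQS over the AWS backbone
        return "Public VIF"
    if target == "vpc" and vpc_count > 10:  # enterprise scale: TGW via DX Gateway
        return "Transit VIF"
    if target == "vpc":                     # single VPC or small multi-VPC via VGW/DXGW
        return "Private VIF"
    raise ValueError(f"unknown target: {target}")

print(choose_vif("aws-public-service"))  # Public VIF
print(choose_vif("vpc"))                 # Private VIF
print(choose_vif("vpc", vpc_count=50))   # Transit VIF
```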
The Routing Brain: BGP and ASNs
Direct Connect doesn’t support static routes. Every VIF requires BGP (Border Gateway Protocol) peering. If you’ve only worked with AWS-native networking, this might be your first real encounter with BGP.
Why BGP for AWS Direct Connect?
BGP is the standard for exchanging routing information between autonomous networks. When your on-premises router peers with AWS via BGP, both sides dynamically learn reachable prefixes. If a route fails, BGP detects it and updates routing tables automatically — essential for failover scenarios.
ASN Configuration
Every BGP speaker needs an ASN (Autonomous System Number):
AWS Side ASN: For Private VIFs, the Amazon-side ASN defaults to 64512. You can specify a custom Amazon-side ASN when you create the VGW or Direct Connect Gateway the VIF attaches to.
Customer Side ASN: Your router’s ASN. Options include:
- Private ASN range: 64512–65534 (commonly used)
- Public ASN: If you own one (typical for enterprises with existing BGP presence)
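A quick sanity check for the customer-side ASN (a minimal sketch; `is_valid_customer_asn` is a hypothetical helper, not an AWS validation rule):

```python
AMAZON_SIDE_DEFAULT = 64512  # don't reuse the AWS-side ASN on your router

def is_valid_customer_asn(asn, amazon_asn=AMAZON_SIDE_DEFAULT):
    """Accept a 16-bit private ASN that differs from the Amazon-side ASN
    (eBGP requires the two peers to be in different autonomous systems)."""
    return 64512 <= asn <= 65534 and asn != amazon_asn

print(is_valid_customer_asn(65001))  # True
print(is_valid_customer_asn(64512))  # False: collides with the AWS default ASN
```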
AWS BGP Configuration Example
When you create a VIF, AWS provides:
- AWS router’s BGP peer IP
- Your router’s BGP peer IP
- VLAN ID
- BGP authentication key (MD5)
Cisco IOS configuration:
```
! Define BGP process with your ASN
router bgp 65001
 ! Peer with AWS
 neighbor 169.254.100.1 remote-as 64512
 ! MD5 authentication from VIF configuration
 neighbor 169.254.100.1 password YOUR_BGP_AUTH_KEY
 ! IPv4 unicast address family
 address-family ipv4 unicast
  ! Advertise on-premises CIDR to AWS
  network 10.0.0.0 mask 255.255.0.0
  ! Activate neighbor
  neighbor 169.254.100.1 activate
 exit-address-family
```
Juniper JUNOS configuration:
```
protocols {
    bgp {
        group AWS-DX {
            type external;
            peer-as 64512;
            local-as 65001;
            neighbor 169.254.100.1 {
                authentication-key "YOUR_BGP_AUTH_KEY";
                export advertise-to-aws;
            }
        }
    }
}
policy-options {
    policy-statement advertise-to-aws {
        term on-prem-networks {
            from {
                route-filter 10.0.0.0/16 exact;
            }
            then accept;
        }
    }
}
```
Senior Engineer Tip: Always use BGP MD5 authentication. Without it, anyone who can inject packets on your Direct Connect VLAN could potentially advertise bogus routes. Defense in depth matters.
Key Takeaway: BGP is mandatory for Direct Connect. Plan your ASN strategy, configure MD5 authentication, and ensure your router can handle the prefix counts (especially for Public VIFs).
Direct Connect Redundancy Models
A single Direct Connect connection, no matter how beefy, remains a single point of failure. For production workloads, you need redundancy. AWS offers several Direct Connect redundancy patterns.
Active/Passive: DX + VPN Backup
The most common pattern for cost-conscious resilient design:
Primary Path: Direct Connect (1 Gbps or higher)
Backup Path: Site-to-Site VPN (IPsec) over public internet
Both paths advertise the same on-premises prefixes to AWS, but BGP attributes ensure traffic strongly prefers DX. Only when DX fails does traffic shift to VPN.
![Active/Passive failover: the on-prem router connects via a solid line (Primary DX, 1 Gbps) to the Transit Gateway, and via a dotted line (Backup VPN) through the internet to a VPN attachment on the same TGW. Labels show “BGP AS Path: 65001 → Preferred” for DX and “BGP AS Path: 65001 65001 65001 → Failover Only” for VPN.]
Failover control techniques:
AS Path Prepending: Make the VPN path appear “longer” by prepending your ASN multiple times. AWS sees DX as “65001” and VPN as “65001 65001 65001” — it chooses the shorter path (DX).
Local Preference: On your router, set higher Local Preference for routes learned via DX. Higher preference wins.
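The two techniques can be sketched as a toy best-path function (illustrative only — it implements just the Local Preference and AS-path-length steps of the BGP decision process, skipping origin, MED, and router-ID tie-breakers):

```python
def best_path(routes):
    """routes: list of (name, local_pref, as_path).
    Higher Local Preference wins first; shorter AS path breaks ties."""
    return max(routes, key=lambda r: (r[1], -len(r[2])))[0]

routes = [
    ("direct-connect", 200, [65001]),                # no prepend, preferred
    ("vpn-backup",     100, [65001, 65001, 65001]),  # prepended 3x, failover only
]
print(best_path(routes))                 # direct-connect
print(best_path(routes[1:]))             # vpn-backup (DX withdrawn -> failover)
```

Note both knobs point the same way here; in practice you use AS Path Prepending to influence AWS's choice toward you, and Local Preference to influence your router's choice toward AWS.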
Active/Active: Dual Direct Connect
For maximum bandwidth and resilience, deploy two Direct Connect connections:
Option 1: Same DX Location
- Two connections in the same colocation facility
- Protects against port/device failure
- Doesn’t protect against facility-level outages
Option 2: Different DX Locations
- Connections in geographically separate facilities
- Protects against facility outages, fiber cuts, regional events
- Higher cost but maximum resilience
Option 3: Multi-Region DX with DX Gateway
- DX connections in different regions
- DX Gateway routes to nearest connection
- Ultimate geographic redundancy
Failover Timing Optimization
Default BGP hold timers (180 seconds) mean up to 3 minutes before failover. For faster recovery:
- Configure aggressive BGP timers (10-second hold, 3-second keepalive)
- Enable BFD (Bidirectional Forwarding Detection) for sub-second failure detection
- Test failover scenarios regularly
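The timer arithmetic is worth making explicit (a sketch; `worst_case_detection` is an illustrative helper, and real convergence adds route withdrawal and forwarding-table reprogramming on top of detection):

```python
def worst_case_detection(hold_time_s=None, bfd_interval_ms=None, bfd_multiplier=3):
    """Worst-case seconds before a dead peer is declared, per mechanism."""
    if bfd_interval_ms is not None:
        # BFD declares the session down after `multiplier` missed hellos.
        return bfd_interval_ms * bfd_multiplier / 1000.0
    # Plain BGP waits out the negotiated hold timer.
    return hold_time_s

print(worst_case_detection(hold_time_s=180))      # 180.0 -> default: up to 3 minutes
print(worst_case_detection(hold_time_s=10))       # 10.0  -> aggressive timers
print(worst_case_detection(bfd_interval_ms=300))  # 0.9   -> BFD: sub-second
```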
Key Takeaway: Active/Passive (DX + VPN) balances cost and resilience for most workloads. Active/Active dual-DX is justified for mission-critical applications where minutes of downtime cost more than the second connection.
How to Configure Direct Connect (Step-by-Step)
Here’s the end-to-end process for provisioning AWS Direct Connect:
Step 1: Order Dedicated or Hosted Connection
For Dedicated Connection:
- Open AWS Console → Direct Connect → Create Connection
- Select connection type: Dedicated
- Choose port speed (1G, 10G, or 100G)
- Select Direct Connect location nearest to your data center
- Submit order — AWS begins provisioning
For Hosted Connection:
- Contact an AWS Direct Connect Partner
- Specify bandwidth requirements (50 Mbps – 10 Gbps)
- Partner creates the connection and shares it to your AWS account
- Accept the connection in your console
Step 2: Obtain LOA-CFA
After AWS provisions your port (Dedicated) or your partner provisions capacity (Hosted), download the Letter of Authorization – Connecting Facility Assignment (LOA-CFA) from the console.
This document authorizes the colocation provider to install the cross-connect cable.
Step 3: Configure Cross-Connect
Submit the LOA-CFA to your colocation provider. They’ll:
- Schedule the cross-connect installation
- Run fiber from your cage to the AWS cage
- Test physical connectivity
- Notify you when complete
Timeline: 1-4 weeks depending on the facility.
Step 4: Create Virtual Interface (VIF)
Once the physical connection shows “Available”:
- Navigate to Virtual Interfaces → Create Virtual Interface
- Select VIF type (Private, Public, or Transit)
- Configure:
- VLAN ID
- BGP ASN (yours and AWS’s)
- BGP authentication key
- IP addresses for BGP peering
- Prefixes to advertise (for Public VIF)
- Associate with VGW, DX Gateway, or Transit Gateway
Step 5: Configure BGP on Your Router
Using the parameters from Step 4, configure your on-premises router:
- Create BGP neighbor relationship with AWS peer IP
- Apply MD5 authentication key
- Define networks to advertise
- Configure route policies (prefix filtering, AS path manipulation)
- Verify BGP session establishes (state: Established)
Step 6: Attach to VGW or Transit Gateway
For Private VIF → VGW:
- VGW must already be attached to target VPC
- VIF association happens during VIF creation
For Transit VIF → Transit Gateway:
- Create DX Gateway first
- Associate DX Gateway with Transit Gateway
- Create Transit VIF pointing to DX Gateway
Verification:
- Check VIF state: “Available”
- Check BGP status: “Up”
- Test connectivity: ping/traceroute from on-prem to VPC resources
Key Takeaway: The process involves both AWS provisioning (console) and physical work (colocation). Plan for 4-8 weeks total lead time for Dedicated connections.
Direct Connect Security: Encryption Options
A critical clarification: Direct Connect is private but not encrypted by default. Your data travels in cleartext on the fiber. For compliance-driven workloads (PCI-DSS, HIPAA, government), this may not meet requirements.
MACsec (Layer 2 Encryption)
IEEE 802.1AE MACsec encrypts Ethernet frames at Layer 2, providing line-rate encryption without IPsec overhead.
Requirements:
- 10 Gbps or 100 Gbps Dedicated Connection
- MACsec-capable router hardware
- Supported Direct Connect location
Benefits: Wire-speed encryption, no throughput penalty, protects all traffic on the connection.
IPsec over Direct Connect (Layer 3 Encryption)
Run an IPsec VPN tunnel inside your Direct Connect connection. Traffic is encrypted at Layer 3 before hitting the wire.
Implementation: Create a Site-to-Site VPN to your VGW, but route the VPN endpoints over your Public VIF (using public IPs) instead of the internet. You get DX bandwidth plus IPsec encryption.
Trade-off: Adds CPU overhead, slightly reduces effective throughput compared to MACsec.
Compliance Considerations
PCI-DSS: Requires encryption of cardholder data in transit. MACsec or IPsec over DX satisfies this requirement.
HIPAA: Requires encryption of PHI in transit. Same solutions apply.
Financial Services: Many regulations require encrypted WAN connections. MACsec is preferred for performance-sensitive trading systems.
Key Takeaway: DX is private (not on public internet) but unencrypted by default. Use MACsec for high-performance encryption or IPsec over DX for flexible Layer 3 encryption.
Direct Connect + SD-WAN Integration
For organizations using SD-WAN platforms (Cisco Viptela, Palo Alto Prisma, Fortinet, VMware VeloCloud), Direct Connect serves as a high-performance underlay.
Integration Patterns
Pattern 1: DX as SD-WAN Underlay
Your SD-WAN edge devices establish overlay tunnels (IPsec/GRE) across the Direct Connect connection. The DX provides reliable, high-bandwidth transport; the SD-WAN provides application-aware routing and encryption.
Pattern 2: Direct DX Integration
Some SD-WAN platforms integrate directly with AWS Transit Gateway. Traffic flows from SD-WAN edge → DX → Transit Gateway without additional overlay tunnels for AWS-destined traffic.
Pattern 3: Hybrid Underlay
Use DX for high-priority traffic (voice, video, database replication) and internet/MPLS for best-effort traffic. SD-WAN policies route applications to the appropriate path.
Vendor Considerations
Cisco SD-WAN (Viptela): Supports AWS Transit Gateway Connect for native integration. Can also run vEdge Cloud in AWS as a hub.
Palo Alto Prisma SD-WAN: CloudBlades enable direct cloud connectivity. DX serves as preferred underlay for Prisma Access.
Fortinet SD-WAN: FortiGate VMs in AWS can terminate SD-WAN tunnels. DX provides the WAN transport.
Key Takeaway: Direct Connect complements SD-WAN by providing a reliable, high-bandwidth underlay. Most enterprises use DX for AWS traffic while maintaining internet/MPLS paths for other destinations.
Direct Connect Pricing: The Hidden Costs
Direct Connect pricing catches people off guard. It’s not just “pay for the port.”
Two Primary Cost Components
Port-Hour Charges: Flat hourly fee for having the physical port provisioned.
| Port Speed | Approximate Hourly Rate (us-east-1) | Monthly Estimate |
|---|---|---|
| 1 Gbps | $0.30/hour | ~$220/month |
| 10 Gbps | $2.25/hour | ~$1,620/month |
| 100 Gbps | $22.50/hour | ~$16,200/month |
This fee applies whether you’re pushing data or not.
Data Transfer Out (DTO): Charges for data transferred from AWS to on-premises.
| Transfer Type | Approximate Rate (us-east-1) |
|---|---|
| DX Data Transfer Out | $0.02/GB |
| Internet Data Transfer Out | $0.09/GB |
| Savings | 78% |
Inbound data transfer (on-prem to AWS) is free. AWS wants your data in their cloud.
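Using the approximate rates above, the break-even math is easy to script (a sketch; `monthly_transfer_cost` is a hypothetical helper, and the rates are this table's estimates, not quoted prices):

```python
def monthly_transfer_cost(gb_out, rate_per_gb):
    """Data Transfer Out cost; inbound is free, so only egress is billed."""
    return gb_out * rate_per_gb

DX_RATE, INTERNET_RATE = 0.02, 0.09  # approximate us-east-1 $/GB from the table
gb = 50_000                          # 50 TB/month of egress

dx = monthly_transfer_cost(gb, DX_RATE)
internet = monthly_transfer_cost(gb, INTERNET_RATE)
print(f"DX: ${dx:,.0f}  Internet: ${internet:,.0f}  Savings: {1 - dx/internet:.0%}")
# DX: $1,000  Internet: $4,500  Savings: 78%
```

At this volume the DTO savings alone (~$3,500/month) cover a 10 Gbps port's port-hour fees more than twice over.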
Additional Costs to Budget
Cross-connect fees: Colocation facility charges for the physical cable. Typically $100-$500/month.
Partner fees: For Hosted Connections, partners add their margin above AWS costs.
Colocation space: If placing your own router, you pay for cage/rack space, power, and cooling.
Redundancy: Dual connections double port costs but may reduce risk-adjusted total cost.
Reflection: Why is inbound data transfer free? AWS benefits when your data lives in their cloud — it creates stickiness and enables consumption of other services. Free inbound removes migration friction.
Key Takeaway: Budget for port-hours + DTO + colocation fees. DX DTO is significantly cheaper than internet egress, making it cost-effective for data-heavy workloads despite the port fees.
Troubleshooting Direct Connect: Common Issues
Years of DX implementations reveal predictable failure patterns. Here’s how to diagnose and fix them.
BGP Prefix Limit Exceeded
Symptom: BGP session flaps or routes disappear intermittently.
Cause: Your router’s BGP table is full. Public VIFs receive 1,000+ prefixes from AWS.
Fix:
- Verify router BGP table capacity
- Implement prefix filtering to accept only required AWS ranges
- Upgrade router if necessary
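Prefix filtering can be prototyped offline before committing router config. The sketch below uses Python's `ipaddress` module; the prefixes shown are sample values — the authoritative list comes from AWS's published `ip-ranges.json` feed:

```python
import ipaddress

def filter_prefixes(received, accepted_supernets):
    """Keep only received prefixes that fall inside ranges you actually use."""
    nets = [ipaddress.ip_network(s) for s in accepted_supernets]
    return [p for p in received
            if any(ipaddress.ip_network(p).subnet_of(n) for n in nets)]

# Hypothetical sample: accept only two supernets, drop everything else.
received = ["52.216.0.0/15", "3.5.0.0/19", "99.77.130.0/24"]
accepted = ["52.216.0.0/15", "3.5.0.0/16"]
print(filter_prefixes(received, accepted))  # ['52.216.0.0/15', '3.5.0.0/19']
```

The same accept-list then becomes a prefix-list or route-filter on the router, keeping the BGP table within hardware limits.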
Asymmetric Routing
Symptom: Connections work intermittently; stateful firewalls drop packets.
Cause: Outbound traffic takes DX, return traffic comes via VPN (or vice versa).
Fix:
- Ensure BGP attributes create symmetric paths
- Use AS Path Prepending consistently on backup path
- Verify Local Preference settings on both directions
- Check for conflicting static routes
Route Propagation Delays
Symptom: New routes take minutes to appear; VPC resources unreachable after changes.
Cause: BGP convergence time, route table propagation delays.
Fix:
- Enable BFD for faster failure detection
- Reduce BGP timers (hold time, keepalive)
- Verify route propagation is enabled on VGW
MTU Mismatches
Symptom: Small packets work, large transfers fail or fragment.
Cause: Jumbo frames enabled on DX (9001 MTU) but not end-to-end.
Fix:
- Verify MTU settings across entire path
- Test with `ping -M do -s 8972` (Linux) to check path MTU
- Either enable jumbo frames end-to-end or reduce DX MTU to 1500
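The 8972 figure isn't magic — it's the path MTU minus the IPv4 and ICMP headers. A quick helper (illustrative) shows the arithmetic:

```python
def icmp_payload_for_mtu(mtu):
    """ICMP payload size that exactly fills an IPv4 path of the given MTU:
    subtract the 20-byte IPv4 header and the 8-byte ICMP header."""
    return mtu - 20 - 8

print(icmp_payload_for_mtu(9000))  # 8972 -> ping -M do -s 8972
print(icmp_payload_for_mtu(1500))  # 1472 -> standard-MTU sanity check
```

Note that `-s 8972` validates a 9000-byte path; to exercise the full 9001-byte DX jumbo MTU, the payload would be 8973.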
LOA-CFA Delays
Symptom: Connection stuck in “Ordering” or “Pending” state.
Cause: Physical provisioning takes time; colocation scheduling backlogs.
Fix:
- Confirm LOA-CFA was submitted to colocation provider
- Follow up with colo for scheduling timeline
- Budget 2-4 weeks for this step in project plans
VIF State Issues
Symptom: VIF shows “Down” or “Deleting” unexpectedly.
Cause: BGP session failed, VLAN mismatch, or underlying connection issue.
Fix:
- Verify physical connection state is “Available”
- Check VLAN ID matches on both ends
- Verify BGP configuration (ASN, peer IPs, auth key)
- Check router logs for BGP error messages
Key Takeaway: Most DX issues trace back to BGP configuration, MTU mismatches, or asymmetric routing. Build a troubleshooting runbook covering these scenarios before go-live.
Real-World Scenario: The Hybrid Enterprise
Let’s synthesize everything into a realistic architecture. You’re the network architect for a financial services firm with:
- Corporate HQ: On-premises data center in New Jersey
- Production: AWS us-east-1 (Virginia)
- Disaster Recovery: AWS eu-west-1 (Ireland)
- Compliance: PCI-DSS requires encrypted transit
The Architecture
Primary Connectivity:
- 10 Gbps Dedicated Connection at Equinix NY5 (New Jersey)
- MACsec enabled for PCI compliance (Layer 2 encryption)
- Transit VIF connected to Direct Connect Gateway
- DX Gateway associated with Transit Gateway in us-east-1
- Same DX Gateway associated with VGW in eu-west-1 (DR)
Redundancy:
- Second 10 Gbps Dedicated Connection at CoreSite NY1 (different facility)
- Site-to-Site VPN as tertiary backup
- AS Path Prepending: Primary DX (no prepend) → Secondary DX (1 prepend) → VPN (3 prepends)
Traffic Flow:
- Production traffic: HQ → Primary DX → TGW → Production VPCs
- DR replication: HQ → Primary DX → DX Gateway → VGW → DR VPC
- Failover: Automatic via BGP, sub-second detection with BFD
Cost Optimization:
- Data Transfer Out via DX: ~$0.02/GB (vs $0.09/GB internet)
- Estimated monthly transfer: 50 TB = $1,000 DX vs $4,500 internet
- Port costs justified by DTO savings alone
Key Takeaway: Enterprise architectures combine multiple DX connections, DX Gateway for multi-region reach, Transit Gateway for VPC scale, and MACsec/IPsec for compliance. Plan redundancy from day one.
Common Mistakes (The “Gotchas”)
Confusing Private VIFs with Transit VIFs: A Private VIF to a DX Gateway doesn’t provide transitive VPC routing. You need a Transit VIF connected to Transit Gateway for that. This distinction trips up many architects.
Underestimating LOA Wait Times: The LOA-CFA process involves AWS generating documents, colocation scheduling, and physical work. This takes weeks, not days. Start early.
Asymmetric Routing with DX + VPN: When running both simultaneously, traffic can take different paths in each direction. This breaks stateful firewalls. Ensure BGP attributes create symmetric paths.
Forgetting BGP Prefix Limits: Undersized routers choke on Public VIF prefix counts. Verify hardware capacity before deployment.
Ignoring Jumbo Frame Consistency: Enabling 9001 MTU on DX without end-to-end support causes fragmentation. Test MTU across the entire path.
Assuming DX is Encrypted: It’s private but unencrypted by default. Plan for MACsec or IPsec if compliance requires encryption.
Conclusion: The Bridge is Built, Now What?
AWS Direct Connect transforms hybrid cloud from “it works, sometimes” to enterprise-grade reliability. You now understand the physical architecture, the VIF types, BGP peering, redundant failover design, and the places where others stumble.
But connectivity is only half the battle. Once packets flow between on-premises and AWS, a new challenge emerges: how do names resolve across this bridge?
Your on-premises servers need to resolve aws.internal DNS names. Your AWS workloads need to query on-prem Active Directory DNS. Without proper hybrid DNS, you’ll have connectivity but broken applications.
Up Next: Mastering Route 53 Resolver for Hybrid DNS — Forward and reverse DNS resolution between your data center and AWS.
Foundation Check: Master AWS VPC Fundamentals — If VPCs, subnets, and route tables still feel fuzzy, solidify your foundation first.
Related Guides:
- AWS Site-to-Site VPN Configuration Guide
- Transit Gateway Deep Dive
- BGP Fundamentals for Cloud Engineers
FAQs
What is the difference between Hosted and Dedicated Direct Connect?
A Dedicated Connection gives you a physical port (1/10/100 Gbps) entirely owned by you on AWS’s router. A Hosted Connection is provisioned through an AWS Partner who shares their physical port — you get a VLAN on their connection with speeds from 50 Mbps to 10 Gbps. Dedicated offers maximum control, throughput, and MACsec support; Hosted offers faster deployment and lower commitment.
Does Direct Connect encrypt my data?
No, Direct Connect does not encrypt data by default. Traffic flows in cleartext over the private fiber. For encryption, use MACsec (Layer 2 encryption, available on 10/100 Gbps Dedicated connections) or run an IPsec VPN tunnel over your DX connection (Layer 3 encryption).
Can I access S3 over Direct Connect?
Yes, using a Public VIF. The Public VIF connects your on-premises network to AWS public service endpoints (S3, DynamoDB, etc.) over your Direct Connect link instead of the public internet. This provides consistent bandwidth and lower latency for large data transfers.
How do I backup my Direct Connect link?
The standard approach is Active/Passive redundancy. Configure a Site-to-Site VPN (IPsec) as your backup path. Use BGP AS Path Prepending on the VPN to make it less preferred than Direct Connect. Traffic normally flows over DX, failing over to VPN only when DX becomes unavailable. For higher availability, deploy dual Direct Connect connections in different facilities.
What is a Transit VIF?
A Transit VIF connects your Direct Connect connection to an AWS Transit Gateway via a Direct Connect Gateway. This enables a single DX connection to reach multiple VPCs (potentially hundreds) through the Transit Gateway’s hub-and-spoke architecture with full transitive routing. It’s the preferred approach for enterprise-scale multi-VPC hybrid connectivity.
