EC2 Instance Store Lab: Ephemeral Storage Performance, Data Loss & Real-World Use Cases 2026
Introduction
Let me tell you about the time I watched a junior engineer lose 200GB of processed ML training data because nobody explained what “ephemeral” actually means in AWS. That was a long night, and it’s exactly why I’m writing this lab.
This hands-on tutorial teaches you everything about EC2 instance store volumes — the blazing-fast storage that AWS attaches directly to your host machine. You’ll launch an instance-store-backed EC2 instance, write data, deliberately lose it, and benchmark performance against EBS. By the end, you’ll understand exactly when this storage type saves your architecture and when it burns it down.
Who This Lab Is For
If you’re studying for AWS certifications, transitioning into DevOps, or simply tired of reading documentation that doesn’t explain why things work the way they do, you’re in the right place. This AWS beginner lab assumes you can SSH into a server and run basic Linux commands. Everything else, I’ll walk you through.
The Misunderstandings That Cause Outages
Here’s what trips up most engineers: instance store looks like regular disk space. It mounts the same way. Files save the same way. The AWS console doesn’t exactly scream “THIS DATA WILL VANISH” in red letters.
But instance store is physically attached to the host hardware your EC2 runs on. Stop your instance? AWS might migrate you to different hardware when you start it again. Your data stays behind on the old host. Gone. Not recoverable. Not backed up. Just gone.
I’ve seen teams lose cache layers, scratch processing data, and temporary build artifacts because someone treated ephemeral storage like a regular disk. This lab makes sure you never make that mistake.
⚠️ Critical Distinction — Reboot vs Stop:
Rebooting an EC2 instance does NOT erase instance store data. The instance stays on the same physical host.
Stopping and starting an EC2 instance DOES erase instance store data. AWS may migrate you to different hardware.
Many engineers learn this difference the hard way. Bookmark this distinction — it appears on AWS certification exams constantly.
Lab Overview
What You’ll Build and Test
You’re going to launch an EC2 instance backed by instance store, write test data, stop the instance, and confirm that data disappears. Then you’ll run disk benchmarks comparing instance store against EBS gp3 volumes — the performance difference will make your jaw drop.
Skills You’ll Gain
After completing this step-by-step AWS tutorial, you’ll be able to identify which EC2 instance types include instance store volumes, properly format and mount ephemeral storage, understand the stop/start data loss behavior (and why it’s different from reboot), benchmark storage IOPS and throughput, and make informed architectural decisions about AWS EC2 storage.
Real-World Workloads Using Instance Store
Production systems leverage instance store constantly. Think distributed caching layers like Redis clusters, Spark shuffle space, video transcoding scratch disks, or CI/CD build directories. Any workload where data is temporary, regenerable, or replicated across nodes becomes a candidate.
The common thread? Engineers using instance store successfully know it’s temporary. They design for it. The failures happen when someone forgets.
Prerequisites
Before we dive in, make sure you have:
- AWS Account with billing enabled (we’ll use instances that aren’t free-tier eligible)
- IAM Permissions for EC2 full access (or at minimum: ec2:RunInstances, ec2:DescribeInstances, ec2:StopInstances, ec2:StartInstances, ec2:TerminateInstances)
- EC2 Key Pair created in your target region
- Basic Linux command familiarity — navigating directories, running commands as root
- AWS CLI installed and configured (optional but helpful for verification)
💰 Cost Reality Check:
Instance store itself is “free,” but instance types that include it are not. A c5d.large costs more than a c5.large because you’re paying for the physical NVMe drives attached to your host.
Before choosing instance store, always compare: the instance price difference between “d” and non-“d” variants, the cost of provisioned IOPS EBS if you need similar performance, and whether your workload actually benefits from local storage speed.
Sometimes paying for io2 volumes on a cheaper instance type wins. Do the math for your specific use case.
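Here’s a minimal sketch of that math in shell. The helper name monthly_delta is mine, and the two hourly prices in the example call are placeholders rather than current AWS rates, so look up real numbers for your region before drawing conclusions.

```shell
# Monthly cost difference between a "d" instance and its non-"d" sibling.
# $1 = hourly price of the "d" type, $2 = hourly price of the base type, $3 = hours
monthly_delta() {
  awk -v d="$1" -v base="$2" -v hours="$3" \
    'BEGIN { printf "%.2f\n", (d - base) * hours }'
}

# Placeholder prices (assumptions, not quoted AWS rates), 730 hours per month.
monthly_delta 0.096 0.085 730
```

With those placeholder inputs the call prints 8.03. Compare that delta against the monthly cost of the provisioned IOPS you’d otherwise buy on EBS.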
Plan to complete this lab in one sitting and terminate resources immediately afterward. I’ll remind you at the end.
Step-by-Step Hands-On Lab
Step 1: Identify EC2 Instance Types with Instance Store
Not every EC2 instance type includes instance store. This catches people constantly.
What to do:
- Open the AWS Console → EC2 → Launch Instance
- Before selecting anything, open a new tab to the EC2 Instance Types page
- Look for instance families that mention “NVMe instance storage”. Common ones include c5d, m5d, r5d, i3, and c6id
Why it matters:
The “d” suffix typically indicates instance store. A c5.large has no local storage. A c5d.large includes 50GB of NVMe instance store. Miss this distinction and you’ll launch an instance wondering why lsblk only shows your root volume.
Common misconfiguration:
I’ve reviewed architectures where teams specified m5.xlarge in their Terraform templates, expecting instance store, then couldn’t figure out why performance was terrible. Always verify the instance type specifications.
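To make that verification concrete, here’s a small shell sketch. Both helper names are my own, not AWS tooling. The first is only a naming heuristic, so it misses families like i3 that include instance store without a “d”. The second asks the EC2 API directly and assumes the AWS CLI is installed and configured.

```shell
# Heuristic: does the family name carry a "d" attribute after the generation digit?
# Examples: c5d -> "d", m5ad -> "ad", c6id -> "id", c5 -> "", i3 -> "".
has_d_attribute() {
  family="${1%%.*}"            # c5d.large -> c5d
  attrs="${family##*[0-9]}"    # drop everything up to the last digit
  case "$attrs" in
    *d*) return 0 ;;
    *)   return 1 ;;
  esac
}

# Authoritative check: ask the EC2 API whether the type has instance store.
# $1 = instance type, e.g. c5d.large
instance_store_supported() {
  aws ec2 describe-instance-types \
    --instance-types "$1" \
    --query 'InstanceTypes[0].InstanceStorageSupported' \
    --output text
}
```

Treat the heuristic as a quick sanity check in code review, and the API call as the answer you actually trust before a Terraform change ships.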
Step 2: Launch an Instance-Store-Backed EC2 Instance
What to do:
- AWS Console → EC2 → Launch Instance
- Name: instance-store-lab
- AMI: Amazon Linux 2023 (or Amazon Linux 2)
- Instance type: c5d.large (a low-cost x86 option with instance store, available in most regions)
- Key pair: Select your existing key pair
- Network settings: Default VPC, Auto-assign public IP enabled
- Storage: Keep the default 8GB gp3 root volume — do not add the instance store here manually
- Launch the instance
What you should see:
The instance store volumes are automatically attached based on instance type. You don’t add them in the storage configuration wizard — that’s only for EBS volumes.
Why it matters:
Instance store volumes exist because of the physical hardware you’re allocated. AWS handles the attachment automatically. This is fundamentally different from EBS, where you explicitly provision and attach volumes.
Console navigation path:
EC2 Dashboard → Instances → Launch Instance → (complete wizard) → Launch
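If you’d rather script the launch, the sketch below is a rough CLI equivalent of the wizard. The function name is hypothetical, and I’m assuming a default VPC whose subnets auto-assign public IPs, the public SSM parameter for the latest Amazon Linux 2023 x86_64 AMI, and a key pair name that you pass in. Notice there’s no block device mapping for the instance store: it comes with the instance type.

```shell
# Launch the lab instance from the CLI and print its instance ID.
# $1 = name of an existing EC2 key pair in the current region
launch_lab_instance() {
  aws ec2 run-instances \
    --image-id "resolve:ssm:/aws/service/ami-amazon-linux-latest/al2023-ami-kernel-default-x86_64" \
    --instance-type c5d.large \
    --key-name "$1" \
    --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=instance-store-lab}]' \
    --query 'Instances[0].InstanceId' \
    --output text
}

# Usage:
#   launch_lab_instance your-key
```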
Step 3: Locate and Verify Instance Store Volumes
SSH into your instance and let’s see what we’re working with.
What to do:
ssh -i your-key.pem ec2-user@<public-ip>
Once connected:
lsblk
What you should see:
NAME          SIZE  TYPE MOUNTPOINT
nvme0n1       8G    disk
├─nvme0n1p1   8G    part /
└─nvme0n1p128 1M    part
nvme1n1       46.6G disk
The nvme1n1 device (or similar) is your instance store. Notice it has no mount point — it’s raw, unformatted storage.
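Size alone can be ambiguous when an instance has several volumes. On Nitro instances the NVMe model string tells EBS and instance store apart, so a small filter like this sketch can pick out the right device. The function name is mine, and it expects the two-column output of lsblk -d -n -o NAME,MODEL on stdin.

```shell
# Read "NAME MODEL" lines (as printed by: lsblk -d -n -o NAME,MODEL)
# and print only the devices whose model marks them as instance store.
filter_instance_store() {
  awk '/Instance Storage/ { print "/dev/" $1 }'
}

# Usage on the instance:
#   lsblk -d -n -o NAME,MODEL | filter_instance_store
```

EBS volumes report a model of “Amazon Elastic Block Store”, so they fall through the filter.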
Common misconfiguration:
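If you only see the root volume, double-check your instance type. Run curl http://169.254.169.254/latest/meta-data/instance-type to confirm you’re actually on a d instance. If the request comes back with 401 Unauthorized, the instance requires IMDSv2 (the default on Amazon Linux 2023): request a session token with a PUT to http://169.254.169.254/latest/api/token first, then repeat the call with the token in the X-aws-ec2-metadata-token header.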
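For instances that enforce IMDSv2, the metadata lookup takes two requests. This is a minimal sketch; the function name imds_get and the 60-second token lifetime are my choices, not anything the lab requires.

```shell
# Fetch one instance metadata path using IMDSv2 (session token first).
# $1 = path under /latest/meta-data/, e.g. instance-type
imds_get() {
  token=$(curl -sf --connect-timeout 2 -X PUT \
    "http://169.254.169.254/latest/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60") || return 1
  curl -sf --connect-timeout 2 \
    -H "X-aws-ec2-metadata-token: $token" \
    "http://169.254.169.254/latest/meta-data/$1"
}

# Usage on the instance:
#   imds_get instance-type
```

The same function works on instances that still allow IMDSv1, so it’s a safe default for scripts.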
Step 4: Format and Mount the Instance Store Volume
What to do:
# Check the device is truly empty
sudo file -s /dev/nvme1n1
# Format with XFS filesystem
sudo mkfs -t xfs /dev/nvme1n1
# Create mount point
sudo mkdir -p /mnt/instance-store
# Mount the volume
sudo mount /dev/nvme1n1 /mnt/instance-store
# Verify
df -h /mnt/instance-store
What you should see:
Filesystem Size Used Avail Use% Mounted on
/dev/nvme1n1 47G 33M 47G 1% /mnt/instance-store
Why it matters:
Unlike an EBS volume restored from a snapshot, which arrives with its filesystem intact, NVMe instance store always arrives raw. You format it, you mount it, and critically, you do both again after every stop/start cycle. The filesystem doesn’t persist.
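Since this has to happen again after every stop/start, teams usually script it. Here’s a rough sketch of an idempotent version you could call from user data or a boot unit. The function names are hypothetical, and it relies on file -s reporting a bare “data” for a raw device, as you’ll see later in this lab.

```shell
# True when `file -s` output describes a device with no recognizable filesystem.
# $1 = the full output line, e.g. "/dev/nvme1n1: data"
is_raw_device() {
  case "$1" in
    *": data") return 0 ;;
    *)         return 1 ;;
  esac
}

# Format only when the device is raw, then mount it. Run as root.
# $1 = device (e.g. /dev/nvme1n1), $2 = mount point
prepare_instance_store() {
  if is_raw_device "$(file -s "$1")"; then
    mkfs -t xfs "$1" || return 1
  fi
  mkdir -p "$2"
  mountpoint -q "$2" || mount "$1" "$2"
}

# Usage:
#   sudo sh -c '. ./prepare.sh; prepare_instance_store /dev/nvme1n1 /mnt/instance-store'
```

The raw-device check matters: after a plain reboot the filesystem and its data are still there, and an unconditional mkfs would wipe them.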
Important note:
If you want the mount to survive reboots (but not stop/starts — we’ll prove that shortly), you’d add an entry to /etc/fstab. For this lab, we’re keeping it manual to emphasize the temporary nature.
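For reference, an fstab entry for this mount could look like the sketch below. The helper name is mine. I’m using the device path rather than a UUID because the UUID changes every time you reformat, and nofail keeps the boot from hanging when the device comes back raw after a stop/start.

```shell
# Print an fstab line for the instance store mount.
# $1 = device, $2 = mount point
fstab_entry() {
  printf '%s %s xfs defaults,nofail 0 2\n' "$1" "$2"
}

# Append it once (run as root):
#   entry=$(fstab_entry /dev/nvme1n1 /mnt/instance-store)
#   grep -qxF "$entry" /etc/fstab || echo "$entry" >> /etc/fstab
```

One caveat: NVMe device names aren’t guaranteed to stay in the same order on every boot, so confirm with lsblk which device is the instance store before you rely on a fixed path.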
Step 5: Write Test Data to Instance Store
What to do:
# Write a 1GB test file
sudo dd if=/dev/zero of=/mnt/instance-store/testfile bs=1M count=1024
# Create a marker file we can easily check later
echo "This data was written on $(date)" | sudo tee /mnt/instance-store/timestamp.txt
# Verify both files exist
ls -la /mnt/instance-store/
What you should see:
-rw-r--r-- 1 root root 1073741824 Dec 23 10:30 testfile
-rw-r--r-- 1 root root 45 Dec 23 10:30 timestamp.txt
Step 6: Stop the Instance and Start It Again
This is where ephemeral storage earns its name.
What to do:
- Exit your SSH session: exit
- AWS Console → EC2 → Instances → Select instance-store-lab
- Instance state → Stop instance
- Wait until the instance state shows “Stopped”
- Instance state → Start instance
- Wait for “Running” status and grab the new public IP
- SSH back in
Why it matters:
When you stop an EC2 instance, AWS releases the underlying host. When you start it again, you might land on completely different hardware. Your EBS volumes reattach (they’re network-attached storage), but instance store? It was physically on the old host. It’s gone.
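The same stop/start cycle can be driven from the AWS CLI, which is handy if you want to repeat the experiment. This is a sketch with a hypothetical function name; it assumes the CLI is configured for the instance’s region and that you pass in the real instance ID.

```shell
# Stop, wait, start, wait, then print the new public IP.
# $1 = instance ID
stop_start_cycle() {
  aws ec2 stop-instances --instance-ids "$1" >/dev/null || return 1
  aws ec2 wait instance-stopped --instance-ids "$1" || return 1
  aws ec2 start-instances --instance-ids "$1" >/dev/null || return 1
  aws ec2 wait instance-running --instance-ids "$1" || return 1
  aws ec2 describe-instances --instance-ids "$1" \
    --query 'Reservations[0].Instances[0].PublicIpAddress' \
    --output text
}

# Usage:
#   stop_start_cycle <your-instance-id>
```

The waiters block until the state change completes, so the IP printed at the end is the one to SSH into.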
Step 7: Verify Data Loss
What to do:
ssh -i your-key.pem ec2-user@<new-public-ip>
# Check for the device
lsblk
# Try to access the old mount point
ls /mnt/instance-store/
What you should see:
The instance store device appears again (same size), but it’s completely empty. The mount point directory still exists (it’s on the root EBS volume), but there’s nothing in it. Your testfile and timestamp.txt are gone.
# The device is back but unformatted
sudo file -s /dev/nvme1n1
Output: /dev/nvme1n1: data — meaning raw, unformatted disk.
This is the lesson. This is what burns teams at 2 AM when someone stops an instance to “save costs over the weekend.”
Instance Store vs EBS: Why Performance Is So Different
Let’s see why people tolerate the ephemeral storage risks. The performance difference is dramatic, and understanding why helps you make better architectural decisions.
The fundamental difference: EBS volumes are network-attached storage. Every read and write travels over the network to remote storage servers. Instance store is physically attached NVMe drives sitting in the same chassis as your CPU. No network hop. No latency penalty.
Step 8: Benchmark Disk Performance
What to do:
First, set up the instance store again:
sudo mkfs -t xfs /dev/nvme1n1
sudo mount /dev/nvme1n1 /mnt/instance-store
Install benchmarking tools:
sudo yum install -y fio
Run the instance store benchmark:
sudo fio --name=randwrite --ioengine=libaio --iodepth=32 --rw=randwrite --bs=4k --direct=1 --size=1G --numjobs=4 --runtime=60 --filename=/mnt/instance-store/fio-test --group_reporting
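Now benchmark your EBS root volume for comparison. The test file goes in /var/tmp rather than /tmp, because on Amazon Linux 2023 /tmp can be a RAM-backed tmpfs instead of part of the EBS root volume (check with df -T /tmp):
sudo fio --name=randwrite --ioengine=libaio --iodepth=32 --rw=randwrite --bs=4k --direct=1 --size=1G --numjobs=4 --runtime=60 --filename=/var/tmp/fio-test --group_reporting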
What you should see:
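Instance store will deliver significantly higher IOPS and lower latency. On a c5d.large, AWS’s published limits for the instance store are roughly 20,000 random read IOPS and 9,000 random write IOPS, while your gp3 root volume maxes out around 3,000 IOPS (the gp3 baseline). Expect this random-write test to land at around three times the gp3 figure, with the gap widening on larger instance sizes.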
Interpreting results:
Look at the IOPS= line in the fio output. The instance store numbers reveal why workloads like database scratch space, caching, and high-throughput processing use ephemeral storage. You simply cannot get this performance from EBS without paying for provisioned IOPS volumes — and even then, you’re still adding network latency.
| Metric | Instance Store (NVMe) | EBS gp3 (Baseline) |
|---|---|---|
| Random Write IOPS | ~9,000 on c5d.large (reads ~20,000) | ~3,000 |
| Latency | Sub-millisecond | 1-2ms typical |
| Throughput | 400+ MB/s | 125 MB/s baseline |
| Data Persistence | None (ephemeral) | Full (durable) |
| Snapshots | Not available | Supported |
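One thing that confuses people reading a table like this: the IOPS and throughput rows come from different access patterns. A quick conversion shows why. The helper below is a sketch of the arithmetic (IOPS times block size), with a name I made up.

```shell
# Convert an IOPS figure at a given block size (KiB) to MiB/s, rounded down.
# $1 = IOPS, $2 = block size in KiB
iops_to_mibps() {
  echo $(( $1 * $2 / 1024 ))
}

iops_to_mibps 20000 4   # prints 78
iops_to_mibps 3000 4    # prints 11
```

So a 20,000 IOPS result at 4 KiB moves only about 78 MiB/s. Throughput numbers in the hundreds of MB/s come from large sequential blocks, which you’d measure with a separate fio run using a bigger --bs value.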
Real Lab Experiences — Architect Insights
Let me share what I’ve seen go wrong in production.
The Auto Scaling disaster: A team configured an Auto Scaling group using an instance type with instance store. Their bootstrap script wrote configuration to the instance store because “it was faster.” Every scale-in event destroyed that config. New instances couldn’t read settings from terminated instances. The service degraded under load instead of scaling.
The “I’ll just stop it overnight” incident: Developer stopped a processing instance running a multi-day ML job to save costs. The intermediate checkpoint files? Instance store. Two days of processing, gone. They had to restart from scratch.
The hidden instance store: An engineer inherited infrastructure where previous admins had mounted instance store to /var/lib/docker. Nobody documented it. Routine maintenance stopped the instance. Suddenly, all Docker images and containers vanished. Production services failed to restart.
My advice for junior engineers: Before any EC2 goes to production, run lsblk and understand every volume. If you see unmounted NVMe devices on a “d” instance type, that’s instance store. Ask what’s stored there. Ask what happens if it disappears.
Validation and Testing
Confirm your understanding with these commands:
# Verify instance store device exists
lsblk | grep -v nvme0
# Check mount status
mount | grep instance-store
# Confirm filesystem type
df -T /mnt/instance-store
# Verify data was written
ls -la /mnt/instance-store/
After stop/start, confirm data loss:
# Device exists but is raw
sudo file -s /dev/nvme1n1
# Output should show "data" not "XFS filesystem"
# Mount point is empty
ls /mnt/instance-store/
# Should show nothing
Troubleshooting Guide
Instance launched but no instance store visible:
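Check your instance type: curl http://169.254.169.254/latest/meta-data/instance-type (on IMDSv2-only instances, include a session token in the X-aws-ec2-metadata-token header). If the family name before the dot doesn’t carry a “d” (as in c5d or m5ad) and isn’t a storage-focused family such as i3 or d2, you don’t have instance store.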
Device shows as /dev/xvd* instead of /dev/nvme*:
Older instance types or AMIs may present devices differently. Run lsblk and look for unmounted devices matching your expected size.
Mount disappeared after reboot:
Instance store survives reboots but requires remounting. Add to /etc/fstab with the nofail option to auto-mount.
Benchmark numbers seem low:
Ensure you’re using --direct=1 in fio to bypass filesystem caching. Also confirm you’re testing the correct device path.
Permission denied errors:
Instance store operations typically require root. Use sudo for format, mount, and write operations.
# Debug device issues
dmesg | grep nvme
# Requires the nvme-cli package
sudo nvme list
AWS Best Practices — Solutions Architect Level
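Security implications: Data on NVMe instance store volumes is encrypted at rest by default. AWS documents an XTS-AES-256 cipher implemented in hardware on the instance, with keys that are unique to each device and destroyed when the instance stops or terminates. You can’t disable this or supply your own key. Older non-NVMe instance store volumes may not be encrypted, so for sensitive scratch data on those types, add filesystem-level or application-level encryption. AWS also states that instance store blocks are erased when an instance stops or terminates, so the next tenant on that hardware can’t read your data.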
Reliability trade-offs: Instance store data is lost on stop, termination, or hardware failure. Never store anything you can’t regenerate. Design workloads assuming the data will vanish without warning.
Performance optimization: Instance store delivers consistent, low-latency performance because it’s local storage. Use it for scratch space, caching, and shuffle operations where performance matters more than durability. The I3 instance family is specifically designed for storage-intensive workloads.
Cost considerations: Instance store is included with certain instance types at no additional cost. However, these “d” instances cost more than their non-“d” equivalents. Calculate whether the performance gain justifies the instance cost delta versus using provisioned IOPS EBS.
When NOT to use instance store: Primary databases, critical application data, logs that must be retained, or anything without replication. If you can’t afford to lose it, don’t put it here.
Auto Scaling considerations: Never store state on instance store in Auto Scaling groups. Scale-in events terminate instances, destroying that data. Use S3, EFS, or databases for shared state.
Disaster recovery: Instance store has no backup mechanism. You cannot snapshot it. Your DR strategy must assume instance store data doesn’t exist.
Real AWS Interview Questions on Instance Store
These questions come up regularly in AWS Solutions Architect and DevOps Engineer interviews. Practice answering them before your next interview.
Q1: “A customer is running a big data processing job that requires high IOPS for temporary shuffle data. The job can be restarted if it fails. What storage would you recommend and why?”
Strong answer: Instance store is ideal here. The workload is temporary, can tolerate data loss, and benefits from high IOPS. You’d recommend an NVMe-backed storage-optimized family like i3 or i3en with large instance store volumes (d2 uses HDDs, which suit sequential throughput rather than IOPS). The key justification is that shuffle data is regenerable and performance-critical, making ephemeral storage the right trade-off.
Q2: “What happens to instance store data when you stop an EC2 instance versus when you reboot it?”
Strong answer: Reboot preserves instance store data because the instance stays on the same physical host. Stop/start erases instance store data because AWS may migrate the instance to different hardware, and the physical drives remain with the original host. This distinction is critical for operational procedures.
Q3: “A team reports that their EC2 instance lost data after routine maintenance. The instance was stopped and started. What likely happened?”
Strong answer: The team was likely storing data on instance store without realizing it. When stopped, instance store data is lost. You’d investigate by checking lsblk output, verifying the instance type (looking for “d” suffix), and reviewing what was mounted where. The fix involves either switching to EBS for persistent data or implementing proper backup procedures.
Q4: “How would you design a caching layer using instance store that survives instance failures?”
Strong answer: You wouldn’t rely on a single instance. Design a distributed caching cluster (like Redis Cluster or Memcached) across multiple instances. Each node uses instance store for performance, but data is replicated across nodes. When one instance fails, the cluster continues serving from replicas. The cache can also be populated from a persistent backing store (S3 or RDS) on startup.
Q5: “Your Auto Scaling group uses c5d instances. After a scale-in event, users report missing data. What’s the root cause?”
Strong answer: Scale-in terminates instances, which destroys instance store data. If the application stored user data on instance store, it’s gone when that instance terminates. The fix is redesigning the application to store persistent data on EBS, EFS, S3, or a database — never on instance store in Auto Scaling scenarios.
Frequently Asked Questions (FAQs)
What is EC2 instance store in AWS?
EC2 instance store provides temporary block-level storage that is physically attached to the host computer running your EC2 instance. Unlike EBS volumes which are network-attached and persist independently, instance store volumes are directly connected to the underlying hardware. This architecture delivers extremely high IOPS and low latency but comes with a critical trade-off: data on instance store is ephemeral and will be lost when the instance stops, terminates, or experiences hardware failure.
Does instance store data persist after reboot?
Yes, instance store data survives a reboot. When you reboot an EC2 instance, it remains on the same physical host, so your instance store data stays intact. However, you may need to remount the volume after reboot if you haven’t configured /etc/fstab properly. The critical distinction is between reboot (data persists) and stop/start (data is permanently lost).
What happens to instance store when you stop an EC2 instance?
When you stop an EC2 instance, all data on instance store volumes is permanently erased. AWS releases the underlying host hardware when you stop an instance, and when you start it again, you may be placed on entirely different physical hardware. Your EBS volumes reattach because they’re network-attached storage, but instance store data remains on the old host and is unrecoverable. This behavior catches many engineers by surprise and has caused significant production incidents.
Is EC2 instance store free?
Instance store storage itself doesn’t have a separate charge — it’s included with certain EC2 instance types. However, instance types that include instance store (identified by the “d” suffix like c5d, m5d, r5d) cost more than their non-“d” equivalents. You’re effectively paying for the physical NVMe drives attached to your host. Whether instance store is cost-effective depends on your workload: if you need the performance and can tolerate ephemeral data, it may be cheaper than provisioning high-IOPS EBS volumes on a cheaper instance type.
When should I use instance store instead of EBS?
Use instance store when your workload meets these criteria: the data is temporary or can be regenerated, you need maximum IOPS and minimum latency, you’re running distributed systems with data replication, or you’re processing scratch data that doesn’t need to persist. Common use cases include distributed caching (Redis, Memcached), big data shuffle space (Spark, Hadoop), video transcoding temporary files, CI/CD build artifacts, and database temporary tablespaces. Never use instance store for data you can’t afford to lose.
Can you snapshot an instance store volume?
No, you cannot create snapshots of instance store volumes. Snapshots are an EBS-specific feature. If you need to preserve data from instance store, you must manually copy it to durable storage such as S3, EBS, or EFS before stopping or terminating the instance. This limitation reinforces that instance store is designed for ephemeral, temporary data only.
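As a sketch of that manual copy, the helper below wraps aws s3 sync. The function name and the example bucket are placeholders, and it assumes the instance has credentials (for example an instance role) that allow writes to the bucket you choose.

```shell
# Copy scratch data to S3 before a stop or terminate.
# $1 = local directory, $2 = S3 URI (bucket and prefix are yours to supply)
backup_scratch_to_s3() {
  aws s3 sync "$1" "$2" --only-show-errors
}

# Usage:
#   backup_scratch_to_s3 /mnt/instance-store s3://example-bucket/instance-store-backup/
```

Run it before you stop the instance. Once the stop completes, there’s nothing left to copy.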
How do I know if my EC2 instance has instance store?
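Check the instance type specification. Instance types with instance store typically have a “d” in the family name (like c5d.large or m5d.xlarge) or belong to a storage-optimized family (like i3.large). Once running, SSH into the instance and run lsblk to see all block devices. Instance store volumes appear as unmounted NVMe devices until you format and mount them. The instance metadata path block-device-mapping/ is less useful here, because its ephemeralN entries only cover non-NVMe instance store volumes. For NVMe types, aws ec2 describe-instance-types --instance-types c5d.large --query "InstanceTypes[0].InstanceStorageInfo" reports the disk count and size.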
What is the difference between instance store and EBS?
The fundamental difference is attachment method: EBS is network-attached storage that persists independently of your EC2 instance, while instance store is physically attached to the host and is ephemeral. EBS offers durability, snapshots, encryption, and the ability to detach/reattach volumes. Instance store offers dramatically higher IOPS and lower latency but provides no persistence, no snapshots, and data loss on stop/terminate. Choose EBS for persistent data and instance store for high-performance temporary storage.
How fast is instance store compared to EBS?
Instance store typically delivers several times the IOPS of a baseline EBS gp3 volume. On a c5d.large, AWS’s published limits are roughly 20,000 random read and 9,000 random write IOPS from instance store, compared to ~3,000 IOPS from gp3, and the gap widens on larger instance sizes. Latency is sub-millisecond for instance store versus 1-2ms for EBS. Even provisioned IOPS EBS volumes can’t match instance store latency because EBS still traverses the network. For latency-sensitive, IOPS-intensive workloads, instance store performance is unmatched.
Does instance store work with Auto Scaling?
Instance store works with Auto Scaling, but you must design carefully. When Auto Scaling terminates an instance during scale-in, all instance store data is lost. Never store state or persistent data on instance store in Auto Scaling groups. Use instance store only for temporary, regenerable data like caches or scratch space. Store persistent data on EBS, EFS, S3, or databases that survive instance termination.
Conclusion and Next Steps
You’ve now experienced firsthand what ephemeral storage means in AWS. You launched an instance-store-backed EC2 instance, wrote data, stopped the instance, and confirmed that data loss is real and immediate. The benchmark results showed why engineers still choose instance store despite the risks — that performance advantage is substantial.
The myths we’ve cleared: instance store isn’t “temporary EBS.” It doesn’t have snapshots. It doesn’t persist across stop/start cycles. Rebooting is safe; stopping is not. It’s fundamentally different storage with fundamentally different use cases.
Remember: EC2 instance store is a powerful tool for the right workloads. Distributed caches, shuffle space, temporary processing — these all benefit from local NVMe performance. The key is intentional design. Know what’s ephemeral. Document it. Design for failure.
Next Lab Recommendation
Ready to dive deeper into persistent storage options?
👉 Lab 1.3: EBS Deep Dive — gp3 vs io2, Snapshots, and Performance Tuning
We’ll explore when to choose gp3 baseline performance, when io2 provisioned IOPS makes sense, and how snapshot strategies protect your data.
Related Resources:
- AWS EBS Volume Types Comparison
- EC2 Instance Types with Instance Store
- AWS Instance Store Documentation
- EC2 Instance Types
Don’t forget to terminate your c5d.large instance to avoid ongoing charges.
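If you launched or managed the instance from the CLI, cleanup can be scripted too. A minimal sketch, with a hypothetical function name, assuming you pass the real instance ID:

```shell
# Terminate the lab instance and wait until AWS confirms it is gone.
# $1 = instance ID
terminate_lab_instance() {
  aws ec2 terminate-instances --instance-ids "$1" >/dev/null || return 1
  aws ec2 wait instance-terminated --instance-ids "$1"
}

# Usage:
#   terminate_lab_instance <your-instance-id>
```

The default 8GB root volume is deleted along with the instance, so nothing from this lab should keep billing afterward.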
