
AWS Storage Core Services: Block Storage with Amazon EBS
Master Amazon Elastic Block Store (EBS), AWS's high-performance block storage solution. Understand the characteristics of block storage, various EBS volume types (SSD, HDD), their performance metrics, and how they provide persistent storage for Amazon EC2 instances.
Beyond Objects: Persistent Block Storage with Amazon EBS
Welcome back to Module 11: Storage Core Services! In the previous lessons, we explored Amazon S3, AWS's virtually unlimited object storage. While S3 is ideal for static files, backups, and data lakes, it's not designed for operating systems and applications that need direct, low-latency access to storage at the block level. This brings us to Amazon Elastic Block Store (EBS), AWS's high-performance block storage solution. For the AWS Certified Cloud Practitioner exam, understanding EBS is fundamental: what block storage is, its characteristics, its volume types, and how it integrates with Amazon EC2.
This lesson will extensively cover Amazon EBS, explaining what block storage is, its key characteristics (persistence, high performance, attachment to EC2 instances), the different EBS volume types (e.g., General Purpose SSD, Provisioned IOPS SSD, Throughput Optimized HDD, Cold HDD), and common use cases. We'll include a Mermaid diagram illustrating the crucial relationship between EC2 instances and EBS volumes, solidifying your understanding of this vital component of AWS infrastructure.
1. What is Block Storage?
Block storage is a data storage method that breaks data into uniformly sized blocks, which can then be stored separately on a storage device. Each block has a unique identifier, and the operating system of the server can retrieve and manipulate these blocks as if they were part of a single physical hard drive.
Key Characteristics:
- Raw Disk Access: Provides raw, unformatted storage. The operating system of the instance manages the file system (e.g., ext4, NTFS) on top of the block device.
- Low Latency: Designed for performance-sensitive applications, offering very low latency for read/write operations.
- Persistent Storage: Data stored on block devices persists independently of the compute instance's lifecycle. Stopping the instance does not erase the data, and a volume can outlive the instance itself unless it is deleted, either explicitly or through a delete-on-termination setting.
- Attached to a Single Server: Typically attached to one server at a time, making it ideal for host operating systems and traditional databases that require direct disk access.
- Used by Operating Systems: Operating systems are installed on block storage volumes.
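To make block addressing concrete, here is a minimal sketch that treats an ordinary file as a stand-in for a block device and reads one fixed-size block by its index. The function name `read_block` and the 4-byte block size are illustrative assumptions; real devices use much larger blocks (commonly 512 bytes or 4 KiB).

```shell
# Read one fixed-size block from a file that stands in for a block device.
# Usage: read_block <path> <block_size_bytes> <block_index>
read_block() {
  # skip=N jumps over N whole blocks, so block N starts at byte N * block_size.
  dd if="$1" bs="$2" skip="$3" count=1 2>/dev/null
}

# Example: a 12-byte "disk" made of three 4-byte blocks.
disk=$(mktemp)
printf 'AAAABBBBCCCC' > "$disk"
read_block "$disk" 4 1   # prints BBBB
rm -f "$disk"
```

The caller addresses data by block number rather than by file name, which is why the operating system has to layer a file system on top.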
2. Introducing Amazon Elastic Block Store (EBS)
Amazon Elastic Block Store (EBS) provides persistent block storage volumes for use with Amazon EC2 instances. Each EBS volume is automatically replicated within its Availability Zone to protect you from component failure, offering high availability and durability. EBS volumes are designed for workloads that require high throughput and low-latency access to data.
Key Features:
- Persistence: Data on an EBS volume persists independently of the life of the instance.
- Availability Zone Specific: An EBS volume must be in the same Availability Zone as the EC2 instance it's attached to.
- Snapshots: You can take point-in-time snapshots of your EBS volumes, which are stored in Amazon S3. Snapshots can be used for backup, disaster recovery, or to create new EBS volumes.
- Encryption: EBS volumes can be encrypted using AWS KMS. Encryption is transparent to the EC2 instance and applications.
- Elasticity: You can easily change the size and performance characteristics of your EBS volumes on the fly, without detaching them from your instance, usually without downtime.
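As a rough illustration of the elasticity feature, the sketch below wraps `aws ec2 modify-volume` to grow a gp3 volume and adjust its IOPS and throughput in place. The helper name `resize_volume` and the volume ID in the example are placeholders, not values from this lesson.

```shell
# Hypothetical helper: change size and gp3 performance of a volume in place.
# Usage: resize_volume <volume-id> <new-size-GiB> <iops> <throughput-MiB/s>
resize_volume() {
  aws ec2 modify-volume \
    --volume-id "$1" \
    --size "$2" \
    --iops "$3" \
    --throughput "$4" \
    --query 'VolumeModification.ModificationState' --output text
}

# Example (placeholder volume ID):
#   resize_volume vol-0123456789abcdef0 20 4000 250
```

Volume size can only be increased this way, and after a size increase the file system inside the instance still has to be extended before the extra space is usable.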
3. EBS Volume Types: Tailoring Performance and Cost
AWS offers different EBS volume types, each optimized for specific workloads and performance characteristics. Choosing the right volume type is a key aspect of cost and performance optimization.
a. SSD-Backed Volumes (for transactional workloads)
- General Purpose SSD (gp2/gp3):
- Use Cases: Most workloads, including boot volumes, development/test environments, and low-latency interactive applications. It's a good balance of price and performance.
- Performance: gp3 offers independent control over IOPS and throughput, and is a cost-effective option.
- Provisioned IOPS SSD (io1/io2/io2 Block Express):
- Use Cases: Critical, performance-intensive applications that require extremely low latency and consistent, high-performance I/O (e.g., large relational databases, transactional workloads).
- Performance: You provision a specific IOPS rate. io2 Block Express offers the highest performance, with sub-millisecond latency.
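To see why independent control over IOPS matters, the following sketch computes the baseline IOPS of a gp2 volume, which is tied to its size. The figures (3 IOPS per GiB, a floor of 100, a cap of 16,000) are taken from AWS documentation as understood at the time of writing and are not stated in this lesson, so verify them against the current EBS documentation. By comparison, gp3 starts at a 3,000 IOPS baseline regardless of size.

```shell
# Baseline IOPS for a gp2 volume of the given size in GiB.
# Assumed rule: 3 IOPS per GiB, never below 100, never above 16000.
gp2_baseline_iops() {
  iops=$(( $1 * 3 ))
  if [ "$iops" -lt 100 ]; then iops=100; fi
  if [ "$iops" -gt 16000 ]; then iops=16000; fi
  echo "$iops"
}

gp2_baseline_iops 10     # prints 100
gp2_baseline_iops 500    # prints 1500
```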
b. HDD-Backed Volumes (for throughput-intensive workloads)
- Throughput Optimized HDD (st1):
- Use Cases: Frequently accessed, throughput-intensive workloads (e.g., big data, data warehouses, log processing).
- Performance: Optimized for large, sequential I/O operations. Cannot be a boot volume.
- Cold HDD (sc1):
- Use Cases: Infrequently accessed data, lowest cost HDD option (e.g., archival, large-volume colder data).
- Performance: Lowest cost and performance HDD option. Cannot be a boot volume.
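The guidance above can be condensed into a small lookup. This is only a sketch: the workload labels are made up for illustration, and a real choice would also weigh cost, size, and measured I/O patterns.

```shell
# Map a coarse workload label (hypothetical labels) to the EBS volume type
# suggested in the lists above.
# Usage: suggest_volume_type <workload>
suggest_volume_type() {
  case "$1" in
    boot|dev-test|general)    echo gp3 ;;
    critical-db)              echo io2 ;;
    big-data|log-processing)  echo st1 ;;
    archive)                  echo sc1 ;;
    *) echo "unknown workload: $1" >&2; return 1 ;;
  esac
}

suggest_volume_type boot   # prints gp3
```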
Visualizing EBS Volume Types and Use Cases
graph TD
A[EBS Volume Types] --> B{SSD Backed}
B --> B1[General Purpose SSD - gp2/gp3]
B1 --> B1a[Boot Volumes]
B1 --> B1b[Dev/Test]
B1 --> B1c[Low-Latency Apps]
B --> B2[Provisioned IOPS SSD - io1/io2]
B2 --> B2a[Critical Databases]
B2 --> B2b[High-Performance Transactional Workloads]
A --> C{HDD Backed}
C --> C1[Throughput Optimized HDD - st1]
C1 --> C1a[Big Data Analytics]
C1 --> C1b[Log Processing]
C --> C2[Cold HDD - sc1]
C2 --> C2a[Archival Data]
C2 --> C2b[Infrequently Accessed Colder Data]
This diagram illustrates the primary EBS volume types and their optimal use cases based on performance and cost requirements.
4. Relationship between EC2 Instances and EBS Volumes
EBS volumes are closely tied to EC2 instances.
- Attachment: An EBS volume can be attached to only one EC2 instance at a time (with the exception of Multi-Attach EBS for io1/io2 volumes, which is an advanced feature).
- Availability Zone Restriction: An EBS volume can only be attached to an EC2 instance that is in the same Availability Zone. This is a critical point for the exam. If your EC2 instance is in us-east-1a, its EBS volume must also be in us-east-1a.
- Root Volume: Every EC2 instance requires an EBS volume (or instance store) as its root device volume, where the operating system is installed.
- Deletion Behavior: By default, when an EC2 instance is terminated, its root EBS volume is deleted. Any additional EBS volumes attached to the instance are not deleted by default, but this behavior can be configured.
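The deletion behavior can be changed per attached volume. The sketch below wraps `aws ec2 modify-instance-attribute` to flip the DeleteOnTermination flag; the helper name and the IDs in the example are placeholders.

```shell
# Hypothetical helper: choose whether an attached volume is deleted
# when its instance is terminated.
# Usage: set_delete_on_termination <instance-id> <device-name> <true|false>
set_delete_on_termination() {
  aws ec2 modify-instance-attribute \
    --instance-id "$1" \
    --block-device-mappings \
    "[{\"DeviceName\":\"$2\",\"Ebs\":{\"DeleteOnTermination\":$3}}]"
}

# Example (placeholder instance ID): delete the data volume along with the instance.
#   set_delete_on_termination i-0123456789abcdef0 /dev/sdf true
```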
Visualizing EC2 Instance and EBS Volume Relationship
graph TD
subgraph "Availability Zone 1 (us-east-1a)"
EC2Instance[EC2 Instance]
EBSVolume[EBS Volume]
end
EC2Instance -- Attached to --> EBSVolume
subgraph "Availability Zone 2 (us-east-1b)"
EC2Instance2[EC2 Instance 2]
EBSVolume2[EBS Volume 2]
end
EC2Instance2 -- Attached to --> EBSVolume2
EC2Instance -.- NoDirectAttach[Cannot directly attach to] -.- EBSVolume2
This diagram illustrates that an EC2 instance and its attached EBS volume must reside within the same Availability Zone.
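A script can check this restriction before attempting an attachment. The sketch below compares the Availability Zone reported for an instance with the one reported for a volume; the function names are illustrative, and error handling is kept minimal.

```shell
# Print the Availability Zone of an instance.
instance_az() {
  aws ec2 describe-instances --instance-ids "$1" \
    --query 'Reservations[0].Instances[0].Placement.AvailabilityZone' --output text
}

# Print the Availability Zone of a volume.
volume_az() {
  aws ec2 describe-volumes --volume-ids "$1" \
    --query 'Volumes[0].AvailabilityZone' --output text
}

# Succeed only if the instance and the volume are in the same AZ.
# Usage: can_attach <instance-id> <volume-id>
can_attach() {
  [ "$(instance_az "$1")" = "$(volume_az "$2")" ]
}

# Example (placeholder IDs):
#   can_attach i-0123456789abcdef0 vol-0123456789abcdef0 && echo "same AZ"
```

If the zones differ, the usual route is to snapshot the volume and create a new volume from that snapshot in the instance's zone.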
5. EBS Snapshots: Backups and DR
EBS Snapshots are point-in-time backups of your EBS volumes. Snapshots are incremental backups, meaning only the blocks that have changed since the last snapshot are saved, which makes them cost-effective.
- Stored in S3: Snapshots are stored in Amazon S3, though you don't interact with them directly via S3 APIs.
- Cross-AZ/Region: Snapshots can be used to create new EBS volumes in any Availability Zone of the same Region, and they can be copied to other AWS Regions. This is a critical mechanism for disaster recovery.
- Encryption: Snapshots of encrypted volumes are automatically encrypted. You can also encrypt the data from an unencrypted volume by enabling encryption when you copy its snapshot or restore a new volume from it.
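A minimal sketch of the snapshot workflow for disaster recovery follows: take a snapshot, wait for it to complete, then copy it to a second Region. The helper name, the description strings, and the Regions in the example are assumptions for illustration.

```shell
# Hypothetical helper: snapshot a volume and copy the snapshot to another Region.
# Prints the ID of the copied snapshot.
# Usage: snapshot_and_copy <volume-id> <source-region> <dest-region>
snapshot_and_copy() {
  snap_id=$(aws ec2 create-snapshot \
    --region "$2" --volume-id "$1" \
    --description "Backup of $1" \
    --query 'SnapshotId' --output text) || return 1

  # A snapshot must reach the 'completed' state before it can be copied.
  aws ec2 wait snapshot-completed --region "$2" --snapshot-ids "$snap_id" || return 1

  # copy-snapshot is issued against the destination Region.
  aws ec2 copy-snapshot \
    --region "$3" --source-region "$2" \
    --source-snapshot-id "$snap_id" \
    --description "Copy of $snap_id from $2" \
    --query 'SnapshotId' --output text
}

# Example (placeholder volume ID):
#   snapshot_and_copy vol-0123456789abcdef0 us-east-1 us-west-2
```

Adding `--encrypted` to the `copy-snapshot` call is one way to produce an encrypted copy of an unencrypted snapshot.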
6. Common Use Cases for Amazon EBS
- Boot Volumes: The primary storage for the operating system of EC2 instances.
- Transactional Workloads: Databases (relational and NoSQL) that require persistent, low-latency block storage.
- Throughput-Intensive Applications: Big data and data warehousing applications.
- Development and Test: Persistent storage for development environments.
- Backup and Disaster Recovery: Using snapshots to back up data and restore volumes.
7. Practical Example: Creating and Attaching an EBS Volume
This AWS CLI example demonstrates how to create an EBS volume and attach it to an EC2 instance.
# 1. Find an existing EC2 instance ID and its Availability Zone
# Replace 'MyRunningInstance' with the Name tag of your EC2 instance.
# Ensure the instance is in a 'running' state.
INSTANCE_ID=$(aws ec2 describe-instances \
--filters "Name=tag:Name,Values=MyRunningInstance" "Name=instance-state-name,Values=running" \
--query 'Reservations[0].Instances[0].InstanceId' --output text)
# With --output text, a query that matches nothing prints the literal string "None".
if [ -z "$INSTANCE_ID" ] || [ "$INSTANCE_ID" = "None" ]; then
echo "No running instance with that Name tag was found" >&2
exit 1
fi
INSTANCE_AZ=$(aws ec2 describe-instances \
--instance-ids "$INSTANCE_ID" \
--query 'Reservations[0].Instances[0].Placement.AvailabilityZone' --output text)
echo "Found Instance ID: $INSTANCE_ID in AZ: $INSTANCE_AZ"
# 2. Create an EBS volume in the SAME Availability Zone
VOLUME_ID=$(aws ec2 create-volume \
--volume-type gp3 \
--size 10 \
--availability-zone $INSTANCE_AZ \
--tag-specifications 'ResourceType=volume,Tags=[{Key=Name,Value=MyEBSVolume}]' \
--query 'VolumeId' --output text)
echo "Created EBS Volume ID: $VOLUME_ID"
# 3. Wait until the new volume is 'available', then attach it to the EC2 instance
aws ec2 wait volume-available --volume-ids "$VOLUME_ID"
aws ec2 attach-volume \
--volume-id "$VOLUME_ID" \
--instance-id "$INSTANCE_ID" \
--device /dev/sdf
echo "Attached EBS Volume $VOLUME_ID to Instance $INSTANCE_ID"
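# To verify, you would SSH into the EC2 instance and run 'lsblk' or 'sudo fdisk -l'
# to see the newly attached (but unformatted) disk. On Nitro-based instances it
# appears under an NVMe name such as /dev/nvme1n1 rather than /dev/sdf.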
# Clean up (conceptual steps)
# aws ec2 detach-volume --volume-id $VOLUME_ID --instance-id $INSTANCE_ID
# aws ec2 wait volume-available --volume-ids $VOLUME_ID
# aws ec2 delete-volume --volume-id $VOLUME_ID
Explanation:
- aws ec2 describe-instances: Used to find a running EC2 instance and its Availability Zone. This is crucial because an EBS volume must be in the same AZ as the EC2 instance it attaches to.
- aws ec2 create-volume: Creates a new EBS volume. We specify gp3 (General Purpose SSD), a size of 10 GiB, and critically, the availability-zone must match the instance.
- aws ec2 attach-volume: Attaches the newly created EBS volume to the target EC2 instance, specifying a device name (e.g., /dev/sdf).
This code snippet highlights the block-level persistence and the AZ restriction, which are key characteristics of EBS.
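Because a freshly attached volume is unformatted, a typical next step on a Linux instance is to create a file system and mount it. The sketch below is one way to do that, run from inside the instance. The helper names, the ext4 choice, the device name /dev/xvdf, and the mount point /data are all assumptions; confirm the actual device name with lsblk first.

```shell
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Create an ext4 file system on a device and mount it.
# WARNING: mkfs erases any data already on the device; use it only on a new, empty volume.
# Usage: format_and_mount <device> <mount-point>
format_and_mount() {
  run sudo mkfs -t ext4 "$1" &&
  run sudo mkdir -p "$2" &&
  run sudo mount "$1" "$2"
}

# Preview the commands without touching any disk:
DRY_RUN=1 format_and_mount /dev/xvdf /data
```

The dry-run switch is there so the commands can be reviewed before anything destructive runs.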
Conclusion: Persistent Storage for Your EC2 Instances
Amazon Elastic Block Store (EBS) is the indispensable block storage service for Amazon EC2 instances, providing persistent, high-performance storage that behaves like a traditional hard drive. Understanding what block storage is, the various EBS volume types, their performance characteristics, and the tight coupling with EC2 instances is critical for the AWS Certified Cloud Practitioner exam. By leveraging EBS, you can ensure your applications have the necessary durable and performant storage, along with robust backup and disaster recovery capabilities through EBS Snapshots.
Knowledge Check
A developer needs to ensure that the data on their Amazon EC2 instance persists even if the instance is stopped or terminated. The application running on the EC2 instance requires direct, low-latency access to the storage. Which AWS storage service should be used?