Format EBS disk with User Data
1 September 2023
Additional block volumes in AWS EC2 are not deleted by default (although root volumes can also be marked to be retained). It is good practice to keep data we care about, such as MySQL database files, on a separate volume. However, when creating infrastructure with IaC solutions such as Terraform, the disks start out empty. The usual practice is to SSH into the instance and create the partitions by hand, but here we will automate this in the user data of the EC2 instance.
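As an aside, if you want an existing instance's root volume retained as well, you can flip its DeleteOnTermination flag with the AWS CLI. A minimal sketch; the instance ID is a placeholder and /dev/xvda is assumed to be your root device name:
$ aws ec2 modify-instance-attribute \
    --instance-id i-0123456789abcdef0 \
    --block-device-mappings '[{"DeviceName":"/dev/xvda","Ebs":{"DeleteOnTermination":false}}]'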
The repository for this post is available here.
Base infrastructure
Let's define some basic infrastructure. As this is only a demonstration, we will break all the rules of good taste and put our instance in a public subnet in the default VPC, with a public IP and a security group allowing SSH from anywhere.
data "aws_ami" "amazonlinux" {
owners = ["amazon"]
most_recent = true
filter {
name = "name"
values = ["al2023-ami-2023.1.*arm64"]
}
}
resource "aws_instance" "amazonlinux" {
instance_type = "t4g.micro"
ami = data.aws_ami.amazonlinux.id
key_name = aws_key_pair.kp.id
associate_public_ip_address = true
availability_zone = "eu-central-1a"
vpc_security_group_ids = [aws_security_group.ssh.id]
}
resource "aws_key_pair" "kp" {
key_name = "kp"
public_key = file("~/.ssh/id_rsa.pub")
}
data "aws_vpc" "default" { default = true }
resource "aws_security_group" "ssh" {
name = "sshsg"
vpc_id = data.aws_vpc.default.id
ingress {
description = "SSH"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
Checking how volumes are marked on our OS
In order to automate the process, we first need to know what to expect - namely, how the disks are named in the Linux filesystem. Different systems might use different naming schemes. Both Amazon Linux 2023 and Ubuntu 22.04 on t4g instances report /dev/nvmeXnY for both gp3 and st1 EBS volumes. However, the same OSes on t2.micro report /dev/xvdX, where X is a letter from the device_name property set during attachment. So, the best way is to experiment. For our purposes, we will run Amazon Linux 2023 on the t4g family of instances.
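A handy trick when experimenting: on EBS-backed NVMe devices, the device serial number carries the EBS volume ID (without the dash), so lsblk can print the mapping directly. The output below is illustrative, with made-up volume IDs:
$ lsblk -o NAME,SERIAL,SIZE,TYPE
NAME        SERIAL               SIZE TYPE
nvme0n1     vol0123456789abcdef0   8G disk
nvme1n1     vol0fedcba9876543210   8G disk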
Let's create a disk and attach it to the instance:
resource "aws_ebs_volume" "disk" {
type = "gp3"
size = 8
availability_zone = "eu-central-1a"
}
resource "aws_volume_attachment" "disk-mount" {
instance_id = aws_instance.amazonlinux.id
volume_id = aws_ebs_volume.disk.id
device_name = "/dev/sdf"
}
First phase - waiting for the disk to appear
Because EBS is in reality network-attached storage, the device may appear only some time after boot. To handle that, we will wait for a while, periodically checking whether the device file exists and sleeping if it does not. Start a new file user_data.sh in your current project directory.
#!/bin/bash
# Wait for /dev/nvme1n1 to appear, up to 30 seconds
for i in {1..10}; do
  [[ -e /dev/nvme1n1 ]] && break
  echo "Waiting for nvme1n1 ($i/10)"
  sleep 3
done
Checking the current state of the disk
Naturally, we don't want to format the disk on every boot. If the disk is already formatted and contains data, we want to use it, not lose it. For that we need to check whether the disk needs formatting at all. We will use lsblk to get the list of file systems on the volume. If the string returned by lsblk is empty, there are no partitions on the disk.
To format the disk we will use parted and mkfs. After formatting, it is also a good idea to tell the kernel to reload the partition table with partprobe and udevadm settle.
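For illustration, this is roughly what the check sees in both cases (output abbreviated):
$ lsblk /dev/nvme1n1 -no fstype   # fresh, empty volume: prints nothing
$ lsblk /dev/nvme1n1 -no fstype   # after partitioning and mkfs
ext4
Now the formatting step itself: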
if [[ -e /dev/nvme1n1 ]]; then
  # Determine the file system
  FSTYPE=$(lsblk /dev/nvme1n1 -no fstype)
  if [[ -z "$FSTYPE" ]]; then
    echo "Formatting /dev/nvme1n1"
    parted -s /dev/nvme1n1 mklabel gpt
    parted -s /dev/nvme1n1 mkpart primary ext4 0% 100%
    mkfs.ext4 /dev/nvme1n1p1
    # Reload partitions
    partprobe /dev/nvme1n1
    udevadm settle
  else
    echo "Disk is already formatted"
  fi
fi
Adding the disk to be mounted on boot
Another important aspect of automation is to make sure that the disk is mounted on boot. For that we will add an entry to /etc/fstab. Again, lsblk will tell us the disk UUID, which we can use to identify the disk in /etc/fstab even if it suddenly changes place in the system (in /dev).
if [[ -e /dev/nvme1n1 ]]; then
  # ... formatting from above
  UUID=$(lsblk /dev/nvme1n1p1 -no UUID)
  if grep -q "$UUID" /etc/fstab; then
    echo "Disk is already in fstab"
  else
    echo "Adding disk to fstab"
    echo "UUID=$UUID /mnt/data ext4 defaults 0 0" >> /etc/fstab
  fi
else # if [[ ! -e /dev/nvme1n1 ]]
  echo "Disk is not present"
fi
Lastly, we can create the mount point and mount all the disks.
mkdir -p /mnt/data
mount -a
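If you want the script to document its own work, a couple of read-only checks can be appended at the end, so their output lands in the cloud-init log:
# Optional sanity checks: show where and how the volume ended up mounted
findmnt /mnt/data
df -h /mnt/data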
Testing
Putting it all together, let's edit the instance definition to load user data on boot.
resource "aws_instance" "amazonlinux" {
instance_type = "t4g.micro"
ami = data.aws_ami.amazonlinux.id
key_name = aws_key_pair.kp.id
associate_public_ip_address = true
availability_zone = "eu-central-1a"
vpc_security_group_ids = [aws_security_group.ssh.id]
user_data = file("./user_data.sh")
}
output "public_ip" {
value = aws_instance.amazonlinux.public_ip
}
SSH to the instance and check that the disk is mounted and formatted. Also review /var/log/cloud-init-output.log to see what happened.
$ ssh ec2-user@$(terraform output --raw public_ip)
$ # In SSH
$ lsblk
NAME          MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
nvme0n1       259:0    0   8G  0 disk
├─nvme0n1p1   259:1    0   8G  0 part /
└─nvme0n1p128 259:2    0  10M  0 part /boot/efi
nvme1n1       259:3    0   8G  0 disk
└─nvme1n1p1   259:4    0   8G  0 part /mnt/data
$ cat /var/log/cloud-init-output.log
...
Cloud-init v. 22.2.2 running 'modules:final' at Tue, 22 Aug 2023 19:59:49 +0000. Up 9.00 seconds.
Waiting for nvme1n1 (1/10)
Waiting for nvme1n1 (2/10)
Waiting for nvme1n1 (3/10)
Formatting /dev/nvme1n1
mke2fs 1.46.5 (30-Dec-2021)
Creating filesystem with 2096640 4k blocks and 524288 inodes
Filesystem UUID: 6f5a14d0-0334-44fa-af61-fced753ba017
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632
Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done
Adding disk to fstab
Cloud-init v. 22.2.2 finished at Tue, 22 Aug 2023 19:59:59 +0000. Datasource DataSourceEc2. Up 19.04 seconds
Now let's leave a trace behind, so we can later verify that the script works as expected and doesn't wipe already-formatted disks.
$ sudo sh -c 'echo "Hello world $(date)" >> /mnt/data/hello.txt'
$ cat /mnt/data/hello.txt
Hello world Tue Aug 22 20:05:33 UTC 2023
Recreating the instance
Destroy the instance with Terraform, either by tainting it or by destroying with -target; both variants are shown below. Remember not to destroy the data volume, so that our data is preserved. If only the instance and the volume attachment are destroyed, you are on the right track.
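The taint route marks the instance for replacement on the next apply (on newer Terraform versions, apply -replace is the preferred spelling):
$ terraform taint aws_instance.amazonlinux
$ # or, equivalently:
$ terraform apply -replace=aws_instance.amazonlinux
With the targeted destroy, the plan shows exactly what will be removed: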
$ terraform destroy -target aws_instance.amazonlinux
# aws_instance.amazonlinux will be destroyed
# aws_volume_attachment.disk-mount will be destroyed
Plan: 0 to add, 0 to change, 2 to destroy.
$ terraform apply
$ ssh ec2-user@$(terraform output --raw public_ip)
Now let's read the cloud-init log again and check whether the file still exists.
# In SSH
$ cat /var/log/cloud-init-output.log
...
Cloud-init v. 22.2.2 running 'modules:config' at Tue, 22 Aug 2023 20:12:27 +0000. Up 7.96 seconds.
Cloud-init v. 22.2.2 running 'modules:final' at Tue, 22 Aug 2023 20:12:28 +0000. Up 8.64 seconds.
Waiting for nvme1n1 (1/10)
Waiting for nvme1n1 (2/10)
Waiting for nvme1n1 (3/10)
Disk is already formatted
Adding disk to fstab
Cloud-init v. 22.2.2 finished at Tue, 22 Aug 2023 20:12:37 +0000. Datasource DataSourceEc2. Up 18.07 seconds
$ cat /mnt/data/hello.txt
Hello world Tue Aug 22 20:05:33 UTC 2023
It works!