Building and Running Custom AMIs on AWS Using Packer and Terraform

  • September 11, 2019

Introduction

In this article we are going to talk about two open-source infrastructure-as-code tools that we use at Flugel. These tools are Packer, to build machine images for different platforms, and Terraform, to manage infrastructure resources.

 

By using the two in combination, it’s possible to create infrastructure-as-code solutions that automatically build and run custom machine images: provisioning an EC2 instance on AWS from a custom AMI, for example.

 

For this article we’ll examine one particular use case: we’ll provision an EC2 instance that allows us to benchmark an HTTP endpoint. To accomplish this, we’ll first use Packer to create an AMI with the HTTP test and benchmarking tool siege installed. Then, using Terraform, we’ll provision an EC2 instance using this AMI.

 

You can find the source code for the examples used in this article at: https://github.com/flugel-it/packer-terraform-aws-article  

 

Building an AMI with Packer

Packer is an open-source tool by HashiCorp that automates the creation of machine images for different platforms. Developers specify the machine configuration in a JSON file called a template, and then run Packer to build the image.

 

One key feature of Packer is its ability to create images targeted at different platforms, all from the same specification, so you can produce machine images of different types without repetitive coding.

 

You can get Packer and its documentation at the official Packer site.

 

Installing Packer

There are various options for installing Packer depending on your platform and preferences. Go to Packer’s Getting Started page for detailed instructions. Keep in mind that using the precompiled binary is the simplest option.  
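
 

Once installed, you can check that the binary is available on your PATH by printing its version (the exact version you see will differ):


$ packer version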

 

The Template File

As we’ve said, templates are JSON files that define the steps required to build a machine image. Packer uses the information specified in the template to create the images.

 

The components most often configured through the template files are the builders and the provisioners.

 

Builders are components that are able to create a machine image for a single platform. Each template can define multiple builders to target different platforms. There are plenty of builders from which to choose. In this article we use the AWS AMI builder to create an AWS AMI.

 

Provisioners are components of Packer that install and configure software within a running machine prior to creating a static image from that machine. Packer has many provisioners you can use, such as file, to copy files to the machine, shell, to execute commands in the machine, and many more.

 

Let’s introduce the following template file, which we’ll save as template.json, and then analyse it. We are going to use this template to build an AWS AMI that contains the siege HTTP benchmarking tool, using Ubuntu 18.04 as the base image.


{
   "variables": {
       "ami_name_prefix": "{{env `AMI_NAME_PREFIX`}}"
   },
   "builders": [{
       "ami_description": "An AMI with HTTP benchmarking tools, based on Ubuntu.",
       "ami_name": "{{user `ami_name_prefix`}}-{{isotime | clean_resource_name}}",
       "instance_type": "t2.micro",
       "region": "{{user `region`}}",
       "source_ami_filter": {
           "filters": {
               "architecture": "x86_64",
               "block-device-mapping.volume-type": "gp2",
               "name": "*ubuntu-bionic-18.04-amd64-server-*",
               "root-device-type": "ebs",
               "virtualization-type": "hvm"
           },
           "most_recent": true,
           "owners": [
               "099720109477"
           ]
       },
       "ssh_username": "ubuntu",
       "type": "amazon-ebs"
   }],
   "provisioners": [{
           "inline": [
               "echo 'Sleeping for 30 seconds to give Ubuntu enough time to initialize (otherwise, packages may fail to install).'",
               "sleep 30",
               "sudo apt-get update",
               "sudo apt-get dist-upgrade -y"
           ],
           "type": "shell"
       },
       {
           "type": "file",
           "source": "{{template_dir}}/urls.txt",
           "destination": "/home/ubuntu/urls.txt"
       },
       {
           "scripts": [
               "{{template_dir}}/install-tools.sh"
           ],
           "type": "shell"
       }
   ]
}

 

The above template contains three main sections: variables, builders and provisioners.

 

Variables Section

To avoid hardcoded values, the variables section lets you define variables to be used in other template sections. Users can override variables’ values by passing them as options to Packer when building the images. This allows users to customize the build process without changing the template file.

 

In our template we defined two variables: ami_name_prefix, which will be the prefix of the AMI name, and region, the AWS region in which to build the AMI. We used the env function in the definition of ami_name_prefix to set its default value from the environment variable AMI_NAME_PREFIX.
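
 

For example, you could set the prefix through the environment variable before building (the prefix value here is just an example; you can also override it with the -var option, as we’ll do later):


$ export AMI_NAME_PREFIX=http-benchmarking
$ packer build template.json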

 

Builders Section

In the builders section we specified one builder of type amazon-ebs. This builder creates an AMI backed by an EBS volume. These are the steps run by this builder:

  1. Create a new EC2 instance from the base AMI.
  2. Run the provisioners on the EC2 instance.
  3. Create the new AMI from the running EC2 instance.
  4. Destroy the EC2 instance.

 

Different builders have different parameters. The amazon-ebs builder requires that we specify:

  • ami_name: The name of the resulting AMI.
  • ami_description: A description for the AMI.
  • source_ami_filter: How to find the base AMI.
  • instance_type: Instance type to use during the build; it can be different from the type you plan to use when running the AMI.
  • region: Name of the AWS region in which the AMI will be created.
  • ssh_username: SSH username to connect to the building instance.

 

Provisioners Section

Finally, the provisioners section contains the steps needed to install the packages and files that transform the base AMI into our custom one.

 

The above template specifies these steps:

  1. Upgrade the Ubuntu packages.
  2. Copy urls.txt, which contains a list of URLs we can feed to siege.
  3. Run the install-tools.sh script to install siege and other packages.

 

Both urls.txt and install-tools.sh are inside the same directory as the template, so we can use the template_dir Packer function when passing their paths to the provisioners.

 

The urls.txt file simply lists the URLs to be fed to siege:

https://example.com/
https://example.com/categories
https://example.com/login
https://example.com/productA
https://example.com/productB
https://example.com/productC

 

install-tools.sh installs the bash-completion and siege packages using APT.


#!/bin/bash
 
set -e
 
sudo apt-get install -y bash-completion siege

 

Building the AMI

Prior to building the AMI, it’s a good idea to validate the syntactical correctness of the template file. We can do this using Packer’s validate command, which verifies that the template is a valid JSON file and satisfies Packer’s template schema.


$ packer validate template.json
Template validated successfully.

 

“Template validated successfully” is the expected output if the template is valid.

 

Before building the AMI you have to configure your AWS credentials. You can use the AWS CLI’s configure command or export your credentials as environment variables. You can learn about these options at https://aws.amazon.com/cli/ and https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html.
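
 

For example, the environment-variable approach looks like this (the access key values below are placeholders for your own credentials; the region matches the one used in this article):


$ export AWS_ACCESS_KEY_ID=<your-access-key-id>
$ export AWS_SECRET_ACCESS_KEY=<your-secret-access-key>
$ export AWS_DEFAULT_REGION=us-west-2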

 

With your AWS credentials configured and the template validated, you are ready to use Packer to build the AMI. Run the following command to start building the AMI.

 

$ packer build -var 'ami_name_prefix=http-benchmarking' template.json

 

Note that we passed http-benchmarking as the AMI name prefix. The full AMI name, as defined in the template, is built from this prefix and the current timestamp.

 

The process can take several minutes. After building the AMI, Packer shows a message containing the AMI id:

 

==> Builds finished. The artifacts of successful builds are:
--> amazon-ebs: AMIs were created:
us-west-2: ami-04236c7ea2f337c88

 

You can locate the new AMI, and launch instances from it, in the AWS console under the EC2 service. In the next section we’re going to use Terraform to provision an EC2 instance using the AMI we’ve just created.
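
 

As an alternative to the console, the AWS CLI can also list the AMIs you own that match the name prefix. This is just a sketch; adjust the filter and region to your setup:


$ aws ec2 describe-images --owners self --filters 'Name=name,Values=http-benchmarking-*' --region us-west-2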

 

Provisioning an Instance using Terraform

Terraform is another open-source tool by HashiCorp. It provides a domain-specific language that enables developers to manage infrastructure resources using declarative configuration files.

 

Developers use the Terraform language to configure the infrastructure by specifying resources and their desired state. Terraform then uses this information to make the API calls needed to create or update the infrastructure. To make those calls, Terraform uses components called providers, which are responsible for interacting with external APIs. Terraform already includes providers for most cloud and IaaS platforms.

 

The main constructs of the Terraform language used by developers are:

  • resource: To specify an infrastructure element and its configuration, an EC2 instance, for example.
  • provider: To configure provider parameters, such as version, region and credentials.
  • data: To fetch some information from a provider, for example to find an AMI id by its name.
  • variables: Like Packer variables, used to avoid repeating code or hardcoding values. Users can override variables’ values by passing them as options to Terraform.
  • output: Used to expose information about the resources managed by Terraform to external systems, for example to expose the IP address of an EC2 instance managed by Terraform.

 

After each run Terraform saves the current infrastructure state in a state file. This state allows Terraform to map real-world resources to the resources specified in the configuration. This information is used in successive Terraform runs to detect drifts between the actual infrastructure and the configuration.

 

By default, the Terraform state is stored locally, but Terraform supports different options to store it remotely, which is the best approach when working in a team.

 

Configuration to provision an instance using our custom AMI

In this section we are going to review a Terraform configuration that provisions an EC2 instance using the AMI we built with Packer. This configuration also opens port 22 so we can reach the instance over SSH.

 

Terraform uses all .tf files in the directory as the infrastructure configuration. This convenience allows you to organize the configuration code into multiple files, depending on your needs.

 

For our purpose we have six .tf files:

terraform/
├── variables.tf
├── main.tf
├── ami.tf
├── security.tf
├── instance.tf
└── outputs.tf

 

The variables.tf file

This file defines the infrastructure parameters that the user should be able to override easily: how to find the AMI by its name and owner, which instance type to use, the keypair, the CIDRs from which to allow SSH connections, and a tag applied to all the resources to make them easy to spot in the AWS console.

 

The following code defines the variables:


variable "ami_name_filter" {
   description = "Filter to use to find the AMI by name"
   default = "http-benchmarking-*"
}
 
variable "ami_owner" {
   description = "Filter for the AMI owner"
   default = "self"
}
 
variable "instance_type" {
   description = "Type of EC2 instance"
   default = "t2.micro"
}
 
variable "keypair" {
   description = "Key pair to access the EC2 instance"
   default = "default"
}
 
variable "allow_ssh_from_cidrs" {
   description = "List of CIDRs allowed to connect to SSH"
   default = ["0.0.0.0/0"]
}
 
variable "tag_name" {
   description = "Value of the tags Name to apply to all resources"
   default = "http-benchmarking"
}

 

Here we have specified default values; later we’ll see how to override them.

 

The main.tf file

Despite the name main, there is nothing special about this file. We use it to initialize the AWS provider and to declare a local value, common_tags, so we don’t have to repeat the tags expression in every resource.


This is the main.tf code:


provider "aws" {
 version = "~> 2.0"
}
 
locals {
 common_tags = {
   Name = "${var.tag_name}"
 }
}

 

The ami.tf file

This file uses a Terraform data element to fetch AMI information from the AWS provider. We need it because creating the instance requires the AMI id, while we only have the AMI name and owner.


In the following code we created an aws_ami data element and named it ami.


data "aws_ami" "ami" {
 most_recent      = true
 owners           = ["${var.ami_owner}"]
 
 filter {
   name   = "name"
   values = ["${var.ami_name_filter}*"]
 }
 filter {
   name   = "root-device-type"
   values = ["ebs"]
 }
 filter {
   name   = "virtualization-type"
   values = ["hvm"]
 }
}

 

Here, the AWS provider fetches the information for the AMI matching the specified filters.

 

The security.tf file

In this file we defined a Security Group resource in AWS that allows incoming connections to port 22 from the CIDR blocks specified in the variable allow_ssh_from_cidrs.


resource "aws_security_group" "sg" {
 ingress {
   cidr_blocks = "${var.allow_ssh_from_cidrs}"
 
   from_port = 22
   to_port   = 22
   protocol  = "tcp"
 }
 
 egress {
   from_port   = 0
   to_port     = 0
   protocol    = "-1"
   cidr_blocks = ["0.0.0.0/0"]
 }
 
 tags = "${local.common_tags}"
}

 

Here the egress block allows outbound connections from the instance to anywhere.

 

The instance.tf file

This file defines the EC2 instance resource and its properties, such as AMI, instance type, security group, key name, and tags.


resource "aws_instance" "instance" {
 ami             = "${data.aws_ami.ami.id}"
 instance_type   = "${var.instance_type}"
 security_groups = ["${aws_security_group.sg.name}"]
 key_name        = "${var.keypair}"
 tags            = "${local.common_tags}"
}

 

Note that we used the AMI id fetched by the data element in ami.tf. We also referenced the security group defined in security.tf through its autogenerated name.

 

The outputs.tf file

Finally, we declared two outputs to make the instance IP address and ID available outside Terraform.


output "ip" {
 value = "${aws_instance.instance.public_ip}"
}
 
output "ec2instance" {
 value = "${aws_instance.instance.id}"
}

 

Provisioning the infrastructure

In this section we are going to see how to override variables’ values, use Terraform to validate the configuration, check differences between actual resources and the configuration, and apply the configuration to provision the infrastructure.

 

First, we are going to use a .tfvars file to specify the value of each variable that we want to override. Naming this file terraform.tfvars causes Terraform to load it automatically.

ami_name_filter = "http-benchmarking-*"
instance_type = "t2.medium"
keypair = "mykeypair"
allow_ssh_from_cidrs = ["190.18.32.44/32"]
tag_name = "MyTerraformProvisionedInfra"

 

We have set the AMI name filter, the instance type, the keypair, the CIDR so that SSH connections are allowed only from our computer, and tag_name to identify our resources.

 

Keep in mind that the keypair has to exist in the same AWS region where you are creating the instance.
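
 

Also, if this is the first time you run Terraform in this directory, initialize it so Terraform downloads the AWS provider plugin:


$ terraform init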

 

Now we can use Terraform’s validate command to verify that the configuration is valid:

 

$ terraform validate
Success! The configuration is valid.

 

Next, using Terraform’s plan command we can look at the changes that Terraform is going to make to the infrastructure. The output is like a “diff” between the configuration and the actual state.
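
 

Running it from the configuration directory looks like this:


$ terraform plan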

 

To actually make the changes to the infrastructure, you need to run Terraform’s apply command. It will show you the changes again and ask for confirmation.
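
 

Running apply looks like this; answer “yes” when prompted to proceed:


$ terraform apply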

 

When Terraform finishes making changes it will show the values of the outputs that we defined in the configuration. It will look like this:

Outputs:
 
ec2instance = i-00ba4d7f559e0cdfc
ip = 35.166.54.104
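
You can also retrieve these values at any time with Terraform’s output command, for example:


$ terraform output ip
35.166.54.104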

Now the instance is up and running; we can connect to it using SSH.

 

Connecting to the instance

With both the IP address displayed in the Terraform outputs and your key pair, you can connect to the instance using SSH.

 

$ ssh -i mykeypair.pem ubuntu@35.166.54.104

 

Here we connected using the key pair mykeypair.pem and the instance IP address 35.166.54.104.

 

Once connected to the instance you can execute any command on it. By running ls, we can verify that the urls.txt file was copied during the AMI build with Packer.

 

ubuntu@ip-172-31-43-205:~$ ls
README  urls.txt

 

Finally we can use the instance to run the HTTP benchmarks using siege and the urls.txt file:

ubuntu@ip-172-31-43-205:~$ siege -f urls.txt

 

We have built the infrastructure that allows us to run these benchmarks independently of our office/home connection speed.

 

One great benefit of using infrastructure-as-code tools for this is that the process is completely automated and can be repeated whenever we need to run the benchmarks again.

 

Freeing up resources

When you are no longer using the instance, you can use Terraform’s destroy command to free all the resources it has created. This is useful to avoid wasting money on AWS.

 

$ terraform destroy

 

Terraform will show you the changes it’s going to apply and ask for confirmation.


Since Terraform did not create the AMI, it won’t delete it when you run the destroy command. So, to remove everything, you also need to deregister the AMI (and delete its associated snapshot) from the AWS console.
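
 

If you prefer the command line, the AWS CLI can also deregister the image and delete its backing snapshot. The AMI id below is the one from the build above and the snapshot id is a placeholder; use the ids from your own build:


$ aws ec2 deregister-image --image-id ami-04236c7ea2f337c88
$ aws ec2 delete-snapshot --snapshot-id <snapshot-id>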

 

Wrapping up

We used HTTP benchmarking as our example for this article, but most of the steps we’ve gone through can be applied to other use cases.

 

We’ve seen how, by combining Packer and Terraform, it’s possible to create infrastructure-as-code solutions to build and provision custom machines. These tools provide a layer of abstraction over cloud platforms and IaaS APIs, making developers’ work more efficient.

Having the infrastructure defined as code has many advantages; you can:

  • Reuse code.
  • Automate the provisioning of your infrastructure in your CI/CD pipelines.
  • Track changes using common versioning tools, such as git.
  • Destroy and re-create your infrastructure with confidence.
  • Deploy multiple times from the same definitions.

 

On a final note, this article describes just some of Packer’s and Terraform’s features. Check their official documentation to learn more about them and how handy they can be when implementing infrastructure-as-code solutions.