
Previously, we worked through how ‘infrastructure as code’ (IaC) tools convert desired state into actual resources. In general, IaC tools construct a model of the desired state of your resources, merge information about what already exists into that model, and then converge the actual state toward the desired state. Let’s dig into how Terraform builds that model.

First, this bit of Terraform code declares that an AWS EC2 instance should be created:

resource "aws_instance" "app" {
  count         = "1"
  instance_type = "t3.micro"
  ami           = "${data.aws_ami.amazon_ecs_linux.id}"
  user_data     = "${data.template_file.init.rendered}"

  # The name of our SSH keypair we created above.
  key_name                    = "${aws_key_pair.exercise.id}"
  associate_public_ip_address = "true"

  vpc_security_group_ids = [
    "${aws_security_group.public_ssh.id}",
    "${aws_security_group.internal_web.id}",
    "${aws_security_group.outbound.id}",
  ]

  availability_zone = "${var.availability_zones[count.index]}"

  tags = "${merge(
    local.base_tags,
    map("Name", "${local.exercise_app_name}-${count.index}"),
    map("WorkloadType", "Pet")
  )}"
}

In Terraform, a resource models an infrastructure object that is managed or available via an API. Resources include virtual machines, load balancers, key pairs, and many more.

The interpolated expressions such as "${data.aws_ami.amazon_ecs_linux.id}" are the entry point to Terraform’s magic. Terraform’s expression syntax lets you compute values or refer to the attributes of other resources and data sources in a Terraform configuration.

The module this code is excerpted from uses the Terraform Amazon Machine Image data source to query for the most recent ECS AMI:

data "aws_ami" "amazon_ecs_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name = "name"
    values = [
      "amzn-ami-*.i-amazon-ecs-optimized",
    ]
  }
}

That AMI ID is then referenced within the declaration of the aws_instance resource with:

resource "aws_instance" "app" {
  ami = "${data.aws_ami.amazon_ecs_linux.id}"
  # ... snip ...
}

No hardcoding required.

MAGIC!

This aws_instance declaration also references an SSH key pair, security groups (firewalls), and reference data for availability zones and tags. These resources and data are defined elsewhere in the module.
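To make that concrete, here is a rough sketch of what a few of those other definitions might look like. The names exercise, availability_zones, base_tags, and exercise_app_name come from the references above, but every value (the key name, zone names, tag values) and the public_key variable are assumptions made up for illustration; the real module may differ.

```hcl
# Hypothetical input variable holding the public half of the SSH key pair.
variable "public_key" {}

# Referenced above as aws_key_pair.exercise.id
resource "aws_key_pair" "exercise" {
  key_name   = "exercise"             # assumed value
  public_key = "${var.public_key}"
}

# Referenced above as var.availability_zones[count.index]
variable "availability_zones" {
  default = ["us-west-2a", "us-west-2b"] # assumed zones
}

# Referenced above as local.base_tags and local.exercise_app_name
locals {
  exercise_app_name = "exercise-app"  # assumed value

  base_tags = {
    Environment = "exercise"          # assumed tag
  }
}
```

The sketch uses the same "${...}" interpolation style as the rest of this post. Terraform 0.12 and later also accept the bare form, e.g. public_key = var.public_key.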

Identifying and tracking resources

Terraform uses these resource, data source, and reference declarations to build the model of your resources in the form of a directed acyclic graph (DAG).

Each resource or data source managed by Terraform is identified by its type and a name for that particular instance of the type. For resources, these identifiers take the form <RESOURCE TYPE>.<NAME>; data sources add a data. prefix, as in data.<TYPE>.<NAME>. They are an essential part of the declaration, and they become the nodes in the dependency graph.

For example, the EC2 instance resource described above:

resource "aws_instance" "app"

has a resource type of aws_instance and a name of app.

The AMI data source has a type of aws_ami and a name of amazon_ecs_linux.
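Those type-and-name pairs are also how you point at an object from anywhere else in the configuration. As a small sketch, here are two outputs that use them; the output names app_ami_id and app_instance_ids are hypothetical and not part of the original module.

```hcl
# Hypothetical outputs, shown only to illustrate how addresses are used.
output "app_ami_id" {
  # Data sources are addressed as data.<TYPE>.<NAME>.
  value = "${data.aws_ami.amazon_ecs_linux.id}"
}

output "app_instance_ids" {
  # aws_instance.app sets count, so the splat (*) collects the id of
  # every instance; join() flattens the list into one string.
  value = "${join(",", aws_instance.app.*.id)}"
}
```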

When one resource definition uses an attribute of another resource or data source, Terraform establishes and tracks a dependency relationship on your behalf. These relationships become the edges in the dependency graph.

Logically, this looks like:

[Figure: Logical view of the app instance’s dependency graph]

In this case, Terraform tracks that the aws_instance.app resource(s) depends on:

  • an AMI ID provided by an aws_ami data source
  • user_data provided by a template_file data source
  • a key pair defined by an aws_key_pair resource
  • security groups defined by several aws_security_group resources
  • availability zones provided by a variable
  • tags provided by a local computation
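The same mechanism works in the other direction: anything that refers to aws_instance.app picks up a dependency on it. As a sketch, assume we wanted to attach an Elastic IP to the first app instance. The aws_eip.app resource below is hypothetical and not part of the original module.

```hcl
# Hypothetical resource, added only to illustrate an implicit dependency.
resource "aws_eip" "app" {
  # Referring to aws_instance.app here is what creates the graph edge.
  # Terraform will create the instance first, then the Elastic IP,
  # without any ordering instructions from us.
  instance = "${element(aws_instance.app.*.id, 0)}"
}
```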

One of the really neat things about Terraform is that it tracks the value that each of these attribute references resolves to.

When Amazon publishes a new ECS AMI, the value returned from the Amazon EC2 AMI API will change, e.g. from ami-something-old-12ab to ami-something-new-34cd. Terraform sees this change and knows the EC2 instances need to be recreated. Each Terraform resource type knows which changes to its inputs require replacing the resource and which can be applied in place.
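If replacing an instance whenever the AMI changes worries you, Terraform’s lifecycle block lets you influence how the replacement happens. The following is a sketch of one option, not something taken from the original module: it asks Terraform to build the new instance before tearing down the old one.

```hcl
resource "aws_instance" "app" {
  ami = "${data.aws_ami.amazon_ecs_linux.id}"
  # ... snip ...

  lifecycle {
    # When a change forces replacement (such as a new AMI ID), create
    # the replacement instance before destroying the existing one.
    create_before_destroy = true
  }
}
```

Whether this is appropriate depends on the workload; it assumes two copies of the instance can safely exist side by side for a short time.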

You can use the terraform graph command to output the relationships between managed entities. The command emits the ‘dot’ format, which you can render with the Graphviz family of tools (for example, terraform graph | dot -Tsvg > graph.svg). Here is an excerpt of the actual Terraform dependency graph, focused on the aws_instance.app resource:

[Figure: Excerpt of graph output highlighting the app instance]

If you’re interested in understanding more about why and how Terraform tracks these relationships, check out ‘Applying Graph Theory to Infrastructure as Code’ by Paul Hinze, HashiCorp engineer.

#NoDrama