When we develop and deliver software, we expect that software to operate under a certain set of conditions and use a certain amount of compute resources.

When a performance or reliability problem surfaces, it’s often due to a mismatch between reality and our expectations for the resources an application process needs. The application may need more memory, CPU, or I/O than we expected.

Resource usage expectations often start off rough or implicit: “We certainly think the app should use less than 4G of RAM, because that’s what our test machines have.”

You may refine those expectations by load testing with representative or canary workloads and capturing comprehensive monitoring data. “Oh, looks like this JVM with the 3G heap actually uses 4.25G of RAM once we’ve run a bunch of customer requests through it.”
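One way to turn load-test observations into a concrete number is to take the observed peak and add some headroom. The sketch below is only an illustration: the `suggest_memory_limit` function, the 10% headroom, the 64 MiB rounding step, and the sample values are all assumptions, not figures from any particular tool.

```python
import math

GIB = 1024 ** 3

def suggest_memory_limit(samples_bytes, headroom=0.10, round_to=64 * 1024 ** 2):
    """Suggest a memory limit from observed usage samples.

    Takes the observed peak, adds a fractional headroom, and rounds up
    to a multiple of round_to bytes. All defaults are illustrative.
    """
    if not samples_bytes:
        raise ValueError("need at least one usage sample")
    peak = max(samples_bytes)
    padded = peak * (1 + headroom)
    return math.ceil(padded / round_to) * round_to

# Hypothetical samples from a load test: a JVM with a 3G heap peaking at 4.25G.
samples = [int(3.1 * GIB), int(3.9 * GIB), int(4.25 * GIB)]
limit = suggest_memory_limit(samples)
print(f"peak={max(samples) / GIB:.2f}GiB suggested limit={limit / GIB:.2f}GiB")
```

How much headroom is appropriate depends on how representative the test workload is; a less representative test argues for a larger margin.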

Once you have reasonable expectations for the application’s resource requirements, you should limit the application to using just those resources. This will help you contain the impact of runaway resource consumption due to functional bugs, security breaches, and mis-modeled algorithms. These limits can preserve resources for other applications and system operations running on the host. Here’s how this works with Docker.

Docker lets you limit the compute resources a container uses, including memory, CPU, and block (storage) I/O. It does this using cgroups.
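As a rough sketch of what setting these limits looks like, the helper below assembles a `docker run` command line using the `--memory`, `--cpus`, and `--blkio-weight` flags. The `docker_run_command` function, the image name `example/app:latest`, and the chosen values are hypothetical; the script only prints the command rather than running it.

```python
import shlex

def docker_run_command(image, memory=None, cpus=None, blkio_weight=None):
    """Build a `docker run` argument list with optional resource limits."""
    cmd = ["docker", "run", "--rm"]
    if memory is not None:
        cmd += ["--memory", memory]                    # hard memory limit, e.g. "4g"
    if cpus is not None:
        cmd += ["--cpus", str(cpus)]                   # how many CPUs' worth of time is allowed
    if blkio_weight is not None:
        cmd += ["--blkio-weight", str(blkio_weight)]   # relative block I/O weight
    cmd.append(image)
    return cmd

# Hypothetical image name and limits, for illustration only.
cmd = docker_run_command("example/app:latest", memory="4g", cpus=1.5, blkio_weight=500)
print(shlex.join(cmd))
```

Building the argument list separately from executing it makes the limits easy to review or log before a container is started.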

Every Docker container gets its own Linux Control Group (cgroup) by default. Cgroups are a Linux kernel feature that:

  • accounts for CPU, memory, I/O, and other resources used within a container
  • optionally enforces limits on the use of those resources, e.g. denying further memory allocation or throttling CPU usage
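The accounting and enforcement described above are exposed as plain files. A minimal sketch, assuming a host that uses cgroup v2, where `memory.current` holds the bytes in use and `memory.max` holds either a byte limit or the literal `max`: the `read_cgroup_memory` helper is hypothetical, and the `/sys/fs/cgroup` location is an assumption about how the cgroup filesystem appears inside the container.

```python
from pathlib import Path

def read_cgroup_memory(cgroup_dir):
    """Return (current_bytes, limit_bytes) for a cgroup v2 directory.

    limit_bytes is None when memory.max contains "max" (no limit set).
    """
    base = Path(cgroup_dir)
    current = int((base / "memory.current").read_text().strip())
    raw_limit = (base / "memory.max").read_text().strip()
    limit = None if raw_limit == "max" else int(raw_limit)
    return current, limit

if __name__ == "__main__":
    # Assumed location of the container's own cgroup on a cgroup v2 host.
    cgroup_root = Path("/sys/fs/cgroup")
    if (cgroup_root / "memory.current").exists():
        print(read_cgroup_memory(cgroup_root))
    else:
        print("no cgroup v2 memory files found here")
```

On hosts that still use cgroup v1 the file names and layout differ, so this sketch would need adjusting there.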

Most people don’t need to know much more about cgroups than that you can trust them.

Docker does not require containers to specify resource limits by default. In many ways this helps people get started running applications in containers: it matches the way applications run outside of containers, and many people don’t know how much memory or CPU their applications actually consume. However, it also means that a container run without limits may use all the resources available on the host.
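One way to spot containers running without limits is to look at the `HostConfig` section of `docker inspect` output, where `Memory` and `NanoCpus` are reported as 0 when no limit was set. The sketch below works on a trimmed, made-up sample of that JSON; the `missing_limits` helper and the container names are hypothetical.

```python
import json

def missing_limits(inspect_entry):
    """List which limits are unset in one `docker inspect` container entry.

    Docker reports 0 for HostConfig.Memory, HostConfig.NanoCpus, and
    HostConfig.CpuQuota when the corresponding limit was not configured.
    """
    host_config = inspect_entry.get("HostConfig", {})
    missing = []
    if not host_config.get("Memory"):
        missing.append("memory")
    if not host_config.get("NanoCpus") and not host_config.get("CpuQuota"):
        missing.append("cpu")
    return missing

# Trimmed, hypothetical `docker inspect` output for two containers.
sample = json.loads("""
[
  {"Name": "/unlimited", "HostConfig": {"Memory": 0, "NanoCpus": 0, "CpuQuota": 0}},
  {"Name": "/limited", "HostConfig": {"Memory": 4294967296, "NanoCpus": 1500000000, "CpuQuota": 0}}
]
""")
for entry in sample:
    print(entry["Name"], missing_limits(entry) or "all limits set")
```

A check like this could run against real `docker inspect` output as part of an audit, though which limits count as required is a policy decision for each platform.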

This is a recipe for disaster in production on a shared compute platform.

Enforcing resource limits is a critical best practice for running workloads on shared compute platforms such as a container cluster orchestrated by ECS, Kubernetes, or Swarm. There are several reasons for this; the limits:

  1. prevent applications from consuming more than their expected and fair share of resources on the host
  2. provide the container orchestrator critical information needed to schedule the container on a host in the cluster with sufficient compute resources
  3. provide autoscaling controllers critical information needed to add and remove instances of a containerized service based on resource usage
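To make the scheduling point concrete, here is a toy first-fit placement routine that uses each container's declared memory and CPU to pick a host. It is a simplified sketch, not how ECS, Kubernetes, or Swarm actually schedule; the `Host` class, the `place` function, and all capacities are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Host:
    name: str
    memory_mib: int
    cpu_millis: int
    reserved_memory_mib: int = 0
    reserved_cpu_millis: int = 0

    def fits(self, memory_mib, cpu_millis):
        """True if the host has enough unreserved memory and CPU."""
        return (self.reserved_memory_mib + memory_mib <= self.memory_mib
                and self.reserved_cpu_millis + cpu_millis <= self.cpu_millis)

def place(hosts, memory_mib, cpu_millis):
    """First-fit placement: reserve resources on the first host with room."""
    for host in hosts:
        if host.fits(memory_mib, cpu_millis):
            host.reserved_memory_mib += memory_mib
            host.reserved_cpu_millis += cpu_millis
            return host.name
    return None  # no host can satisfy the declared requirements

# Hypothetical two-host cluster; each container declares 4608 MiB and one CPU.
hosts = [Host("host-a", 8192, 2000), Host("host-b", 16384, 4000)]
placements = [place(hosts, 4608, 1000) for _ in range(5)]
print(placements)
```

Without declared limits, a scheduler like this has nothing to subtract from host capacity, which is why unlimited containers make placement decisions unreliable.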

Next, we’ll examine configuring a container memory resource limit and how that plays out on a container cluster.

Update – Learn how to configure: