Amazon Elastic Container Service (ECS) is a container orchestrator built and operated by AWS. AWS operates ECS’ central orchestrator components and manages cluster state for you.

Amazon ECS Ecosystem Diagram

You can deploy containers on EC2 worker hosts you operate yourself, or let AWS take care of that for you too, using Fargate. Fargate is a service that runs containers on demand on a fleet of container hosts that AWS manages for you within your account’s security boundary, and you pay only for the resources your containers use.

ECS may be the simplest container orchestrator to adopt and use in AWS, depending on how much you need to customize on the container worker hosts. Teams that are committed to the AWS platform and need to run containerized applications should seriously consider ECS. ECS is:

  • well-integrated with other AWS services, particularly Security, Networking, and Load Balancing
  • supported by tools from AWS and third parties, e.g. Terraform
  • supported by AWS at the level you’d expect from your support agreement

Teams can create container clusters in a number of ways:

  • AWS Console
  • AWS command-line tools: aws ecs and ecs-cli
  • Infrastructure as Code tools: CloudFormation, Terraform, AWS CDK, and more

When you run the container worker hosts yourself on EC2, you can use the default Amazon ECS-optimized AMI or provide your own. Using your own AMI lets you apply your own security and networking configurations and install 3rd-party logging and monitoring components. Installing 3rd-party logging and monitoring components is a very common choice: many teams find CloudWatch Logs’ usability lacking, and CloudWatch Metrics may be cost-prohibitive for custom application metrics.

ECS has robust and comprehensive modeling of service orchestration concepts such as Services, Tasks, Load Balancers, Identity, and Config+Secrets. The central ECS concept is the Task, a set of containers that together form a deployment unit. An ECS task is similar to a Kubernetes pod and supports sidecar containers. Task instances run on a cluster’s container hosts and are joined into a Target Group when healthy. A Target Group backs an ECS Service and is usually the backend for one or more Load Balancer endpoints.
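To make the Task concept concrete, here is a minimal sketch in Python of a task definition with one application container and one sidecar. The family name, image names, port, and CPU/memory sizes are made up for illustration. The dictionary keys follow the ECS task definition format, so a dictionary like this could in principle be passed to boto3's `register_task_definition`, but treat it as a starting point rather than a complete definition.

```python
import json


def build_task_definition(family, app_image, sidecar_image):
    """Return a dict shaped like an ECS task definition with an app and a sidecar."""
    return {
        "family": family,
        "networkMode": "awsvpc",
        "requiresCompatibilities": ["FARGATE"],
        # Task-level sizes are strings in the task definition format.
        "cpu": "256",
        "memory": "512",
        "containerDefinitions": [
            {
                "name": "app",
                "image": app_image,
                "essential": True,  # the task stops if this container stops
                "portMappings": [{"containerPort": 8080, "protocol": "tcp"}],
            },
            {
                "name": "log-router",  # hypothetical sidecar
                "image": sidecar_image,
                "essential": False,
            },
        ],
    }


# Hypothetical image names, used only for illustration.
task_def = build_task_definition("web", "example/web:1.0", "example/log-router:1.0")
print(json.dumps(task_def, indent=2))
```

Both containers are scheduled together as one unit, which is what makes the task comparable to a pod.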

These concepts are implemented with the canonical AWS services, where applicable, for example:

  • HTTPS/HTTP load balancing with Application Load Balancer
  • Identity and Access control using task and container-instance specific IAM roles
  • task configuration and secrets provided by SSM Parameter Store, with access control enforced by IAM
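As a rough illustration of the last two items, the sketch below builds a container definition that pulls environment variables from SSM Parameter Store, plus an IAM policy document that limits the task execution role to exactly those parameters. The region, account ID, parameter names, and helper function names are hypothetical. The `secrets` / `valueFrom` keys and the `ssm:GetParameters` action are what ECS uses for this feature, but check the current AWS documentation before relying on the details.

```python
def container_with_secrets(name, image, parameter_arns):
    """Build a container definition whose env vars come from SSM parameters.

    parameter_arns maps an environment variable name to an SSM parameter ARN.
    """
    return {
        "name": name,
        "image": image,
        "essential": True,
        "secrets": [
            {"name": env_name, "valueFrom": arn}
            for env_name, arn in sorted(parameter_arns.items())
        ],
    }


def execution_role_policy(parameter_arns):
    """Return an IAM policy document that allows reading only these parameters."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["ssm:GetParameters"],
                "Resource": sorted(parameter_arns.values()),
            }
        ],
    }


# Hypothetical account, region, and parameter names.
params = {
    "DB_PASSWORD": "arn:aws:ssm:us-east-1:123456789012:parameter/web/db_password",
    "API_KEY": "arn:aws:ssm:us-east-1:123456789012:parameter/web/api_key",
}
container = container_with_secrets("app", "example/web:1.0", params)
policy = execution_role_policy(params)
print([s["name"] for s in container["secrets"]])
```

Deriving the policy from the same mapping keeps the container's secret references and the IAM access grant from drifting apart.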

ECS clusters can integrate with your existing network infrastructure and operate alongside each other without lots of prior planning (or rearchitecting).

A huge amount of native AWS functionality is available for use in ECS, but it can be a struggle to understand the order in which to create resources. From the ECS ‘Create Application Load Balancer’ guide:

After your load balancer and target group are created, you can specify the target group in a service definition when you create a service. When each task for your service is started, the container and port combination specified in the service definition is registered with your target group and traffic is routed from the load balancer to that container. For more information, see Creating a Service.
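One way to reason about the ordering in that passage is to write the resources down as a dependency graph and let a topological sort produce a valid creation order. The sketch below uses Python's standard `graphlib` module (Python 3.9+). The dependency map is my own simplified assumption for illustration, not an official AWS list.

```python
from graphlib import TopologicalSorter

# Each key lists the resources that must exist before it can be created.
# This map is an illustrative assumption, not an exhaustive AWS reference.
DEPENDS_ON = {
    "load_balancer": {"vpc", "security_groups"},
    "target_group": {"vpc"},
    "listener": {"load_balancer", "target_group"},
    "task_definition": {"iam_roles"},
    "service": {"cluster", "task_definition", "target_group", "listener"},
}


def creation_order(depends_on):
    """Return one valid creation order in which every dependency comes first."""
    return list(TopologicalSorter(depends_on).static_order())


order = creation_order(DEPENDS_ON)
for step, resource in enumerate(order, start=1):
    print(f"{step}. {resource}")
```

Infrastructure as Code tools perform this kind of ordering for you, which is one reason they ease the frustration of working out the sequence by hand.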

Note that a Target Group is essentially a set of ip:port locations that a load balancer can send traffic to. The IP may belong to a container host or to a network interface that has been attached to the container or container host (whew). Once you understand how AWS has deconstructed networks and provided the means to compose them back together, it’s cool, but the journey can be frustrating.
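The ip:port idea can be sketched as a tiny model. The code below is a deliberate simplification that follows the description above: in `bridge` mode the target is the container host's IP with a host port, and in `awsvpc` mode it is the task's own network interface IP with the container port. The `Target` class, the function name, and the IP addresses are hypothetical, and real target groups of the instance type register instance IDs rather than IPs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    ip: str
    port: int


def target_for_task(network_mode, container_port, host_ip=None, host_port=None, eni_ip=None):
    """Return the ip:port a load balancer would reach for one task (simplified)."""
    if network_mode == "awsvpc":
        # The task has its own network interface, so the container port is used as-is.
        if eni_ip is None:
            raise ValueError("awsvpc mode needs the task's network interface IP")
        return Target(eni_ip, container_port)
    if network_mode == "bridge":
        # The task shares the host's IP; a host port is mapped to the container port.
        if host_ip is None or host_port is None:
            raise ValueError("bridge mode needs the host IP and the mapped host port")
        return Target(host_ip, host_port)
    raise ValueError(f"unsupported network mode: {network_mode}")


# Two tasks sharing one host in bridge mode, plus one task with its own interface.
target_group = {
    target_for_task("bridge", 8080, host_ip="10.0.1.10", host_port=32768),
    target_for_task("bridge", 8080, host_ip="10.0.1.10", host_port=32769),
    target_for_task("awsvpc", 8080, eni_ip="10.0.2.25"),
}
for target in sorted(target_group, key=lambda t: (t.ip, t.port)):
    print(f"{target.ip}:{target.port}")
```

The two bridge-mode tasks share an IP and differ only by port, while the awsvpc task is distinguished by its IP.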

Differences

Here are three key differences between ECS and most other orchestrators.

First, ECS has been designed to work well with your existing AWS infrastructure and other AWS services. Highly available, scalable, performant, cost-effective, and (even) secure container cluster deployments are available without:

  • rearchitecting your entire network layer
  • pushing other solutions out
  • trying to leverage AWS-specific features you’re likely already using through a Cloud abstraction layer

These are some of the costs avoided when adopting a Cloud-specific solution over a ‘portable’ one.

ECS has really strong integration with core AWS services: IAM, VPC, ALB/NLB, Security Groups, Auto Scaling, and SSM Parameter Store. ECS also has strong support from complementary container-specific services such as the highly-available image registry service (ECR), and Fargate if you prefer to let AWS run your container worker fleet in addition to the control plane.

Second, the cluster control plane and cluster state are built and operated by AWS at no charge. Further, AWS has improved the ECS control plane and enabled cluster operators to upgrade with minimal pain. Updating the ECS agent is usually a non-event, and the supported agent versions extend back a year, so there’s plenty of time to do it.

Third, ECS and integrated AWS services are all supported by AWS.

Best For

I think some of the best ways to use ECS are:

  • As the standard container orchestration technology in an organization committed to AWS
  • When you need maximum Security and Efficiency advantage from collaborating AWS security, networking, and auto scaling services
  • When you want teams or departments to run their own container clusters, as opposed to managing those centrally with a highly skilled team; note it can still be helpful to collaborate on a shared ECS cluster and container host configuration within an organization.
  • As a batch job processing platform in combination with AWS Batch
  • As a ‘serverless’ containerized compute platform using Fargate to run workloads ill-suited for Lambda
  • Continue using it if you’re already using it and happy (I’d love to hear if you aren’t happy)

Closing Words

A lot of folks are using ECS and running the business with very little drama. However, conversations with users sometimes start in a whisper:

Team Lead: Yeah, so we’re using ECS… It’s just temporary, you know, but we’re running on AWS and haven’t had time to look deeply into Kubernetes.

Me: How is ECS working for you?

Team Lead: Oh, overall it’s working fine. We’ve got tens of services deployed to production. Though we’re having trouble with X …

Me: Ok, sure. I know what you mean. This is kind of an open problem no matter what orchestrator you use. In AWS, the usual solution involves writing a Lambda to do Y or Z. Have you considered something along those lines?

Team Lead: Yes, that’s the direction we were thinking of going.

Don’t be embarrassed to use tools that meet your delivery throughput, change latency, deployment success, and operational excellence goals. To me, this exchange looks like the effects of the Kubernetes marketing machine making people feel like they are missing out. Yes, people are missing out on some things, but whether those things matter to the business or delivery teams is another question. Also, recognize that you’ll likely need to write small tools and even components that plug into the orchestration system once you start operating any cluster towards high scale or high (~80%) utilization.

The developer UX for ECS has improved a lot over the past two years. The tooling used to feel incomplete to me, seemingly requiring multiple tools for common use cases. However, that’s changed and one of the things that can make using ECS better is to embrace it directly. For example, try configuring ECS-related resources directly using Terraform or the AWS CDK instead of relying on ecs-cli to translate and apply a Docker Compose file.

ECS is a first-class container orchestration platform for teams committed to AWS. If you’re in one of the ‘best for’ categories, give it a serious look.

The next post in this series covers Kubernetes.

Stephen

#NoDrama