When you adopt Continuous Delivery, you’ll probably be deploying and testing your software a lot more than in your old delivery model.

The scale of change

For example, if you currently deploy updates to a test environment weekly and then (hopefully) to a production environment monthly, you’re accustomed to about five deploys per month. Of course, those are only the ‘planned’ deployments. Don’t forget the unplanned deployments to fix that new feature that didn’t quite work, or to push those critical security updates. So maybe it’s really 8 or 10 deploys per month.

With continuous delivery, you should try to deploy each change. If your team merges three changes to trunk each day, and there are an average of 22 working days in a month, a continuous delivery model will trigger 66 or more deploys per month. In fact, when you adopt CD you’ll probably see your batch (change) size go down, and you may end up with 100 changes, and deploys, per month. This is because the team adjusts to not having to wait for feedback from a large-batch deploy at the end of the week.
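
To spell out the arithmetic behind those figures, using only the numbers above:

```python
# Back-of-the-envelope deploy counts, using the figures from the text above.
changes_per_day = 3
working_days_per_month = 22
cd_deploys_per_month = changes_per_day * working_days_per_month  # 66

old_deploys_per_month = 8  # to 10, counting planned plus unplanned deploys
print(cd_deploys_per_month / old_deploys_per_month)  # ~8x, trending toward ~10x as batch size shrinks
```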

So… we’re looking at approximately a 10x increase in deployments!

Increasing from 8-10 deploys per month to 65-100 is a Big Deal. There’s a lot of stuff you can ‘muscle’ through (or ignore) when it happens 10 times a month that will not have a chance of working when you try to do it 100 times per month.

You’ll need to address this dramatic change in deployment process execution from a few angles, the most important being:

  1. efficiency
  2. safety

A common and recommended first step towards a robust automated deployment process is to automate a small experiment that gates a full deployment to the environment. We will use this experiment to control the risk of deployment and take a (huge) step towards automating the full deployment.

The ‘Deploy’ step of the generalized software control loop is updated to include this gating experiment.

This experimental process tests a single instance of each release candidate before deploying to the full environment. Let’s explore why and how to do that now.

Efficiency

The first and most obvious problem is literally needing to perform about 10x as many deployments as before. You can no longer afford to have any manual steps in the deployment process.

Start by automating the deployment process for the ‘core’ of your application, prioritized by (desired) frequency of update. Let’s break this down with an example.

Suppose you have a web application with load balancer, application, and database tiers. It’s likely that 90% or more of your changes happen in the application tier. If the application is packaged into a ‘good’ release artifact, that artifact will contain those changes and should drop onto your VM/container/function deployment platform easily. If you don’t have good release artifacts, you probably need to fix that first. Feel free to reach out for advice on this topic.

The core of the application’s deployment automation should:

  1. combine the release candidate’s BUILD_ID with its deployment descriptor to plan the deployment of a dedicated instance of the service using the release candidate; often this can be accomplished by providing an updated virtual machine or Docker image identifier to a templated descriptor.
  2. launch a new instance (VM, container, function) of the service using automation tooling appropriate for your platform
  3. trigger a sanity test of the release candidate

That’s it.
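
To make that concrete, here is a minimal sketch in Python. Everything platform-specific is an assumption: the ‘platform launch’ CLI, the descriptor template path, and the convention that the sanity suite reads its target URL from a SERVICE_URL environment variable. Substitute whatever your VM/container/function platform actually provides.

```python
# deploy_candidate.py -- a minimal sketch of the 'core' deployment automation.
# The 'platform' CLI, the template path, and the SERVICE_URL convention are
# all hypothetical; adapt them to your own tooling.
import json
import os
import string
import subprocess
import sys


def render_descriptor(build_id: str, template_path: str) -> str:
    """Combine the release candidate's BUILD_ID with a templated descriptor."""
    with open(template_path) as f:
        template = string.Template(f.read())
    # e.g. the template contains a line like: image: myapp:${BUILD_ID}
    return template.substitute(BUILD_ID=build_id)


def launch_instance(descriptor: str) -> str:
    """Launch one dedicated instance of the service and return its address."""
    result = subprocess.run(
        ["platform", "launch", "--descriptor", "-", "--output", "json"],
        input=descriptor, capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)["address"]


def run_sanity_tests(address: str) -> bool:
    """Trigger the sanity test suite against the newly launched instance."""
    env = {**os.environ, "SERVICE_URL": f"http://{address}"}
    return subprocess.run(["pytest", "sanity_tests/"], env=env).returncode == 0


if __name__ == "__main__":
    build_id = sys.argv[1]  # e.g. the CI server's build identifier
    descriptor = render_descriptor(build_id, "deploy/service.yaml.tmpl")
    address = launch_instance(descriptor)
    sys.exit(0 if run_sanity_tests(address) else 1)
```

Because the entry point is just a script that takes a BUILD_ID, the same command works from a laptop or from a shell step in your automation server.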

You may want to manage the load balancer (relatively easy) or database (relatively hard), and that’s great. You’ll get to that. When starting out, keep the scope of your initial delivery automation focused on the part that delivers the bulk of your changes to customers. Depending on your platform, you may need to pull integration with the load balancer into scope in this first pass. That’s ok; just keep the scope to a minimum. The app tier tends to be the stateless, low-risk part, too.

This automation should be executable from team members’ local workstations (at least in dev) and from whatever automation tool will be delivering these changes (Jenkins, XL Release, Spinnaker, CodeDeploy, CodeShip, etc.).

Local execution of deployment processes is surprisingly important in achieving lift-off with delivery automation. First, it shortens edit/test cycles for deployment automation and helps you develop deployment code in isolation, like you do with the rest of your code. Second, sometimes you’ll need to debug things, and it’s easier to do that locally, especially in the beginning. Third, you’ll want to start practicing sharing the deployment automation, as code, with your deployment tooling.

The really neat part of this automation is that it can immediately start to offload the people doing manual deployments. Enabling people to run your automated deployment tooling via the automation server, or even locally, is a great way to gather feedback on that tooling and build trust with the experts of that process. From there, it’s usually a short step from deploying a single instance to deploying N instances for the environment and integrating them into the application’s load balancer. We’ll discuss a few options on this topic in a subsequent post.

Now that you have a path to deliver changes fast, focus on safety.

Safety

Just like now, some of the changes coming down the pipeline aren’t going to be good. What’s different is that now we’re (finally) going to do something about it.

Start by separating the actions of deployment from release. We want to reduce the risk of a ‘bad’ release candidate. If you adopted the process recommended in the Efficiency section, you’ve already taken the initial steps by using automation to deploy a dedicated instance for each release candidate.

Now we’ll enhance the deployment process to:

  1. deploy a release candidate to a new (possibly ephemeral) instance
  2. perform a quick ‘sanity’ test
  3. record the outcome of the sanity test
  4. if successful, trigger release of the build to the rest of the environment (see the sketch below)
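
Here is a minimal sketch of that control flow, reusing the hypothetical helpers from the earlier deployment script; record_outcome and release_to_environment are stand-ins for whatever your pipeline tool and platform actually provide.

```python
# gate_release.py -- sketch of the deploy / sanity / record / release flow.
# The deploy_candidate helpers and the record/release functions are hypothetical.
import sys
from datetime import datetime, timezone

from deploy_candidate import render_descriptor, launch_instance, run_sanity_tests


def record_outcome(build_id: str, passed: bool) -> None:
    """Record the sanity result somewhere durable (a log, a database, the CI tool)."""
    stamp = datetime.now(timezone.utc).isoformat()
    with open("deploy-outcomes.log", "a") as log:
        log.write(f"{stamp} {build_id} sanity={'PASS' if passed else 'FAIL'}\n")


def release_to_environment(build_id: str) -> None:
    """Promote the build to the rest of the environment (see the promotion sketch below)."""
    print(f"releasing {build_id} to the full environment")


if __name__ == "__main__":
    build_id = sys.argv[1]
    descriptor = render_descriptor(build_id, "deploy/service.yaml.tmpl")
    address = launch_instance(descriptor)   # 1. deploy a dedicated instance
    passed = run_sanity_tests(address)      # 2. quick sanity test
    record_outcome(build_id, passed)        # 3. record the outcome
    if passed:
        release_to_environment(build_id)    # 4. release to the rest of the environment
    else:
        sys.exit(1)                         # stop the pipeline; nothing gets released
```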

The promotion process for a web application would entail replacing old instances of the service with new instances behind a load balancer. If you want to place the instance used for sanity testing into service, that’s fine too. If you observe the behavior of that instance and use that information to gate further deployment through the environment, you can call that instance a ‘canary’.
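
For the promotion step itself, a minimal sketch might look like the following. The LoadBalancerClient and its register/deregister/addresses_not_running calls are hypothetical stand-ins for whatever your platform offers (an ALB target group, a Kubernetes Service, and so on).

```python
# promote.py -- sketch of promoting a release candidate behind a load balancer.
# LoadBalancerClient and its methods are hypothetical; map them to your LB API.
from deploy_candidate import render_descriptor, launch_instance
from lb_client import LoadBalancerClient  # hypothetical wrapper over your LB API

DESIRED_INSTANCES = 3


def promote(build_id: str, canary_address: str) -> None:
    lb = LoadBalancerClient(pool="webapp")

    # Optionally place the sanity-tested instance into service first, as a canary.
    lb.register(canary_address)

    # Launch the remaining instances for the environment (the canary counts as one).
    descriptor = render_descriptor(build_id, "deploy/service.yaml.tmpl")
    for _ in range(DESIRED_INSTANCES - 1):
        lb.register(launch_instance(descriptor))

    # Finally, drain and deregister the instances still running the old build.
    for old_address in lb.addresses_not_running(build_id):
        lb.deregister(old_address)
```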

The ‘sanity test’ suite is very important because it provides the signal to the automated release control function. In the CD world, that signal cannot come from a person, and it needs to catch basic problems. Let’s discuss how to build one from the ground up.

The simplest sanity test is to start up the new software and see if it stays running. Next, check the application’s health-check endpoint. Then execute a small suite of automated functional tests via the application’s UI or API. Five to fifteen tests that verify the availability and basic functioning of the application’s core features should be enough for a sanity test suite. We’ll dig into a more complete test suite organization in a subsequent post. For now, ensure your sanity test suite covers enough to keep people from saying things like, “the test environment is broken.”
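
As an illustration, a bare-bones sanity suite might look like this. The /health path, the /api/search endpoint, and the SERVICE_URL convention are assumptions about a hypothetical application.

```python
# sanity_tests/test_sanity.py -- a bare-bones sanity suite (run with pytest).
# The endpoints and the SERVICE_URL convention are hypothetical examples.
import os

import requests

BASE_URL = os.environ["SERVICE_URL"]  # set by the deployment automation


def test_healthcheck_passes():
    # The new instance is up and its health check answers.
    resp = requests.get(f"{BASE_URL}/health", timeout=5)
    assert resp.status_code == 200


def test_home_page_renders():
    # Core UI is reachable and returns a real page.
    resp = requests.get(BASE_URL, timeout=5)
    assert resp.status_code == 200
    assert "<title>" in resp.text


def test_core_api_answers():
    # One representative call against the application's core API.
    resp = requests.get(f"{BASE_URL}/api/search", params={"q": "smoke"}, timeout=5)
    assert resp.status_code == 200
```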

The sanity test isn’t meant to be exhaustive and it won’t catch everything. However, it should help you build a release control mechanism that keeps changes flowing into test and follow-on environments with not only a reasonable level of safety, but one you can tune to your own needs.

Benefits

Has your mindset shifted?

This deployment automation and release control will be well-exercised in low-risk environments. Since the automated deployment process has been used for every change, it’s been tested about 10x more than in the previous model. This should raise your confidence that production deployments will happen correctly and quickly.

Additionally, once you decouple the ‘deploy’ and ‘release’ actions, you can test a new version in production before releasing it to customers. This gives you an extremely powerful tool for managing the risk of a new release.

Once you’ve automated the core of your deployment process and decoupled ‘deploy’ from ‘release’, the majority of concern about adopting CD should melt away. If you have concerns remaining, I’d love to hear them.

Go Fast, Safely my friends. When you have questions, hit me up.

#NoDrama