This is the first piece in a series on the practice of continuous integration. Continuous integration is an essential enabler of high-performance software delivery processes because it creates the release candidates you will deliver.
The primary goal of a continuous integration process is to provide rapid feedback on changes to the development team. When a change is introduced into the codebase, the team needs to know whether it is probably good and reasonable to deploy for further testing. If not, the change needs to be fixed before another change is introduced. The process of integrating changes from the team onto the main release branch, producing a build, and qualifying (and fixing) it is called continuous integration. The build and qualifying steps must be performed with reliable automation and should take 10 minutes or less to maintain the rapid feedback loop.
In 2019, continuous integration for application code and libraries looks like this:
- merge changes to the main source branch from a short-lived task branch
- compile or transpile code to releasable form
- run unit tests
- package into a releasable artifact
- publish artifact to an artifact repository
If this looks like what you’re doing now (or maybe have been doing for 10 years), you can probably skip the rest of this post. If not, read on.
Trunk-based development and task branches
The most important practice in Continuous Integration is to integrate changes into the application’s main source code branch frequently. This main branch may be named master, but whatever the name, this is the place where the development team collects its efforts to produce new features and fixes. The main branch is the source of releases, and working this way is often referred to as trunk-based development.
Integrating changes into the main branch frequently accomplishes several positive things. Let’s discuss two:
- value becomes available in small increments, with lower risk
- developers reduce code integration effort
Making value available in small increments
When developers merge improvements to a codebase incrementally, the value provided by that incremental change becomes available for use because that change will be built into a release candidate. Let’s put this in the context of a large change such as adding support for a new use case to an application. The developer (or developers) might decompose this effort into a few tasks that are expected to take at most a few hours to a couple of days each:
- add additional unit tests in the area they’re about to change
- refactor internal concepts to match where the product is going
- implement the first half of the use case
- implement the second half of the use case
The application or library should work as each of these tasks is completed. When using Continuous Integration, the developer(s) can and should merge this work back to the main branch as each task completes.
Adding unit tests and refactoring internal concepts provides value to the rest of the development team and improves the codebase. When the change author merges those changes, they will be able to get feedback from the rest of the team and the build process on whether those changes are good.
Merging a working, first half of the complete use case can provide value, too. Suppose you’re adding an ‘Account Details’ function to an application. The first half of the use case could be providing a read-only view of the data. This likely provides value to customers and is independently releasable. Why not make it available for release now?
What if implementing the second half of the use case is tricky or doesn’t go well?
If the first half has been merged and tested independently, you still have the option to ship it. This is way better than sifting through a morass of uncommitted changes on a developer’s workstation trying to figure out what you can salvage. Yay for risk management!
On the topic of “not going well,” a common issue developers encounter with long-lived branches is integrating those changes back into the main code base.
Reduce code integration effort
Developers create branches so they can work in isolation for some time on an enhancement or bug fix. However, there is tension here because that isolation enables drift as team members introduce changes into the main code base. Converging that drift requires developers to spend time merging others’ changes to or from their isolated branches, and the longer a branch is active, the more changes will need to be integrated. This isn’t necessarily ‘just’ a problem with managing source code. When core concepts and implementations of an application or library change, those changes may be very challenging to integrate into other branches because refactoring tools will not be able to assist the developer.
To balance this tension, developers should avoid creating long-lived branches. In an active code base, long-lived branches will inevitably diverge significantly from the main source code branch. Create a branch per task instead.
In the previous example of adding a function to view and edit account details, you could have a branch for each task. If the whole effort should only take a few days, you might use a branch for the entire use case and then merge back to main as each task completes. This is a “branch per task” branching model.
Since the task branch is only active for a few hours or days, its divergence from the main source branch is minimized. This simplifies the review and integration process back to the main source branch.
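One way to keep an eye on branch lifetime is to flag branches that have outlived the "few hours or days" window. The following is only a minimal sketch: the three-day threshold, the branch names, and the find_stale_branches helper are all hypothetical, and the creation times are supplied by hand rather than read from a real repository.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold; pick whatever "a few hours or days" means for your team.
MAX_BRANCH_AGE = timedelta(days=3)


def find_stale_branches(branch_created_at, now, max_age=MAX_BRANCH_AGE):
    """Return the names of branches older than max_age, oldest first."""
    stale = [
        (now - created, name)
        for name, created in branch_created_at.items()
        if now - created > max_age
    ]
    # Sorting by age in descending order puts the oldest branch first.
    return [name for _, name in sorted(stale, reverse=True)]


# Illustrative data only; a real check would read this from version control.
now = datetime(2019, 6, 10, 12, 0, tzinfo=timezone.utc)
branches = {
    "task/account-details-read-only": now - timedelta(hours=6),
    "task/refactor-account-model": now - timedelta(days=2),
    "feature/big-rewrite": now - timedelta(days=21),
}
stale = find_stale_branches(branches, now)
print(stale)
```

In this made-up data set only the three-week-old branch is reported. A report like this is a prompt for a conversation about splitting the work into tasks, not a rule to enforce blindly.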
Now, let’s produce a releasable artifact!
Producing a release artifact
A project’s Continuous Integration process should be the canonical process producing release artifacts used by customers.
After a change is merged to the main source tree, an automated continuous integration process should:
- check out a copy of the source code to a clean workspace
- compile or transpile the code to releasable form
- run unit tests
- package release binaries into a releasable artifact
- publish the artifact to an artifact repository
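The steps above can be sketched as an ordered list of commands that stops at the first failure, so a broken build never reaches packaging or publishing. This is an illustrative sketch rather than a real CI configuration: the run_pipeline helper is hypothetical, the commands are placeholders that stand in for real build tools, and the time budget is assumed from the 10-minute guideline mentioned earlier.

```python
import subprocess
import sys
import time

# Assumed budget, taken from the "10 minutes or less" guideline.
BUDGET_SECONDS = 10 * 60


def run_pipeline(steps, budget_seconds=BUDGET_SECONDS):
    """Run (name, command) steps in order and stop at the first failure.

    Returns (succeeded, results), where results holds one
    (name, exit_code, elapsed_seconds) tuple per step that ran.
    """
    results = []
    started = time.monotonic()
    for name, command in steps:
        step_started = time.monotonic()
        completed = subprocess.run(command)
        elapsed = time.monotonic() - step_started
        results.append((name, completed.returncode, elapsed))
        if completed.returncode != 0:
            # Fail fast: later steps must not run on a broken build.
            return False, results
    total = time.monotonic() - started
    if total > budget_seconds:
        print(f"warning: pipeline took {total:.0f}s, over budget", file=sys.stderr)
    return True, results


# Placeholder commands: each one stands in for a real build tool.
py = sys.executable
demo_steps = [
    ("checkout", [py, "-c", "print('checkout')"]),
    ("compile", [py, "-c", "print('compile')"]),
    ("unit-tests", [py, "-c", "import sys; sys.exit(1)"]),  # simulated test failure
    ("package", [py, "-c", "print('package')"]),
    ("publish", [py, "-c", "print('publish')"]),
]
ok, results = run_pipeline(demo_steps)
print(ok, [name for name, _, _ in results])
```

Because the simulated unit-test step exits with a non-zero code, the package and publish steps never run. That ordering is the point: only qualified builds become release candidates.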
Note: Docker containers make great portable, hygienic build environments
The CI process artifact should include the (compiled) code packaged in an archive along with application-specific tools such as scripts required to run the application. The CI process might also produce environment specific configuration artifacts for use in downstream deployment steps. The archive might be a zip, tgz, or Docker image.
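As a rough illustration of the packaging step, the sketch below bundles a build directory into a tgz archive using Python's standard tarfile module. The directory layout, the file names, the myapp-1.0.0.tgz artifact name, and the package_artifact and list_artifact helpers are all hypothetical, and the file contents are placeholders for real compiled output.

```python
import tarfile
import tempfile
from pathlib import Path


def package_artifact(build_dir, output_path):
    """Pack every file under build_dir into a gzip-compressed tar archive."""
    build_dir = Path(build_dir)
    with tarfile.open(output_path, "w:gz") as archive:
        for path in sorted(build_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the build directory so the
                # archive unpacks cleanly anywhere.
                archive.add(path, arcname=path.relative_to(build_dir).as_posix())
    return Path(output_path)


def list_artifact(artifact_path):
    """Return the sorted member names of a tgz archive."""
    with tarfile.open(artifact_path, "r:gz") as archive:
        return sorted(archive.getnames())


# Demo in a throwaway workspace with placeholder build output.
with tempfile.TemporaryDirectory() as workspace:
    build = Path(workspace) / "build"
    (build / "bin").mkdir(parents=True)
    (build / "scripts").mkdir()
    (build / "bin" / "app").write_text("compiled output placeholder\n")
    (build / "scripts" / "run.sh").write_text("#!/bin/sh\nexec ./bin/app\n")
    artifact = package_artifact(build, Path(workspace) / "myapp-1.0.0.tgz")
    contents = list_artifact(artifact)
    print(contents)
```

The archive is written outside the build directory so that it does not end up packaging itself.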
Some organizations now build a Docker image for an application and treat that as the shippable application artifact. The Docker image format has been designed for this purpose and is straightforward to transport and run.
Once the release artifacts have been packaged, they should be published to an artifact repository such as npm, an RPM repository, or a Docker registry.
The continuous integration process may look a bit different depending on the technologies you are using, but always keep sight of the goal: providing rapid feedback to the development team and building the actual artifacts that will be released and used by customers.