continuous integration

about

CI describes an automated process that builds a project whenever the codebase changes.

CI refers to the 'build & test' cycle. developers submit a “pull request” when their change is complete. once the pull request is approved, the changes are merged into master, and CI servers rebuild the project from scratch every time a merge happens.


patterns & anti-patterns

build software at every change

build the project every time a change is merged into master.
goal: immediately identify problems, so faulty code is obvious

pattern

run a software build with every change applied to the repository

anti-pattern

scheduled builds, nightly builds, periodic builds, building exclusively on developer machines, not building at all.

version control

maintain a core functioning codebase with developers creating branches off master which are merged back when changes are committed.

private workspace

developers should be working on their own machines, using a branched copy of the code

repository

all code must be in a repository. CI will not work with a project that lives only on a file system

master

developers should be able to merge into master but not commit to it directly. no code should ever be added to master directly.

branching policy

agree on a naming convention for branches and merge into master through pull requests. branches should be pushed to the repository so that they can be shared

task-level commit

a developer moves code into the VCS once enough progress has been made, by adding the changed code to a commit with a relevant comment

pattern

organise source code changes by task-oriented units of work and submit changes as a task-level commit

anti-pattern

keeping changes local to development for several days and stacking up changes before committing
goal: avoid build failures or complex troubleshooting

tag build

when master has reached a significant milestone, give it a name. typically the name will incorporate the release version.

pattern

tag the build with a unique name so you can refer to it and run the same build at another time

anti-pattern

not tagging builds, using revisions or branches as tags

goal:

build management

automated build

there should be hooks in the version control system or polling in the CI system to force a new build when master changes. it is important that builds happen after every merged code change so a breaking commit can be identified immediately. to avoid slowing development, developers should not be expected to kick off CI builds manually.
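the polling approach can be sketched roughly as follows. this is only an illustration, not a real CI server: `poll_once`, `get_head`, and `trigger_build` are hypothetical names, and the callables stand in for whatever VCS query and build trigger your CI system actually provides.

```python
from typing import Callable, List, Optional


def poll_once(
    get_head: Callable[[], str],
    last_built: Optional[str],
    trigger_build: Callable[[str], None],
) -> str:
    """Trigger a build if master's head differs from the last built revision.

    Returns the revision that is now considered built.
    """
    head = get_head()
    if head != last_built:
        trigger_build(head)
    return head


# stand-in data: three polls, during which master moves from a1 to b2
revisions = iter(["a1", "a1", "b2"])
built: List[str] = []
last: Optional[str] = None
for _ in range(3):
    last = poll_once(lambda: next(revisions), last, built.append)

print(built)  # ['a1', 'b2']
```

the second poll triggers nothing because master has not moved, so a build runs once per change rather than once per poll.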

pattern

  • automate all activities to build software from source without manual configuration
  • create build scripts that will be executed by a CI system so that software is built at every change

anti-pattern

continually repeating the same processes with manual builds or partially automated builds requiring numerous manual configuration activities

build practices

pre-merge build

the automated build discussed above occurs on the CI server using shared resources. version control systems can be configured to perform a fast, stripped-down build locally first to pre-check the code
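one way to picture such a local pre-check is a small script that a VCS hook could call. the sketch below is an assumption about how that might look: `pre_merge_check` and `FAST_STEPS` are hypothetical names, and the two stand-in steps only print a message where a real project would run its fast compile and unit-test commands.

```python
import subprocess
import sys
from typing import List, Sequence


def pre_merge_check(steps: Sequence[List[str]]) -> int:
    """Run each step in order; return the first non-zero exit code, else 0."""
    for step in steps:
        result = subprocess.run(step)
        if result.returncode != 0:
            return result.returncode  # stop at the first failing step
    return 0


# stand-in steps; replace with the project's own fast build commands
FAST_STEPS = [
    [sys.executable, "-c", "print('compile ok')"],
    [sys.executable, "-c", "print('unit tests ok')"],
]

exit_code = pre_merge_check(FAST_STEPS)
```

a hook script would typically finish with `sys.exit(exit_code)`, so that a non-zero result blocks the push or merge.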

pattern

verify that your changes will not break the integration build by performing a pre-merge build, either locally or using continuous integration

anti-pattern

checking in changes to a version-control repository without running a build on a developer's workstation

continuous feedback

the results of a build are of special importance to the developer submitting the revised code. it is important that they are aware of the results as soon as they are available. positive and negative results are both important and developers should be trained to watch for all feedback before moving on to other tasks. the method(s) of providing feedback will vary depending on your organization’s infrastructure. methods to consider include:

  • Email
  • Hipchat
  • Slack
  • SMS
  • Web Push Notifications
  • Campfire
  • Any infrastructure tool with extensions that your organization uses

pattern

send automated feedback from the CI server to development team members involved in the build

anti-pattern

sending minimal feedback that provides no insight into the build failure or is non-actionable. sending too much feedback, including to team members uninvolved with the build. this is eventually treated like spam, which causes people to ignore messages
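the pattern and anti-pattern above can be sketched as a function that builds one actionable message and addresses it only to the people involved in the build. `BuildResult` and `build_feedback` are hypothetical names, and the delivery channel (email, Slack, and so on) is deliberately left out.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class BuildResult:
    revision: str
    passed: bool
    committers: List[str]   # people whose changes are in this build
    failed_step: str = ""   # e.g. "unit tests"; empty when the build passed
    log_excerpt: str = ""   # the few lines that explain the failure


def build_feedback(result: BuildResult) -> Dict[str, str]:
    """Return {recipient: message} for the people involved in the build only."""
    if result.passed:
        message = f"build {result.revision} passed"
    else:
        message = (
            f"build {result.revision} failed at '{result.failed_step}': "
            f"{result.log_excerpt}"
        )
    return {person: message for person in result.committers}


result = BuildResult(
    revision="b2",
    passed=False,
    committers=["dana"],
    failed_step="unit tests",
    log_excerpt="test_login: expected 200, got 500",
)
notes = build_feedback(result)
```

passing builds produce a message too, since positive results matter as much as negative ones.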

expeditious fixes

mistakes will happen and occasionally code will make it into master that breaks the build. the responsible pattern is to be ready to handle and resolve the problem immediately. the worst possible way to handle a failure in a build is to ignore it, or expect it to be resolved in a future build. you should consider taking all the following steps:

  • fix broken builds immediately - although it is the team's responsibility, the developer who recently committed code must be involved in fixing the failed build. it is possible the problem was a result of a lack of knowledge, so it is a good idea to have a seasoned developer available to assist if needed.
  • always pull master and build - developers should pull the latest code into their branch from master before pushing committed code. after pulling master into their own branch, they should run unit tests and build locally to ensure nothing pulled from master breaks their code. this also allows the developer a chance to fix conflicts that result from the merge before the merge gets to master.
  • don’t pull broken code - if master is broken, notify the team. developers should avoid pulling master into their own branch while it is broken. development time could be wasted by other developers struggling with bad code that will be changed shortly.

pattern

fix build errors as soon as they occur

anti-pattern

allowing problems to stack up (build entropy) or waiting for them to be fixed in future builds

developer documentation

the build process is an excellent opportunity to generate documentation for your source code. developers tend to dislike writing documentation manually, and keeping documentation up to date manually can be time-consuming. the preferred approach is to incorporate documentation into your code, and then have the build process generate documentation from the code. this keeps documentation up to date and does not create more work for the development team.
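as a small illustration of documentation generated from code, the sketch below renders a plain-text reference from Python docstrings using the standard `inspect` module. `render_docs` is a hypothetical helper; real projects would more likely call a dedicated documentation generator from the build script.

```python
import inspect
import textwrap
from typing import Callable, Iterable


def render_docs(functions: Iterable[Callable]) -> str:
    """Build a plain-text reference from the docstrings of the given functions."""
    lines = []
    for func in functions:
        doc = inspect.getdoc(func) or "(undocumented)"
        lines.append(f"{func.__name__}{inspect.signature(func)}")
        lines.append(textwrap.indent(doc, "    "))
    return "\n".join(lines)


def add(a, b):
    """Return the sum of a and b."""
    return a + b


def undocumented(x):
    return x


reference = render_docs([add, undocumented])
print(reference)
```

because the text comes from the checked-in source, it cannot drift from the code, and missing docstrings show up in the output.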

pattern

generate developer documentation with builds based on checked-in source code

anti-pattern

manually generating developer documentation. this is both a burdensome process and one in which the information becomes useless quickly because it does not reflect the checked-in source code

build configuration

independent build

builds should happen the same way on all machines that run them. a build on a developer machine should run the same procedure as the CI server. therefore, train developers to not use the IDE build process. instead, the IDE can be configured to run the required build scripts so that building can still happen from the IDE. every project should include its own build scripts so it can be built from anywhere it is being worked on

pattern

create build scripts that are decoupled from IDEs, but can be invoked by an IDE. these build scripts will be executed by a CI system as well so that software is built at every change

anti-pattern

relying on IDE settings for the automated build, so that the build cannot run from the command line

single command

running a project build should be as simple as possible. it is best to have a simple CLI command that can run everything required for a build in the correct order. this ensures that both developers and servers run the exact same steps in the exact same order. a single command-invoked build script can also be kept up to date with the current state of the project

build, compile, and testing phases can be time consuming for a developer. to support the development team, provide flags on the CLI command that limit the build process to fit their needs. for example, a developer might be updating a class and only need to compile the code, because they are not yet at a point where they need to test and build the whole project. the script could take a flag, such as --compileonly, to perform only the compilation step. this way there are no separate commands for developers to learn: everything goes through the same single CLI command
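a minimal sketch of such a command, using Python's standard `argparse` module. the `--compileonly` flag comes from the text above; the program name `build`, the function `plan_build`, and the phase names are assumptions made for illustration.

```python
import argparse
from typing import List, Optional


def plan_build(argv: Optional[List[str]] = None) -> List[str]:
    """Return the ordered list of phases the single build command would run."""
    parser = argparse.ArgumentParser(prog="build")
    parser.add_argument(
        "--compileonly",
        action="store_true",
        help="compile the code and skip the test and package phases",
    )
    args = parser.parse_args(argv)
    if args.compileonly:
        return ["compile"]
    return ["compile", "test", "package"]


full = plan_build([])                    # what the CI server would run
quick = plan_build(["--compileonly"])    # what a developer might run mid-task
```

the order of the phases lives in one place, so the developer's shortened build is always a prefix of what the server runs.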

pattern

all build processes can be run through a single command

anti-pattern

requiring people or servers to enter multiple commands and procedures in the deployment process, such as copying files, modifying configuration files, restarting a server, setting passwords, and other repetitive, error-prone actions

dedicated resources

CI builds of master should be performed on servers (real or virtual) that are only tasked with building the project. these dedicated machines should have sufficient resources to build the project smoothly and swiftly to limit developer downtime. performing builds on a dedicated machine ensures a clean environment that doesn’t introduce unexpected variables. clean builds give a certain degree of reassurance that the project will build successfully when being deployed to other environments

pattern

run master builds on a separate dedicated machine or cloud service

anti-pattern

relying on existing environmental and configuration assumptions (can lead to the "but it works on my machine" problem)

externalize and tokenize configuration

configuration information that is specific to a machine or deployment environment should be a variable in any build and configuration scripts. these values should come from the build process so they can be build or environment specific. files that use this information should use tokens so that the build process can replace them with actual values
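token replacement can be sketched with Python's standard `string.Template`, which uses `${NAME}` placeholders. the token names, host name, and `render_config` helper below are hypothetical; the point is only that the file holds tokens and the build supplies the values.

```python
from string import Template
from typing import Mapping


def render_config(template_text: str, values: Mapping[str, str]) -> str:
    """Replace ${TOKEN} placeholders; a token with no value raises KeyError."""
    return Template(template_text).substitute(values)


# the checked-in file contains tokens, not environment-specific values
template = "db.host=${DB_HOST}\ndb.port=${DB_PORT}\n"

# the build process supplies the values for the target environment
staging = render_config(
    template, {"DB_HOST": "staging-db.internal", "DB_PORT": "5432"}
)
```

`substitute` fails loudly on a missing value, which suits a build: an unfilled token stops the build instead of reaching a deployed configuration file.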

pattern

  • externalize all variable values from the application configuration into build-time properties
  • use tokens so the build process knows where to add variable values

anti-pattern

hardcoding values in configuration files or using GUI tools to do the same

database

scripting database changes

all changes made to a database during development should be recorded via database scripts. the CI process can then run scripts as the project is built. it is an anti-pattern to expect any manual manipulation of a database during the build process. a database for the project should be able to be migrated to new changes regardless of timing or platform
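a rough sketch of how a build could run such scripts, using SQLite from the Python standard library so the example is self-contained. the `schema_version` table, the `migrate` function, and the sample scripts are assumptions for illustration, not a prescribed migration tool.

```python
import sqlite3
from typing import List, Tuple


def migrate(conn: sqlite3.Connection, scripts: List[Tuple[int, str]]) -> List[int]:
    """Apply scripts not yet recorded in schema_version, in version order.

    Returns the versions applied by this call.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER PRIMARY KEY)"
    )
    done = {row[0] for row in conn.execute("SELECT version FROM schema_version")}
    applied = []
    for version, sql in sorted(scripts):
        if version in done:
            continue  # this database already has the change
        conn.executescript(sql)
        conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
        applied.append(version)
    conn.commit()
    return applied


scripts = [
    (2, "ALTER TABLE users ADD COLUMN email TEXT;"),
    (1, "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);"),
]
conn = sqlite3.connect(":memory:")
first = migrate(conn, scripts)    # applies versions 1 and 2
second = migrate(conn, scripts)   # nothing left to apply
```

because each database records what it has already received, the same scripts can be run against any copy regardless of how far behind it is.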

pattern

all changes made to a database during development should be recorded into database scripts that can be run on every database on every platform hosting the project (including developer machines)

anti-pattern

expecting database administrators to manually compare databases between platforms for changes, or to create one-off scripts that are only good for updating a single platform

database sandbox

every instance of the project should have its own version of the database with a relevant set of data. this should include development machines, build machines, testing machines, testing servers, and QA machines. no individual or server working with the project should have to worry about the integrity of their data, or the integrity of some other entity’s data, while coding, building, and testing

this is true of schema, but not necessarily data. the data for each environment should be a subset of the whole, and scrubbed of sensitive information

the CI process should include a way to import the data correctly into the database. any data import or manipulation should be scripted so it can be performed via command line. developers, testers, and build machines shouldn’t have to know the intricacies of the data. the build process in particular will need a command line command to call during the build
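a scripted sandbox might look something like the sketch below, again using SQLite for self-containment. the `customers` table, the seed rows, and `create_sandbox` are hypothetical; the seed data is invented test data, standing in for a scrubbed subset.

```python
import sqlite3

SCHEMA = "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT);"

# a small, scrubbed data set: just enough records to test functionality
SEED_ROWS = [
    (1, "test customer 1", "customer1@example.com"),
    (2, "test customer 2", "customer2@example.com"),
]


def create_sandbox(path: str = ":memory:") -> sqlite3.Connection:
    """Create a private database with the schema and the seed data loaded."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", SEED_ROWS)
    conn.commit()
    return conn


dev = create_sandbox()   # a developer's sandbox
ci = create_sandbox()    # the build server's sandbox
dev.execute("DELETE FROM customers")  # destructive work in one sandbox...
remaining = ci.execute("SELECT COUNT(*) FROM customers").fetchone()[0]  # ...leaves the other intact
```

wrapping `create_sandbox` in a command-line entry point would give the build process the single command it needs to call.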

pattern

  • create a lightweight version of your database (only enough records to test functionality)
  • use a command line interface to populate the local database sandboxes for each developer, tester, and build server
  • use this data in development environments to expedite test execution

anti-pattern

sharing a development database

update scripts stored in version control

all scripts to perform database and data operations should be stored in version control. scripts should be named and/or annotated to refer to their appropriate version number to simplify automation. keeping old versions is useful for running several scripts in sequence when a database is more than one version behind, and for maintaining a history of changes
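one possible naming scheme, shown purely as an assumption, is a numeric version prefix such as `2_add_email.sql`. with that convention, automation can order the scripts numerically rather than alphabetically, as in this sketch (`order_scripts` is a hypothetical helper):

```python
import re
from typing import List


def order_scripts(filenames: List[str]) -> List[str]:
    """Sort scripts by numeric version prefix, so '10_x.sql' follows '2_y.sql'."""

    def version(name: str) -> int:
        match = re.match(r"(\d+)_", name)
        if match is None:
            raise ValueError(f"no version prefix in {name!r}")
        return int(match.group(1))

    return sorted(filenames, key=version)


ordered = order_scripts(
    ["10_add_index.sql", "2_add_email.sql", "1_create_users.sql"]
)
print(ordered)  # ['1_create_users.sql', '2_add_email.sql', '10_add_index.sql']
```

a plain alphabetical sort would place `10_add_index.sql` before `2_add_email.sql`, which is why the version is parsed as a number.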

pattern

store the scripts for updating the database and its data in version control with the code and annotate appropriately

anti-pattern

storing update scripts in an alternative location, e.g. a shared file server

testing and code quality

automated tests

tests should run with every build. the build scripts described in the sections above should include running tests against all code in the project. tests should be as comprehensive as possible and can include unit tests, end-to-end tests, smoke tests, or UI tests.

testing should be done on all new code. tests for new code should be a requirement of a successful build. all code being merged into the project should be required to meet an appropriate code coverage level. most modern test runners have some form of coverage reporter that can be tied into the build process. any code submitted without sufficient test coverage should fail the build

pattern

write automated tests for each code path, both success testing and failure testing

anti-pattern

  • not running tests
  • no regression testing
  • manual testing

build quality threshold

the build is also an appropriate time to check for code quality and coverage percentages. the project team should have a minimum coverage percentage that the project is not allowed to dip below. if new code is merged in without sufficient tests, that percentage will be lower than expected, and that should trigger a failure
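the threshold check itself is simple arithmetic. the sketch below assumes the coverage reporter can supply covered and total line counts; `coverage_gate` and the 80 percent minimum are hypothetical choices for illustration.

```python
def coverage_gate(covered_lines: int, total_lines: int, minimum_percent: float) -> int:
    """Return 0 if coverage meets the minimum, 1 otherwise (usable as an exit code)."""
    if total_lines == 0:
        return 0  # nothing to cover, so nothing to fail
    percent = 100.0 * covered_lines / total_lines
    return 0 if percent >= minimum_percent else 1


before = coverage_gate(850, 1000, 80.0)   # 85.0 percent, passes
after = coverage_gate(850, 1100, 80.0)    # 100 untested lines added, about 77.3 percent, fails
```

the second call shows the case described above: new code merged without tests lowers the percentage and the build fails.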

code quality is important to the long-term maintainability of a project, and the build step is a great place to verify code quality. the project team should have standards and best practices for the code so most of it looks and works the same way. the build step should verify that new code meets those standards

pattern

  • notify team members of code aberrations such as low code coverage or the use of coding anti-patterns
  • fail a build when a project rule is violated
  • use continuous feedback mechanisms to notify team members

anti-pattern

  • deep dive reviews of every code change
  • manually calculating or guesstimating code coverage

automated smoke test

smoke tests are a subset of tests used to confirm the functionality of the most important elements of a project. they function as a gatekeeper to confirm that building, full testing, or QA can continue. a suite of well-designed smoke tests can save QA personnel time and effort by checking the most likely candidates for failure first
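the gatekeeper role can be sketched as follows. `run_smoke_tests`, `pipeline`, and the two named checks are hypothetical; real checks would exercise the project's own critical paths.

```python
from typing import Callable, Dict, List


def run_smoke_tests(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run every check; return the names of those that failed or raised."""
    failures = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False  # an exception counts as a failed check
        if not ok:
            failures.append(name)
    return failures


def pipeline(checks: Dict[str, Callable[[], bool]], full_suite: Callable[[], None]) -> str:
    """Run the full suite only when all smoke tests pass."""
    failures = run_smoke_tests(checks)
    if failures:
        return "stopped: smoke tests failed: " + ", ".join(failures)
    full_suite()
    return "full suite ran"


ran: List[str] = []
checks = {"app starts": lambda: True, "login works": lambda: False}
outcome = pipeline(checks, lambda: ran.append("full"))
print(outcome)  # stopped: smoke tests failed: login works
```

the expensive full suite never starts when a smoke test fails, which is where the saved time comes from.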

pattern

create smoke tests that can be used by CI servers, developers, QA, and testing as a pre-check to confirm the most important functionality as they work, or before committing resources to a full build

anti-pattern

  • manually running functional tests
  • forcing QA to run the full suite before every session
  • manually checking deployment sensitive sections of the project
