CI describes an automated process that builds a project whenever the codebase changes.
CI refers to the 'build & test' cycle: developers submit a “pull request” when their change is complete and, on approval of the pull request, the changes are merged into master and the CI server rebuilds the project from scratch every time a merge happens.
build the project every time a change is merged into master.
goal: immediately identify problems, so faulty code is obvious
run a software build with every change applied to the repository
scheduled builds, nightly builds, periodic builds, building exclusively on developer machines, not building at all.
maintain a core functioning codebase, with developers creating branches off master which are merged back when their changes are complete.
developers should be working on their own machines, using a branched copy of the code
all code must be in a repository; CI will not work with a project hosted only on a file system
developers should be able to merge into master but not commit to it directly. no code should ever be added to master directly.
agree a naming convention for branches and merge into master through pull requests. branches should be pushed to the repository so that they can be shared
a developer moves code into the VCS when enough progress has been made, by adding the changed code to a commit with a relevant commit message
organise source code changes by task-oriented units of work and submit changes as a task-level commit
keeping changes local to development for several days and stacking up changes before committing
goal: avoid build failures or complex troubleshooting
when master has reached a significant milestone, give it a name; typically this will incorporate the release version.
tag the build with a unique name so you can refer to and run the same build at another time
not tagging builds, using revisions or branches as tags
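a minimal sketch of the tagging pattern above, assuming the git CLI is available; the tag name format (release-<version>) and the version shown are illustrative assumptions, not a prescribed convention:

```python
import subprocess

def tag_release(version: str, message: str) -> None:
    """Create and push an annotated git tag for a milestone build.

    The tag name format (release-<version>) is an assumption; use whatever
    naming convention your team has agreed on.
    """
    tag_name = f"release-{version}"
    subprocess.run(["git", "tag", "-a", tag_name, "-m", message], check=True)
    subprocess.run(["git", "push", "origin", tag_name], check=True)

if __name__ == "__main__":
    tag_release("1.4.0", "Milestone build for the 1.4.0 release")
```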
goal: ensure every change merged into master triggers a build without manual intervention
there should be hooks in the version control system, or polling by the CI system, to trigger a new build when master changes. it is important that builds happen after every merged code change so a breaking commit can be identified immediately. developers should not be expected to kick off CI builds manually, as this would impact development progress.
continually repeating the same processes with manual builds or partially automated builds requiring numerous manual configuration activities
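a minimal sketch of the polling approach described above, assuming the CI server can reach the repository with the git CLI; the repository URL, branch name, polling interval, and build command are all illustrative assumptions (a VCS hook that calls the same build command is the push-based equivalent):

```python
import subprocess
import time

REPO_URL = "https://example.com/project.git"   # assumed repository location
BRANCH = "master"
POLL_INTERVAL_SECONDS = 60
BUILD_COMMAND = ["python", "build.py"]         # assumed single-command build script

def latest_commit() -> str:
    """Return the commit hash currently at the tip of master on the remote."""
    output = subprocess.run(
        ["git", "ls-remote", REPO_URL, BRANCH],
        check=True, capture_output=True, text=True,
    ).stdout
    return output.split()[0]

def poll_and_build() -> None:
    """Trigger a build whenever the tip of master changes."""
    last_seen = None
    while True:
        current = latest_commit()
        if current != last_seen:
            print(f"master moved to {current}; starting build")
            subprocess.run(BUILD_COMMAND, check=False)
            last_seen = current
        time.sleep(POLL_INTERVAL_SECONDS)

if __name__ == "__main__":
    poll_and_build()
```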
the automated build discussed above occurs on the CI server using shared resources. the version control system can also be configured (for example, with client-side hooks) to perform a fast, stripped-down build locally first to pre-check the code
verify that your changes will not break the integration build by performing a pre-merge build—either locally or using Continuous Integration
checking in changes to a version-control repository without running a build on a developer's workstation
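a sketch of a fast, stripped-down pre-merge check that a developer can run manually, or wire into a client-side git hook such as pre-push, before opening a pull request; the compile and fast-test commands are assumptions about the project layout:

```python
import subprocess
import sys

# commands for a fast, stripped-down pre-check; adjust to your project
PRECHECK_STEPS = [
    ["python", "-m", "compileall", "-q", "src"],     # syntax/compile check only
    ["python", "-m", "pytest", "-q", "tests/unit"],  # fast unit tests, no slow suites
]

def main() -> int:
    for step in PRECHECK_STEPS:
        result = subprocess.run(step)
        if result.returncode != 0:
            print(f"pre-merge check failed: {' '.join(step)}")
            return result.returncode
    print("pre-merge checks passed; safe to push")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```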
the results of a build are of special importance to the developer who submitted the revised code, and they should be made aware of the results as soon as they are available. positive and negative results are both important, and developers should be trained to watch for all feedback before moving on to other tasks. the method(s) of providing feedback will vary depending on your organization’s infrastructure. methods to consider include:
send automated feedback from the CI server to development team members involved in the build
sending minimal feedback that provides no insight into the build failure or is non-actionable. sending too much feedback, including to team members uninvolved with the build. this is eventually treated like spam, which causes people to ignore messages
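a sketch of targeted, actionable feedback using Python's standard smtplib; the mail host, addresses, and the shape of the build result are assumptions, and the same idea applies equally to a chat webhook:

```python
import smtplib
from email.message import EmailMessage

SMTP_HOST = "smtp.example.com"   # assumed mail relay
FROM_ADDR = "ci@example.com"

def notify_build_result(committer_email: str, commit: str, passed: bool, log_url: str) -> None:
    """Send a short, actionable build result to the developer who triggered the build."""
    msg = EmailMessage()
    status = "PASSED" if passed else "FAILED"
    msg["Subject"] = f"CI build {status} for {commit[:8]}"
    msg["From"] = FROM_ADDR
    msg["To"] = committer_email
    msg.set_content(f"build {status} for commit {commit}\nfull log: {log_url}")
    with smtplib.SMTP(SMTP_HOST) as smtp:
        smtp.send_message(msg)
```

notifying only the team members involved in the change, with a link to the full log, keeps the feedback actionable without turning it into spam.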
mistakes will happen and occasionally code will make it into master that breaks the build. the responsible pattern is to be ready to handle and resolve the problem immediately. the worst possible way to handle a failure in a build is to ignore it, or expect it to be resolved in a future build. you should consider taking all the following steps:
fix build errors as soon as they occur
allowing problems to stack up (build entropy) or waiting for them to be fixed in future builds
the build process is an excellent opportunity to generate documentation for your source code. developers tend to dislike writing documentation manually, and keeping documentation up to date manually can be time-consuming. the preferred approach is to incorporate documentation into your code and then have the build process generate documentation from the code. this keeps documentation up to date and does not create more work for the development team.
generate developer documentation with builds based on checked-in source code
manually generating developer documentation. this is both a burdensome process and one in which the information becomes useless quickly because it does not reflect the checked-in source code
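a sketch of generating documentation as a build step, assuming docstrings in the code and a Sphinx project under docs/; the tool choice and directory layout are assumptions, and Javadoc, Doxygen, pdoc, etc. slot in the same way for other stacks:

```python
import subprocess
import sys

def build_docs() -> int:
    """Generate HTML documentation from the checked-in source as part of the build.

    Assumes a Sphinx project under docs/; the output lands in docs/_build/html.
    """
    result = subprocess.run(
        ["sphinx-build", "-b", "html", "docs", "docs/_build/html"]
    )
    return result.returncode

if __name__ == "__main__":
    sys.exit(build_docs())
```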
builds should happen the same way on all machines that run them. a build on a developer machine should run the same procedure as the CI server. therefore, train developers to not use the IDE build process. instead, the IDE can be configured to run the required build scripts so that building can still happen from the IDE. every project should include its own build scripts so it can be built from anywhere it is being worked on
create build scripts that are decoupled from IDEs, but can be invoked by an IDE. these build scripts will be executed by a CI system as well so that software is built at every change
relying on IDE settings for the automated build, so the build cannot be run from the command line
running a project build should be as simple as possible. it is best to have a single CLI command that runs everything required for a build in the correct order. this ensures that both developers and servers run the exact same build steps in the exact same order. a single command-invoked build script is also easier to keep up to date with the current state of the project
build, compile, and testing phases can be time-consuming for a developer. to support the development team, provide flags on the CLI command that limit the build process to fit their needs. for example, a developer might be updating a class and only need to compile the code; they are not at a point where they need to test and build the whole project. the script could take a flag, such as --compileonly, to perform only the compilation step. this way there are no individual commands for developers to learn; everything goes through the same single CLI command
all build processes can be run through a single command
requiring people or servers to enter multiple commands and procedures in the deployment process, such as copying files, modifying configuration files, restarting a server, setting passwords, and other repetitive, error-prone actions
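a sketch of the single-command build script described above, which also illustrates the IDE-decoupled scripts from the previous entry; the individual step commands and the packaging tool are assumptions about the project:

```python
import argparse
import subprocess
import sys

# each step is a plain command so the same script works for developers,
# IDEs (configured to invoke it), and the CI server
STEPS = {
    "compile": ["python", "-m", "compileall", "-q", "src"],
    "test":    ["python", "-m", "pytest", "tests"],
    "package": ["python", "-m", "build"],   # assumed packaging step; replace as needed
}

def run(step_name: str) -> None:
    print(f"== {step_name} ==")
    subprocess.run(STEPS[step_name], check=True)

def main() -> int:
    parser = argparse.ArgumentParser(description="single entry point for all builds")
    parser.add_argument("--compileonly", action="store_true",
                        help="only compile; skip tests and packaging")
    args = parser.parse_args()
    try:
        run("compile")
        if not args.compileonly:
            run("test")
            run("package")
    except subprocess.CalledProcessError as error:
        return error.returncode
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

developers run `python build.py --compileonly` while iterating, and both developers and the CI server run `python build.py` with no flags for the full build.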
CI builds of master should be performed on servers (real or virtual) that are only tasked with building the project. these dedicated machines should have sufficient resources to build the project smoothly and swiftly to limit developer downtime. performing builds on a dedicated machine ensures a clean environment that doesn’t introduce unexpected variables. clean builds give a certain degree of reassurance that the project will build successfully when being deployed to other environments
run master builds on a separate dedicated machine or cloud service
relying on existing environmental and configuration assumptions (can lead to the "but it works on my machine problem")
configuration information that is specific to a machine or deployment environment should be treated as a variable in build and configuration scripts. these values should come from the build process so they can be build- or environment-specific. files that use this information should contain tokens that the build process replaces with the actual values
hardcoding values in configuration files or using GUI tools to do the same
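a sketch of the token-replacement step described above, assuming tokens of the form @DB_HOST@ in a template file and values supplied to the build per environment; the token syntax, file paths, and values are illustrative assumptions:

```python
import re
from pathlib import Path

# environment-specific values supplied by the build process (illustrative)
VALUES = {
    "DB_HOST": "db.staging.example.com",
    "DB_PORT": "5432",
    "LOG_LEVEL": "INFO",
}

def render_config(template_path: str, output_path: str, values: dict) -> None:
    """Replace @TOKEN@ placeholders in a template with real values at build time.

    A missing value raises KeyError, which fails the build loudly instead of
    shipping a half-rendered configuration file.
    """
    text = Path(template_path).read_text()
    rendered = re.sub(r"@([A-Z_]+)@", lambda match: values[match.group(1)], text)
    Path(output_path).write_text(rendered)

if __name__ == "__main__":
    render_config("config/app.properties.template", "build/app.properties", VALUES)
```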
all changes made to a database during development should be recorded via database scripts, and the CI process can then run those scripts as the project is built. it is an anti-pattern to expect any manual manipulation of a database during the build process. a database for the project should be able to be migrated to the latest changes regardless of timing or platform
all changes made to a database during development should be recorded into database scripts that can be run on every database on every platform hosting the project (including developer machines)
expecting database administrators to manually compare databases between platforms for changes, or to create one-off scripts that are only good for updating a single platform
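a sketch of running versioned database scripts during the build, assuming SQLite via the standard library and scripts kept in version control with names like 001_create_users.sql; the naming convention, directory, and version table are illustrative assumptions:

```python
import sqlite3
from pathlib import Path

MIGRATIONS_DIR = Path("db/migrations")   # versioned scripts, e.g. 001_create_users.sql

def applied_versions(conn: sqlite3.Connection) -> set:
    """Track which script versions have already run against this database."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version TEXT PRIMARY KEY)")
    return {row[0] for row in conn.execute("SELECT version FROM schema_version")}

def migrate(database_path: str) -> None:
    """Apply any database scripts that have not yet run, in version order."""
    conn = sqlite3.connect(database_path)
    done = applied_versions(conn)
    for script in sorted(MIGRATIONS_DIR.glob("*.sql")):
        version = script.name.split("_")[0]
        if version in done:
            continue
        conn.executescript(script.read_text())
        conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
        conn.commit()
        print(f"applied {script.name}")

if __name__ == "__main__":
    migrate("build/project.db")
```

because the runner only looks at scripts in version control and records what has been applied, the same command can bring any database, on any platform, up to the latest changes.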
every instance of the project should have its own version of the database with a relevant set of data. this should include development machines, build machines, testing machines, testing servers, and QA machines. no individual or server working with the project should have to worry about the integrity of their data, or the integrity of some other entity’s data, while coding, building, and testing
this is true of the schema, but not necessarily the data. the data for each environment should be a subset of the whole, scrubbed of sensitive information
the CI process should include a way to import the data correctly into the database. any data import or manipulation should be scripted so it can be performed via command line. developers, testers, and build machines shouldn’t have to know the intricacies of the data. the build process in particular will need a command line command to call during the build
sharing a single development database across developers and environments
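a sketch of a command-line data loader that the build process (or any developer) can call to give each environment its own scrubbed data subset; the table name, columns, and file layout are illustrative assumptions:

```python
import csv
import sqlite3
import sys

def load_seed_data(database_path: str, csv_path: str) -> None:
    """Load a scrubbed, environment-appropriate data subset into a local database."""
    conn = sqlite3.connect(database_path)
    with open(csv_path, newline="") as handle:
        rows = [(r["id"], r["name"], r["email"]) for r in csv.DictReader(handle)]
    conn.executemany(
        "INSERT OR REPLACE INTO customers (id, name, email) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    print(f"loaded {len(rows)} rows into {database_path}")

if __name__ == "__main__":
    # e.g. python load_data.py build/project.db data/dev_customers_scrubbed.csv
    load_seed_data(sys.argv[1], sys.argv[2])
```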
all scripts to perform database and data operations should be stored in version control. scripts should be named and/or annotated with their appropriate version number to simplify automation. keeping old versions is useful for running multiple scripts in sequence when necessary, and for maintaining a history of changes
store the scripts for updating the database and its data in version control with the code and annotate appropriately
storing update scripts in an alternative location, e.g. a shared file server
tests should run with every build. The build scripts described in the sections above should include running tests against all code in the project. Tests should be as comprehensive as possible and can include unit tests, end-to-end tests, smoke tests, or UI tests.
testing should be done on all new code. tests for new code should be a requirement of a successful build. all code being merged into the project should be required to meet an appropriate code coverage level. all modern test running suites have some form of coverage reporter that can be tied into the build process. any code submitted without sufficient test coverage should fail the build
write automated tests for each code path, both success testing and failure testing
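a minimal sketch of covering both the success path and the failure path of a piece of code, using pytest conventions; the function under test is hypothetical:

```python
import pytest

def parse_port(value: str) -> int:
    """Hypothetical code under test: parse a TCP port number from configuration."""
    port = int(value)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port

def test_parse_port_success():
    # success path: a valid value is converted correctly
    assert parse_port("8080") == 8080

def test_parse_port_failure():
    # failure path: invalid input must raise, not pass through silently
    with pytest.raises(ValueError):
        parse_port("99999")
```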
the build is also an appropriate time to check for code quality and coverage percentages. the project team should have a minimum coverage percentage that the project is not allowed to dip below. if new code is merged in without sufficient tests, that percentage will be lower than expected, and that should trigger a failure
code quality is important to the long-term maintainability of a project, and the build step is a great place to verify code quality. the project team should have standards and best practices for the code so most of it looks and works the same way. the build step should verify that new code meets those standards
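a sketch of enforcing the coverage threshold and code standards described in the two paragraphs above, assuming the coverage.py, pytest, and flake8 tools are available; the minimum percentage, paths, and linter choice are team decisions, not requirements of the pattern:

```python
import subprocess
import sys

MINIMUM_COVERAGE = 80   # team-agreed threshold; illustrative

def run_quality_gate() -> int:
    """Fail the build if tests fail, coverage drops below the agreed minimum,
    or the code does not meet the team's standards."""
    steps = [
        ["coverage", "run", "-m", "pytest", "tests"],
        ["coverage", "report", f"--fail-under={MINIMUM_COVERAGE}"],
        ["flake8", "src", "tests"],   # style/standards check
    ]
    for step in steps:
        result = subprocess.run(step)
        if result.returncode != 0:
            return result.returncode
    return 0

if __name__ == "__main__":
    sys.exit(run_quality_gate())
```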
smoke tests are a subset of tests used to confirm the functionality of the most important elements of a project. they function as a gatekeeper to confirm that building, full testing, or QA can continue. a suite of well-designed smoke tests can save QA personnel time and effort by checking the most likely candidates for failure first
create smoke tests that can be used by CI servers, developers, QA, and testing as a pre-check to confirm the most important functionality as they work, or before committing resources to a full build
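a sketch of a small smoke test suite in the pytest style; the modules and checks named here are illustrative assumptions standing in for a project's most important functionality:

```python
# smoke_test.py: fast checks of the most important functionality, run before
# committing resources to a full build or a QA pass.
import importlib
import sqlite3

CRITICAL_MODULES = ["json", "sqlite3", "argparse"]   # stand-ins for your app's core modules

def test_critical_modules_import():
    # the application cannot do anything useful if its core modules fail to load
    for name in CRITICAL_MODULES:
        importlib.import_module(name)

def test_database_connection():
    # a likely failure point: can we open a database and run a trivial query?
    conn = sqlite3.connect(":memory:")
    assert conn.execute("SELECT 1").fetchone() == (1,)
```

running this suite first (for example, `python -m pytest smoke_test.py -q`) lets CI servers, developers, and QA confirm the essentials before spending time on a full build or test pass.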