GSL IT

Change Management

A Customer asked us the other day for advice about Change Control, so we thought a post would be useful. We're talking mostly about Change from an engineering standpoint, although much of this applies to other domains as well. Change is necessary to keep systems up to date with the latest component upgrades (crucially important where security is involved) and to add new features and functionality. Change introduces risk, however - a Change may break the system it is applied to, or its implementation may break a different system.

TL;DR: Ensure each Change is clearly defined, with a rollback plan. Implement version control for all system components. Keep designer and checker independent. Consider a Board to de-clash and approve Changes. Functional tests verify that a Change does what was wanted; Regression tests ensure nothing else was broken. Monitor after deployment.

There are several strategies to reduce the risk of Change, and we'll discuss some of these here...

1. Specification

The definition of a Change needs to set out clearly the intended purpose of the Change, so that everyone involved understands what needs to be achieved, and also any prerequisites (sometimes a Change will require that another is deployed first, for example). The Change steps need to be clearly defined, and for large Changes there may be checkpoints between stages where the decision to continue or revert is considered. It is vitally important for any type of Change that the way in which it can be rolled back (or undone) is defined - sadly, errors are made in Change design (or deployment), and having a planned method for putting everything back as it was is essential.
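
As a rough illustration of the points above, a Change specification can be held as a structured record and checked for gaps before anyone reviews it. This is only a sketch: the class name, field names and checks are our own assumptions rather than any standard.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeSpec:
    """Hypothetical record for one Change; all field names are illustrative."""
    purpose: str
    steps: list[str]
    rollback_steps: list[str]
    # References of Changes that must be deployed first.
    prerequisites: list[str] = field(default_factory=list)
    # Indexes of steps after which a continue/revert decision is taken.
    checkpoints: list[int] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Return the reasons why this specification is incomplete."""
        found = []
        if not self.purpose.strip():
            found.append("purpose is empty")
        if not self.steps:
            found.append("no implementation steps defined")
        if not self.rollback_steps:
            found.append("no rollback plan defined")
        for cp in self.checkpoints:
            if not 0 <= cp < len(self.steps):
                found.append(f"checkpoint {cp} does not match a step")
        return found


spec = ChangeSpec(
    purpose="Upgrade the web server TLS library",
    steps=["Drain traffic", "Install new package", "Restart service"],
    rollback_steps=[],
    checkpoints=[1],
)
print(spec.problems())  # ['no rollback plan defined']
```

A specification that returns an empty list is not necessarily a good one, but one that returns anything at all is not ready for review.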

Version Control

This could be the subject of a whole page, but in summary: being able to establish the exact configuration of a system at any moment has great value in troubleshooting. To this end, components within systems need a version identifier which is easily examined (perhaps by including it in log output or a dedicated database table field), and a Change specification should include component version numbers before and after each Change implementation. There are many tools to help with this process.
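
To make this concrete, here is a minimal sketch of a component that stamps its version on every log line, together with a check of deployed versions against those a Change specification expects. The component name, version numbers and the `version_mismatches` helper are all hypothetical.

```python
import logging

# Hypothetical component name and version; in practice these would come from
# the build or packaging process rather than being hard-coded.
COMPONENT = "billing-service"
VERSION = "2.4.1"

logging.basicConfig(
    level=logging.INFO,
    format=f"%(asctime)s {COMPONENT}/{VERSION} %(levelname)s %(message)s",
)
log = logging.getLogger(COMPONENT)
log.info("service started")  # every log line now carries the version


def version_mismatches(expected: dict, actual: dict) -> dict:
    """Return components whose deployed version differs from the specification.

    Each value is an (expected, actual) pair; actual is None if not deployed.
    """
    return {
        name: (want, actual.get(name))
        for name, want in expected.items()
        if actual.get(name) != want
    }


expected_after = {"billing-service": "2.4.1", "db-schema": "17"}
deployed = {"billing-service": "2.4.1", "db-schema": "16"}
print(version_mismatches(expected_after, deployed))  # {'db-schema': ('17', '16')}
```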

2. Designer

Changes must be designed by competent people to reduce the likelihood of errors (in some industries, such people are assessed and licensed to be able to design certain types of Change).

3. Checker

It is useful to have someone independent check the design of a Change - a fresh pair of eyes will often notice an error (as with designers, some industries assess and license checkers).

4. Approver

Changes should only be implemented when formally approved. Different industries have different ways of approaching this - in the IT world, companies often convene a Change Board, comprising stakeholders from related teams, who review and approve Changes (and, crucially, look for clashes between planned Changes). Sometimes these Boards will have different classifications of Change dependent upon the scale of the Change (perhaps including Major, Minor, Break Fix, Emergency) and sometimes the different classifications will command different review periods.
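
Part of the clash check can be automated. The sketch below assumes that each planned Change lists the systems it touches and its implementation window; it flags any pair which overlap in time and share a system. The `PlannedChange` fields and the reference numbers are invented for illustration.

```python
from datetime import datetime
from itertools import combinations
from typing import NamedTuple


class PlannedChange(NamedTuple):
    """Hypothetical summary of a Change as presented to the Board."""
    ref: str
    systems: frozenset
    start: datetime
    end: datetime


def find_clashes(changes):
    """Return (ref, ref, shared systems) for Changes that overlap in time and share a system."""
    clashes = []
    for a, b in combinations(changes, 2):
        overlap_in_time = a.start < b.end and b.start < a.end
        shared = a.systems & b.systems
        if overlap_in_time and shared:
            clashes.append((a.ref, b.ref, sorted(shared)))
    return clashes


planned = [
    PlannedChange("CHG-1", frozenset({"db", "web"}), datetime(2024, 5, 1, 22), datetime(2024, 5, 2, 1)),
    PlannedChange("CHG-2", frozenset({"db"}), datetime(2024, 5, 2, 0), datetime(2024, 5, 2, 2)),
    PlannedChange("CHG-3", frozenset({"mail"}), datetime(2024, 5, 1, 23), datetime(2024, 5, 2, 0)),
]
print(find_clashes(planned))  # [('CHG-1', 'CHG-2', ['db'])]
```

A check like this only catches clashes that are visible in the data; the Board's knowledge of less obvious dependencies between systems is still needed.
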
Graphic illustrating that data changes can break systems too. We see a simple address change introducing errors such as a comma being added breaking a csv import used by another team, a longer address not fitting in a database field and the address having four words breaking some parsing done by another team.
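
The three failures in the graphic can be reproduced in a few lines. This is a sketch under assumed limits: the 20-character field width and the three-word parser are hypothetical, as are the addresses.

```python
MAX_ADDRESS_LENGTH = 20  # hypothetical width of the database column
EXPECTED_WORDS = 3       # hypothetical assumption made by another team's parser


def risks_of(address: str) -> list[str]:
    """List the ways a new address value could break downstream consumers."""
    risks = []
    unquoted_line = f"C001,{address}"  # a naive CSV export with no quoting
    if len(unquoted_line.split(",")) != 2:
        risks.append("comma splits the record into extra CSV fields")
    if len(address) > MAX_ADDRESS_LENGTH:
        risks.append("address is longer than the database field")
    if len(address.split()) != EXPECTED_WORDS:
        risks.append("word count differs from what the parser expects")
    return risks


print(risks_of("12 High Street"))         # []
print(risks_of("12, Upper High Street"))  # all three risks are reported
```

None of these failures involves a code change, which is why data changes deserve the same scrutiny as any other Change.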

5. Testing - types

Before deploying a Change to any system, it is vitally important to test thoroughly to make sure that (a) the Change does what it is intended to and (b) it hasn't broken something else. (a) is often called Functional Testing and is simply a case of exercising the parts of the system which were altered by the Change to make sure that each part of the designed Change works in the way the designer of the Change wanted. (b) is often called Regression Testing and is more complicated... in order to make sure that no other part of the system has been affected by a change it is necessary to have a collection of tests (known as a Suite) which exercise the functionality of the whole system. It is prudent to run the Regression Tests before starting any Change, to make sure that any issues with the system are identified before anything is modified - this also has the benefit of verifying that the Regression Tests work correctly(!).
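
The distinction can be sketched with Python's built-in unittest module. Everything here is hypothetical: `price_with_tax` stands in for existing behaviour, `price_with_discount` for what the Change adds, and the `run` helper is our own.

```python
import unittest


def price_with_tax(net: float, rate: float = 0.20) -> float:
    """Existing behaviour that the Change must not break."""
    return round(net * (1 + rate), 2)


def price_with_discount(net: float, percent: float) -> float:
    """New behaviour introduced by the (hypothetical) Change."""
    return round(net * (1 - percent / 100), 2)


class RegressionTests(unittest.TestCase):
    """Exercise existing functionality; run before and after the Change."""

    def test_default_tax_rate(self):
        self.assertEqual(price_with_tax(100.0), 120.0)

    def test_zero_rate(self):
        self.assertEqual(price_with_tax(100.0, rate=0.0), 100.0)


class FunctionalTests(unittest.TestCase):
    """Exercise only what the Change was designed to add."""

    def test_ten_percent_discount(self):
        self.assertEqual(price_with_discount(200.0, 10), 180.0)


def run(case) -> bool:
    """Run one test class and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    return unittest.TextTestRunner(verbosity=0).run(suite).wasSuccessful()


baseline_ok = run(RegressionTests)    # before the Change: is the system healthy?
functional_ok = run(FunctionalTests)  # after the Change: does it do what was wanted?
regression_ok = run(RegressionTests)  # after the Change: is everything else intact?
print(baseline_ok, functional_ok, regression_ok)
```

In a real system the regression suite would be far larger than the functional tests for any single Change, since it has to cover the whole system.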

6. Testing - environment

In order to be able to test a Change (which could have errors in it) it is important to have somewhere to test it that does not affect production systems. In the IT world, such a place is often known as a Non-Production Environment and to be truly effective it needs to match the production system as closely as possible - the same hardware, operating system and application software (including patches). NB in reality it is impossible to make a non-production environment completely identical to the production system, so it pays to know where the two differ.
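
One simple way to keep track of the gap is to compare an inventory of each environment. The inventories below are invented for illustration, and how they would be gathered is outside the scope of this sketch.

```python
def environment_differences(production: dict, non_production: dict) -> dict:
    """Return every key whose value differs, as (production, non_production) pairs."""
    keys = sorted(set(production) | set(non_production))
    return {
        key: (production.get(key), non_production.get(key))
        for key in keys
        if production.get(key) != non_production.get(key)
    }


# Hypothetical inventories; real ones might be gathered by a configuration tool.
production = {"os": "Ubuntu 22.04", "python": "3.11.9", "openssl": "3.0.13", "cpu_cores": 16}
staging = {"os": "Ubuntu 22.04", "python": "3.11.9", "openssl": "3.0.11", "cpu_cores": 4}

print(environment_differences(production, staging))
# {'cpu_cores': (16, 4), 'openssl': ('3.0.13', '3.0.11')}
```

Some differences, such as fewer CPU cores, may be acceptable for most Changes. Others, such as a different patch level, are worth resolving before test results are trusted.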

7. Monitoring

Because functional and regression tests can themselves contain errors or miss problems, it is useful to monitor the system which has been modified, and also others which could conceivably have been affected, for a period subsequent to Change deployment.
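
As a minimal sketch of what such monitoring might compare, the code below checks a system's error rate before and after deployment and flags a rise beyond a tolerance. The 50% tolerance, the function names and the figures are assumptions for illustration only.

```python
def error_rate(errors: int, requests: int) -> float:
    """Fraction of requests that failed; 0.0 when there was no traffic."""
    return errors / requests if requests else 0.0


def needs_attention(before: float, after: float, tolerance: float = 0.5) -> bool:
    """Flag a system whose error rate rose by more than `tolerance` (relative)."""
    if before == 0:
        return after > 0
    return (after - before) / before > tolerance


# Hypothetical figures for the modified system and one that might be affected.
modified = needs_attention(error_rate(12, 10_000), error_rate(30, 10_000))
neighbour = needs_attention(error_rate(5, 10_000), error_rate(6, 10_000))
print(modified, neighbour)  # True False
```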

Please feel free to contact us if you need help implementing Change Management for your business...