Saturday, July 23, 2011

Anti Patterns of Continuous Integration

In this post I would like to share my perspective on the main reasons why Continuous Integration fails or, in many cases, why its benefits are not seen immediately. These are some of the anti-patterns of continuous integration. In many cases I have seen teams lose hope and come to feel that CI is an overhead that Agile practices have brought in. One reason may be that the team, a team member, or management is looking for a quick solution without thinking deeply about how Continuous Integration can be done really well. Please note that patience and a strong will to make it work are the only way to succeed:

Reason 1- Every team's situation, competency, technology, domain and dynamics are different. Simply copying the CI design, architecture or roadmap from another team is a bad idea. Every team should have its own vision of what it wants its CI to look like and what it wants to run in the Continuous Integration system. Some teams try to run every inspection tool and all test cases together, starting with real continuous integration immediately without the infrastructure in place. In my opinion that is not a good idea. It is usually better to take a step-by-step approach: start with periodic builds, then incremental builds, and slowly move to continuous integration. If the team really has the competency, it can start immediately, but this analysis must be done first.

Reason 2- Assuming that all team members have strong test coding skills is a mistake. Team members often have poor test suite organization skills, and there is no systematic training. In some cases the test code is written poorly, so the test cases do not work properly. This is one of the reasons CI does not work effectively and builds fail frequently.

Reason 3- User stories are not granular enough. I have seen user stories written without enough attention to granularity, so developers cannot check in code frequently and a story card cannot be completed in a short period. This is not easy to get right, and I am sure many people have different opinions about it. One reason I have seen is that the Business Analyst does not write the story as a vertical slice from the user's perspective, so the story does not describe end-to-end functionality. Instead, requirements are sometimes written module by module, which makes it very difficult for a developer to check in at frequent intervals. A common pattern is that the team is not even aware of this issue and still expects Continuous Integration to work.

Reason 4- CI should ideally run in a clone of the production environment. But big products used by many customers have many environments and require testing across many combinations, so the testing strategy must be analyzed deeply.
The following is a good example. Try to answer this:

Each user on a computer system has a password that is 6 to 8 characters long, where each character is an uppercase letter or a digit. Each password must contain at least 1 digit.
How many passwords are possible?
The answer is:
2,684,483,063,360 possible passwords exist.
Design time for 1 test case = 1 minute, so 60 test cases = 60 minutes.
1 day = 480 test cases, so creating and executing a complete test would take about 5.6 billion working days, which is roughly 15 million years.
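
The arithmetic behind this example can be checked with a short script. This is only a sketch of the counting argument: it subtracts the letter-only passwords from all passwords of each length, and it assumes one test case per minute for 480 minutes a day, as stated above. The 365 working days per year is my own assumption and is not part of the original example.

```python
ALPHABET_SIZE = 26 + 10        # uppercase letters plus digits
LETTERS_ONLY = 26              # a password with no digit uses letters only
TESTS_PER_DAY = 480            # one test case per minute, 8-hour day
WORKING_DAYS_PER_YEAR = 365    # assumption: no weekends or holidays


def count_passwords(min_len: int = 6, max_len: int = 8) -> int:
    """Count passwords that contain at least one digit.

    For each length n: all strings (36^n) minus letter-only strings (26^n).
    """
    return sum(
        ALPHABET_SIZE ** n - LETTERS_ONLY ** n
        for n in range(min_len, max_len + 1)
    )


total = count_passwords()
# Integer ceiling division: a partial day still counts as a day.
days = (total + TESTS_PER_DAY - 1) // TESTS_PER_DAY
years = days / WORKING_DAYS_PER_YEAR

print(f"{total:,} possible passwords")
print(f"{days:,} working days, about {years:,.0f} years")
```

Whatever the exact calendar assumptions, the point stands: exhaustive testing of even a small input space is not feasible, so the combinations to run in CI have to be chosen deliberately.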

In many cases I have seen that the team does not analyze the strategy, and the CI runs on some combination that does not actually represent the production environment. Teams can use techniques such as pairwise testing or OAT (Orthogonal Array Testing) to select the right combinations. But it is important to apply these techniques with good judgement about test sufficiency and coverage. The test strategy is essential for understanding the real customer environment in which the test suites should run, and this work should happen while the CI infrastructure is being prepared, not at test execution time.
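
To give a feel for pairwise selection, here is a minimal greedy sketch. It is not a production tool and not an orthogonal array generator: it enumerates every full configuration, so it only suits small spaces. The parameter names and values in `ENVIRONMENTS` are hypothetical.

```python
from itertools import combinations, product


def pairwise_suite(parameters: dict[str, list[str]]) -> list[dict[str, str]]:
    """Greedily pick configurations until every pair of values taken
    from two different parameters appears in at least one of them."""
    names = list(parameters)

    def pairs_of(config: dict[str, str]) -> set:
        # All (parameter, value) pairs that one configuration covers.
        return {((a, config[a]), (b, config[b]))
                for a, b in combinations(names, 2)}

    candidates = [dict(zip(names, values))
                  for values in product(*parameters.values())]
    # Every pair appears in some candidate, so this is the full target set.
    uncovered = set().union(*(pairs_of(c) for c in candidates))

    suite = []
    while uncovered:
        # Take the configuration that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda c: len(pairs_of(c) & uncovered))
        suite.append(best)
        uncovered -= pairs_of(best)
    return suite


# Hypothetical customer environments; replace with the real ones.
ENVIRONMENTS = {
    "os": ["windows", "linux", "macos"],
    "browser": ["firefox", "chrome", "safari"],
    "database": ["postgres", "mysql"],
}

suite = pairwise_suite(ENVIRONMENTS)
for config in suite:
    print(config)
```

The exhaustive product here has 18 configurations, and a pairwise suite usually needs far fewer. Whether pair coverage is sufficient for a given product is exactly the judgement call mentioned above.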

Reason 5- The test framework is designed without taking the architecture of the system under test (SUT) into consideration. This leads to frameworks that do not scale and to test execution problems later.
Test teams must ensure that the automation framework considers all aspects of testing: the SUT architecture, test inputs, test outputs, SUT precondition program state, SUT postcondition program state, environmental inputs and outputs, component dependencies, APIs, capturing and analyzing results, and the various monitoring tools. Since the focus of this article is continuous integration, I will not elaborate further on test frameworks. The important point is that running a continuous integration system effectively requires a robust test automation system, and the automation framework should be scalable and reliable.
I have found this article very useful and I would highly recommend this for further reading.

Reason 6- Sometimes the team does not have good build scripting skills, which causes issues. Sometimes the chosen CI tool does not provide an administration console for things like build dependency management, in which case good knowledge of build scripting becomes necessary.

Reason 7- The overall CI architecture of the system is not clear. Here, CI architecture does not mean the tool. It means how the different components of the overall system are organized, how they depend on each other, and whether the Continuous Integration system is organized in a similar manner. For a big system it is very important to have multiple component-level CI systems that feed the overall system-level CI. It is also common that the build staging (stages that ensure faster feedback in the respective CI system) is not clear.
This causes the CI to take longer to execute, and in the end the team does not use the CI for actual feedback.
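
One way to make the component structure explicit is to write down the dependencies and derive the build order from them. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later). The component names are hypothetical, and a real setup would trigger the component-level CI jobs instead of printing.

```python
from graphlib import TopologicalSorter

# Hypothetical components mapped to the components they depend on.
COMPONENT_DEPENDENCIES = {
    "core": set(),
    "billing": {"core"},
    "reporting": {"core"},
    "web-ui": {"billing", "reporting"},
}


def build_waves(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group components into waves. Components in the same wave do not
    depend on each other, so their CI builds can run in parallel once
    every earlier wave has passed."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = build_waves(COMPONENT_DEPENDENCIES)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(wave)}")
# The system-level CI runs only after the last wave has passed.
```

If the CI jobs cannot be arranged into such waves, that is often a sign that the CI organization does not mirror the component architecture.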

Reason 8- All the test cases and inspection tools are run together in one stage, without any staging (build pipeline) in place. It is advisable to run continuous integration in stages so that feedback can be provided faster. Sometimes all the test cases, including unit, component-level, integration, performance and reliability test cases, are run together in one system. You can refer to the previous post for more detail about staged builds.
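
The idea of staging can be sketched in a few lines: run the cheapest checks first and stop at the first failure, so a broken commit is reported quickly. This is an illustration only and not tied to any CI tool. The `Stage` class, the stage names and the placeholder checks are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Stage:
    name: str
    run: Callable[[], bool]  # returns True when the stage passes


def run_pipeline(stages: list[Stage]) -> list[tuple[str, str]]:
    """Run stages in order. After the first failure, later (slower)
    stages are skipped so that feedback arrives as early as possible."""
    results = []
    failed = False
    for stage in stages:
        if failed:
            results.append((stage.name, "skipped"))
            continue
        passed = stage.run()
        results.append((stage.name, "passed" if passed else "failed"))
        failed = not passed
    return results


# Placeholder checks; a real stage would invoke the build or test tool.
pipeline = [
    Stage("compile and unit tests", lambda: True),
    Stage("component tests", lambda: True),
    Stage("integration tests", lambda: False),
    Stage("performance and reliability tests", lambda: True),
]

for name, status in run_pipeline(pipeline):
    print(f"{name}: {status}")
```

In this run the integration stage fails, so the long-running performance and reliability stage never starts.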

Reason 9- Test suite organization structure is not followed appropriately- Sometimes the test suites are not categorized properly as unit, functional, integration and non-functional test cases, so they cannot be staged and cannot provide analysis or test results at feature level.

Reason 10- Local build is not used effectively- There may be many reasons for this, such as lack of awareness, process issues or infrastructure issues. You may refer to the following post to know the anti patterns of Local Build:

Reason 11- Lack of readiness of tools and infrastructure- Before the actual execution of the project, the team must have all the infrastructure ready, so that time during the actual work is productive and focused on functional work rather than on infrastructure readiness. Sometimes I have seen that the team is not ready with the tools that need to be integrated with the CI system, the CI tool has not been finalized, the CI server machine is poorly configured or slow, and so on. These kinds of issues divert the team's attention from the main work to tooling, and they need to be corrected.

Reason 12- Infrequent check-ins- Team members check in code once a week, or sometimes once every two weeks. This defeats the purpose of continuous build and integration. There may be many reasons for this: story card granularity, discipline, or something else.

Reason 13- Broken builds are not given enough attention, and other team members keep checking in code without the build being fixed.
