Saturday, July 23, 2011

Anti Patterns of Continuous Integration

In this post I will try to give my perspective on the top reasons why Continuous Integration fails or, in many cases, why its benefits are not seen immediately. These are some of the anti patterns of continuous integration. In many cases I have seen the team lose hope and feel that CI is an overhead which Agile practices have brought in. One of the reasons may be that the team, a team member or the management is looking for a quick solution without thinking deeply about how the Continuous Integration practice can be done really well. Please note that patience and a strong will are the only way to become successful:

Reason 1- Every team's situation, competency, technology, domain and dynamics are different. Just copying the CI design, architecture or roadmap from some other team is a wrong idea. Every team should have its own vision of how it wants the CI to look and what it wants to run in the Continuous Integration system. Some teams try to run all the inspection tools and all the test cases together, and start with real continuous integration immediately without the infrastructure in place. That may not be a good idea in my opinion. It is always better to have a step by step approach. We can start with a periodic build, then an incremental build, and slowly move into the continuous integration approach. But if the team really has the competency then it can start immediately. It is really important to do this analysis.

Reason 2- Assuming that all the team members are very good at test coding is a wrong idea. Many times it is found that team members have poor test suite organization skills and there is no systematic training. In some cases the test code itself is poorly written, due to which the test cases do not work properly. This is one of the reasons why CI does not work effectively and there are frequent build failures.

Reason 3- User stories are not granular enough. I have seen that, while documenting user stories, the stories are not written with granularity in mind, so that developers can check in code frequently and each story card can be realized in a short period. This is not easy, and I am sure many people may have a different opinion about this. One of the reasons I have seen is that the Business Analyst does not write the story as a vertical slice from the user's perspective. The story does not depict end to end functionality. The requirements are sometimes written in a modular way. Due to this it is very difficult for a coder to check in at frequent intervals. One pattern I have observed is that the team is not aware of this issue and still expects Continuous Integration to work.

Reason 4- CI should actually be run in a clone of the production environment. But big products which are used by many customers have many environments and need multi-combinational testing, so the testing strategy must be analyzed deeply.
The following is a good example. Try to answer this:

Each user on a computer system has a password which is 6 to 8 characters long, where each character is an uppercase letter or a digit. Each password must contain at least 1 digit.
How many password combinations are possible?
The answer is:
2,684,483,063,360 possible passwords exist.
1 test case design time = 1 minute, so 60 test cases = 60 minutes.
1 day (8 working hours) = 480 test procedures, so it would take more than 15 million years to create and execute a complete test.
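
The arithmetic behind this example can be checked with a short Python sketch. It counts all strings over the 36 allowed characters and subtracts those with no digit, for each length from 6 to 8. The function names are only illustrative, and the 365 working days per year is my assumption, not part of the original example.

```python
# Count passwords of length 6 to 8 over uppercase letters and digits
# that contain at least one digit, then estimate exhaustive test effort.

ALPHABET = 26 + 10   # uppercase letters plus digits
LETTERS_ONLY = 26    # a password with no digit uses letters only


def count_passwords(min_len: int = 6, max_len: int = 8) -> int:
    """All strings minus the digit-free ones, summed over each length."""
    return sum(ALPHABET**n - LETTERS_ONLY**n for n in range(min_len, max_len + 1))


def years_to_test(cases: int, cases_per_day: int = 480, days_per_year: int = 365) -> float:
    """Years needed at 480 test cases per day (days_per_year is an assumption)."""
    return cases / cases_per_day / days_per_year


total = count_passwords()
print(f"{total:,} passwords")
print(f"about {years_to_test(total):,.0f} years of testing")
```

Even with generous assumptions the result is far beyond any project timeline, which is why exhaustive combination testing is not an option.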

In many cases I have seen that the team does not analyze the strategy, and the CI is run in some combination which is not the right representation of the production environment. Teams can use techniques like pairwise testing / OAT (Orthogonal Array Testing) to choose the right combinations. But it is important to use these techniques with good judgement about test sufficiency and coverage. The Test Strategy is really important for understanding the real customer environment in which the test suites should be run, and this work should happen during infrastructure readiness for CI, not during test execution.
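
To give a feel for pairwise testing, here is a minimal Python sketch of a greedy selection: it keeps picking the full combination that covers the most value pairs not yet covered, until every pair of values from two different parameters appears in at least one test. This is a simple greedy heuristic and not an orthogonal array construction, and the environment parameters (os, browser, database) are hypothetical.

```python
from itertools import combinations, product


def pairs_of(combo, names):
    """All (parameter, value) pairs that one full combination covers."""
    return {((a, combo[a]), (b, combo[b])) for a, b in combinations(names, 2)}


def pairwise_suite(parameters):
    """Greedily pick full combinations until every value pair is covered."""
    names = list(parameters)
    # Every pair of values from two different parameters must appear together once.
    uncovered = {
        ((a, va), (b, vb))
        for a, b in combinations(names, 2)
        for va in parameters[a]
        for vb in parameters[b]
    }
    candidates = [dict(zip(names, values)) for values in product(*parameters.values())]
    suite = []
    while uncovered:
        # Choose the candidate that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda c: len(pairs_of(c, names) & uncovered))
        suite.append(best)
        uncovered -= pairs_of(best, names)
    return suite


# Hypothetical production environment dimensions.
environment = {
    "os": ["Windows", "Linux", "macOS"],
    "browser": ["Firefox", "Chrome", "IE"],
    "database": ["Oracle", "MySQL"],
}
suite = pairwise_suite(environment)
print(f"{len(suite)} pairwise combinations instead of {3 * 3 * 2} exhaustive ones")
```

The reduced set still needs the judgement mentioned above: a pair that is covered is not the same as a customer environment that is covered.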

Reason 5- The test framework does not take the architecture of the system under test (SUT) into consideration in the design of the overall automation framework. This creates issues where the frameworks are not scalable and there are problems with overall test execution later.
Test teams must ensure that the automation framework considers all aspects of testing, like the system under test architecture, test inputs, test outputs, SUT precondition program state, SUT postcondition program state, environmental inputs and outputs, component dependencies, APIs, capturing the results, analysis of the results, and the various monitoring tools. Since the focus of this article is continuous integration, I will not elaborate more on test frameworks. The important point is that running continuous integration systems effectively requires a robust test automation system, and the automation framework should be scalable and reliable.
I have found this article very useful and I would highly recommend this for further reading.

Reason 6- I have sometimes seen that the team does not have good build scripting skills, due to which there are issues. Sometimes the CI tool chosen does not provide an administration console for build dependency management etc., in which case good knowledge of build scripting becomes necessary.

Reason 7- The overall CI architecture of the system is not very clear. Here CI architecture does not mean the tool. It means how the different components in the overall system are organized, how they depend on each other, and whether the Continuous Integration system is organized in a similar manner. For a big system it is very important to have multiple component level CI systems which feed into the overall system level CI. It is also seen that the build staging (stages to ensure faster feedback in the respective CI system) is not very clear.
This causes issues where the CI takes a long time to execute and finally the team does not use the CI for actual feedback.

Reason 8- All the test cases and inspection tools are run together in one stage without any staging (build pipeline) in place. It is advisable to run continuous integration in stages so that feedback can be provided faster. Sometimes all the test cases, including unit, component level, integration, performance and reliability test cases, are run together in one system. You can refer to the previous post to know more about Staged Build:

Reason 9- Test suite organization structure not followed appropriately- Sometimes the test suites are not categorized properly as unit, functional, integration and non functional test cases, due to which they cannot be staged and cannot give analysis or test results at feature level.

Reason 10- Local Build not being used effectively- There may be many reasons for the local build not being used effectively. Lack of awareness, process or infrastructure issues may be the reason. You may refer to the following post to know the anti patterns of Local Build:

Reason 11- Lack of readiness of the tools and infrastructure- Before the actual execution of the project, the team should have all the infrastructure ready, so that during the actual work it can be productive and focus on functional work rather than on infrastructure readiness. Sometimes I have seen that the team is not ready with the tools that need to be integrated with the CI system, the CI tool has not been finalized, the CI server machine is of poor configuration or slow, and other related issues. These kinds of issues divert the attention of the team from the main work to tool related aspects, and need to be corrected.

Reason 12- Infrequent check-ins- Team members check in code once a week or sometimes once in two weeks. This defeats the purpose of continuous build and integration. There may be many reasons for this: a story card granularity issue, a discipline issue or any other.

Reason 13- Broken builds are not given enough attention, and other team members keep checking in code without the build being fixed.
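
Coming back to the test suite organization problem listed among the reasons above, the sketch below shows one way a naming convention makes staging possible. It is only an illustration in Python: the convention (test type as the prefix before the first underscore) and the mapping of test types to stages are my assumptions, and each team has to define its own.

```python
from collections import defaultdict

# Hypothetical convention: the test type is the prefix before the first underscore.
STAGE_OF_TYPE = {
    "unit": 1,
    "smoke": 1,
    "functional": 2,
    "integration": 2,
    "nonfunctional": 3,
}


def group_by_stage(test_names):
    """Return (stage -> test names, unclassified names); unknown prefixes are not guessed."""
    stages = defaultdict(list)
    unclassified = []
    for name in test_names:
        test_type = name.split("_", 1)[0].lower()
        stage = STAGE_OF_TYPE.get(test_type)
        if stage is None:
            unclassified.append(name)
        else:
            stages[stage].append(name)
    return dict(stages), unclassified


stages, unclassified = group_by_stage(
    ["unit_parser", "smoke_startup", "functional_login", "nonfunctional_load", "misc_old_test"]
)
print(stages)
print("needs a category:", unclassified)
```

Reporting the unclassified tests separately is deliberate: a test without a category is exactly the case that cannot be staged.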

Anti Patterns of Local Build / Private Build

Local Build is a very important practice in Agile which helps in getting a successful build more frequently. To get more details about what Local Build is, you may refer to the previous post:
However, due to lack of awareness, process or discipline, and due to infrastructure issues, there are some anti patterns of Local Build, which are listed below:

  • Running all the test cases in the Local Build- The developer runs the functional / integration test cases (which typically take a long time) as part of the local build.

  • Running all kinds of inspection tools in the Local Build- The developer runs all the inspection tools, like the static code analyzer, test static code analyzer, Simian for duplicate code analysis, useless code detector, code complexity measurement tool etc. (Just running the static code analyzer on the source code is enough. For detailed feedback the mainline build is there anyway. However, if there is a trend where check-ins often break the build due to issues from other tools, then the team should focus on competency improvement and ensure that all members are trained well to write good code and test code. It is not advisable to add so many inspection tools to the Local Build, as it consumes more time.)

  • Unit test cases are in reality not unit level cases- Test cases are categorized as unit test cases though they are not. The key aspect of unit tests is having no dependency on outside entities like databases, since such dependencies increase the time it takes to set up and run tests.

  • Test suite organization structure not followed well- Sometimes the test cases are organized in such a way that it is difficult to identify the unit test cases separately. When analyzed deeply, it is found that the team does not have any naming convention to identify the test type of a test suite or test case.

  • Not using the same directory structure as the configuration library- The local workspace does not mirror the directory structure of the configuration library, due to which the environment and other aspects are not the same. The typical directory structure for source code and test suite can be :

  • IDE does not support a one click Local Build- Some teams don't use a good IDE that supports automated check-in and taking the latest code from the configuration library from within the IDE (especially in C projects).

  • Local build takes more than 3-5 minutes to provide a result- I have sometimes seen a Local / Private Build take around 30 minutes to run, while the developer sits idle waiting for the feedback. This creates a productivity dip and lowers the developer's motivation to run the local build every time.

  • Developer PC is not of good configuration- The developer PC configuration is poor, due to which the Local Build takes more time.

  • Members taking short cuts- Sometimes team members in a hurry take short cuts. If this is not addressed, it becomes a culture and correcting it later becomes an issue. Running the Local Build is a healthy practice and is more a matter of culture.
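
The point in the list above about unit tests having no dependency on outside entities can be illustrated with a small Python sketch. The test replaces the database with an in-memory stand-in, so it needs no setup and runs in milliseconds. All names here (InMemoryUserStore, greeting_for) are hypothetical and only serve the example.

```python
import unittest


class InMemoryUserStore:
    """Test double standing in for a database-backed store."""

    def __init__(self, users):
        self._users = dict(users)

    def find_email(self, user_id):
        return self._users.get(user_id)


def greeting_for(user_id, store):
    """Code under test: depends only on the store interface, not on a real database."""
    email = store.find_email(user_id)
    return f"Hello, {email}" if email else "Hello, guest"


class GreetingTest(unittest.TestCase):
    def test_known_user(self):
        store = InMemoryUserStore({1: "a@example.com"})
        self.assertEqual(greeting_for(1, store), "Hello, a@example.com")

    def test_unknown_user(self):
        self.assertEqual(greeting_for(2, InMemoryUserStore({})), "Hello, guest")


# Run the two tests without exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(GreetingTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

A test of the same function against a real database would belong in the functional or integration category, not in the local build.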

Thursday, July 21, 2011

Lean Principles and Agile - Aren't They the Same?

The five key principles of Lean are Value, Value Stream, Flow, Pull and Perfection. Agile principles cover giving highest priority to the customer, better collaboration at all levels, working software as the primary measure of progress, and other aspects. Details are mentioned in the previous post:
Aren't Lean and Agile two names for the same thing?
Well, Lean and Agile share the same goals. The practices that make up the various agile methodologies also support the lean principles. In fact Lean takes a wider view of the entire business context in which the software development is done. Lean considers Agile software development as supporting practices for Lean Software development.

Value
Value is defined by the customer. We must understand what is really considered as value from the customer's perspective.
Value Stream
Once the value is identified, we need to create a series of steps or processes to produce the product. This is also called the Value Stream. Each step is categorized as value added, non value added but necessary, or non value added waste (which can actually be removed).
Flow
The development process must be designed in such a way that the steps flow continuously. If the value chain stops at any point, it is evident that some waste is being produced.
Pull
Let the customer pull the value. It means that we don't start developing something until it is really required. Decide as late as possible to develop anything.
Perfection
Continuously identify areas where waste can be removed and the steps can be made perfect.

The Art of Lean Software Development - Curt Hibbs, Steve Jewett & Mike Sullivan

Saturday, July 16, 2011

Continuous Integration Build Metrics

Continuous Integration Build Metrics provide insight into build optimization, the health of the source code, the health of the test code and the system size. These indicators can greatly help in taking corrective and preventive action based on the continuous integration build results. Below are some of the key metrics that can be obtained from CI -

1) Number of Source Lines of Code (SLOC) - This indicates the total software or system size in LOC. (Unit of Measurement - LOC)

2) Number of Test Code Lines of Code - This indicates the total test code size in LOC. (Unit of Measurement - LOC)

3) Compilation Time - Total compilation time taken by the build. (Unit of Measurement - Hours / Minutes / Seconds)

4) Total Test Execution Time - Total time taken to execute the unit tests, functional tests and non functional tests. (Unit of Measurement - Hours / Minutes / Seconds)

5) Total Inspection Time - Total time taken to execute the inspection tools. (Unit of Measurement - Hours / Minutes / Seconds)

6) Total Deployment Time - Total time taken to deploy the product / software into the target environment from the continuous integration machine. (Unit of Measurement - Hours / Minutes / Seconds)

7) Total Build Time - Indicates the total time taken to run all the builds (including inspection, functional and non functional builds). (Unit of Measurement - Hours / Minutes / Seconds)

8) Successful Build Rate - The total number of successful builds divided by the total number of builds in a given time interval. (Unit of Measurement - Percentage (%))

9) Build Repair Rate - Indicates the time taken to repair a failed build. (Unit of Measurement - Hours / Minutes / Seconds)

10) Version Control System Load Time - Indicates the time taken to check out / update the project from the continuous integration build. It gives pointers about the sufficiency of the network bandwidth, processor, memory and disk drive, and about the peak time load of the version control system. (Unit of Measurement - Hours / Minutes / Seconds)

11) Number of Errors open in the Source Code - Total number of static tool errors open in the source code of the latest build. The number of rules can be configured in the rules file of the tool. (Example - PC Lint, Checkstyle, PMD, Findbugs etc.) (Unit of Measurement - Number)

12) Number of Errors open in the Test Code - Total number of static tool errors open in the test code of the latest build. The number of rules can be configured in the rules file of the tool. (Example - PC Lint, Checkstyle, PMD, Findbugs etc.) (Unit of Measurement - Number)

13) Unit Testing Line Coverage - Indicates the share of source code lines covered through unit testing. (Unit of Measurement - Percentage (%))

14) Unit Testing Function Coverage - Indicates the share of functions (or subroutines) in the program that have been called through unit testing. (Unit of Measurement - Percentage (%))

15) Unit Testing Branch Coverage - Indicates the branch coverage through unit testing. (Unit of Measurement - Percentage (%))

16) Functional Testing Line Coverage - Indicates the share of source code lines covered through functional testing. (Unit of Measurement - Percentage (%))

17) Functional Testing Function Coverage - Indicates the share of functions (or subroutines) in the program that have been called through functional testing. (Unit of Measurement - Percentage (%))

18) Functional Testing Branch Coverage - Indicates the branch coverage through functional testing. (Unit of Measurement - Percentage (%))

19) % of Duplicate Code - A threshold can be set for the size of a block of code, and tools like Simian report the duplication in percentage for blocks of that size. It ignores whitespace, curly braces, comments, imports, includes, package declarations, etc. (Unit of Measurement - Percentage (%))

20) Average Code Complexity - Indicates the average code complexity value of all the functions. (Unit of Measurement - Number)

21) Max Code Complexity - Indicates the maximum code complexity value among all the functions. (Unit of Measurement - Number)
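
As an illustration of how two of the metrics above could be derived from build history, here is a minimal Python sketch. The Build record and its fields are hypothetical and not taken from any particular CI tool, and measuring repair time from the first failing build of a broken streak to the next passing build is my interpretation of "time taken to repair a failed build".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Build:
    finished_at: datetime
    passed: bool


def successful_build_rate(builds):
    """Percentage of builds that passed in the given interval."""
    if not builds:
        return 0.0
    return 100.0 * sum(b.passed for b in builds) / len(builds)


def repair_times(builds):
    """Time from the first failing build of each broken streak to the next pass."""
    times, broken_since = [], None
    for build in sorted(builds, key=lambda b: b.finished_at):
        if not build.passed and broken_since is None:
            broken_since = build.finished_at
        elif build.passed and broken_since is not None:
            times.append(build.finished_at - broken_since)
            broken_since = None
    return times


start = datetime(2011, 7, 16, 9, 0)
history = [
    Build(start + timedelta(hours=h), passed)
    for h, passed in enumerate([True, False, False, True, True])
]
print(f"successful build rate: {successful_build_rate(history):.1f}%")
print("repair times:", repair_times(history))
```

Tracking these two numbers over several iterations says more than a single value, since the trend shows whether broken builds are getting attention.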

Continuous Integration Build Duration or Time

The important point while running the CI build is to provide feedback as soon as possible. A good rule of thumb is to keep the integration build to no more than 10 minutes. The next question would be: which test cases and inspection tools should we run in the integration build? Well, the 10 minute build need not run all types of tests and inspection tools. The key is to run whatever test cases can be run quickly and give maximum confidence about the system. For example, the overall build can be separated into stages to get faster feedback. In Stage 1 only the unit test cases and the smoke test cases are run, in Stage 2 the functional test cases are run, and in Stage 3 the non functional test cases are run. The build is considered PASS from an end to end perspective only when Stage 1, Stage 2 and Stage 3 all pass. Some rules of thumb for reference:

Local Build / Private Build: < 3 minutes
Stage 1 (Integration / Primary Build): < 10 minutes
Stage 2 (Functional / Secondary Build): < 2-3 hours
Stage 3 (Non Functional Build): depends on the type of test cases
Inspection Build: can be run in any stage, depending on the time taken by each tool

Typically the Stage 3 build runs long running test cases for reliability and performance, which have many combinations, and it can take anywhere from 1 day to weeks depending on the type of test cases. For daily development operations, the build up to Stage 2 can be considered.
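
The staging idea can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular CI tool implements pipelines: each stage is a hypothetical callable that returns True on success, and a later stage runs only if the earlier one passed, so a unit test failure is reported without waiting for the long running stages.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order; stop at the first failure so feedback arrives early."""
    executed = []
    for name, run in stages:
        executed.append(name)
        if not run():
            return False, executed
    return True, executed


# Hypothetical stage runners; a real one would invoke the build and test tools.
pipeline = [
    ("Stage 1: unit + smoke", lambda: True),
    ("Stage 2: functional", lambda: False),      # simulated functional failure
    ("Stage 3: non functional", lambda: True),
]
passed, executed = run_pipeline(pipeline)
print("PASS" if passed else "FAIL", "after", executed)
```

In this run Stage 3 never starts, which is the point: the expensive stage is not spent on a build that is already known to be broken.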

Saturday, July 2, 2011

Private Build/ Local Build

Local Build comprises compiling the added source code and test code, running the unit test cases, and running the inspection tools, especially the static code analyzer (like PMD, Checkstyle, PC Lint etc). We can also run functional test cases and more inspection tools, like the duplicate code detection tool Simian, but this may increase the feedback time for the developer. So it is advisable to keep only the key aspects (compilation, unit testing, static code check) as part of the private / local build.
The Local Build process can be described in the following steps:
Step 1- Developer finishes the task (coding, unit testing, integration and functional test).
Step 2- Developer takes the latest code from the configuration management library, which contains the latest code changes.
Step 3- Developer runs the Local / Private Build with the latest code.
Step 4- If the build passes, then the commit is done to the configuration library. If the build fails, then rework is done and step 1 onwards is repeated.

The Ant script below shows how to run a Local or Private Build using Ant. This is just for reference. The actual build script needs to be changed based on the project situation.
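
The Ant script itself is not reproduced in this text. As a tool independent stand-in, the four steps listed above can be sketched in Python. This is not the original script: the four callables (update_workspace, run_build, rework, commit) and the max_attempts limit are hypothetical placeholders for whatever version control and build commands the project uses.

```python
def local_build(update_workspace, run_build, rework, commit, max_attempts=3):
    """Update, build locally, and commit only when the build passes.

    Returns the attempt number that succeeded, or None if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        update_workspace()      # Step 2: take the latest code
        if run_build():         # Step 3: compile, unit tests, static code check
            commit()            # Step 4: commit only on a passing build
            return attempt
        rework()                # Step 4 (failure): fix the change, then repeat
    return None


# Simulated run: the first local build fails, the second one passes.
log = []
outcomes = iter([False, True])


def fake_build():
    log.append("build")
    return next(outcomes)


attempts = local_build(
    update_workspace=lambda: log.append("update"),
    run_build=fake_build,
    rework=lambda: log.append("rework"),
    commit=lambda: log.append("commit"),
)
print(attempts, log)
```

Note that the workspace is updated again before every attempt, because other team members may have committed while the rework was going on.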

The rule of thumb is that the Local Build should be as fast as possible, ideally less than 3 minutes. The practice of Local Build ensures that whenever the developer checks in code, he / she does not break the mainline build and always completes the work with good quality. Some of the commonly occurring anti patterns which should be avoided while following the local build practice are as follows:

  • The developer runs the functional / integration test cases (which typically take a long time) as part of the local build.

  • Wrong understanding of unit testing. Test cases are categorized as unit test cases though they are not. The key aspect of unit tests is having no dependency on outside aspects like databases, since such dependencies increase the time it takes to set up and run tests.

  • Some teams don't use a good IDE that supports automated check-in and taking the latest code from the configuration library from within the IDE (especially in C projects).

  • Sometimes there is a tendency to run all kinds of test cases and also the inspection tools in the private build. This increases the execution time of the local build.

  • Not using the same directory structure as the configuration library, due to which the environment and other aspects are not the same. The typical directory structure for source code and test suite can be :

    This is also one of the reasons why the private build passes but the mainline build still fails.

  • The developer machine is not of good configuration and it takes a long time to execute the local build.

  • Taking short cuts. Lack of discipline.

There is obviously no debate that the Local Build practice can bring a lot of benefits to the team, if used effectively. It also complements the continuous integration practices. This practice can really ensure that the team is efficient and can deliver good quality with certainty.

Exploratory Testing in Agile Teams

Sometimes the test team wonders: when all the test cases are automated and testing happens at each story level, "do we really need to do exploratory testing?" and "is exploratory testing applicable in incremental and iterative testing?"

The answer is "Yes".

Irrespective of how much automation we can bring in and how much planned testing we do, exploratory testing has its own place and it must be done to ensure that there are no residual defects in the system. Exploratory testing can be well planned. During the iteration the developer and the tester can sit together and conduct the exploratory testing. Automation can be used for test setup, data generation and repetitive tasks. In exploratory testing, each tester / developer can use his / her own skills and knowledge of the system to find its hidden defects. Some tips for doing exploratory testing systematically are mentioned below:
  • Understand the critical features that the customer is going to use. Give focus to those modules first.
  • As a tester / developer, your time for the overall exploratory testing is limited. Have a test strategy in place comprising the focused testing approach for the critical features and troubled features (explained at the end), how to generate the test data, and the techniques to use (if any), like equivalence partitioning, orthogonal array testing, boundary value analysis, error guessing etc. This strategy can help you to use the limited time very effectively.
  • Think about what a normal user would do and do those steps. Have an attitude of breaking the system, and be sure that there are deep defects which have not been unearthed and that this is your chance to find them.
  • Use some random approach to do the testing.
  • Think about what an expert user and a normal user would do.
  • Look into the troubled features from the normal testing (features on which more defects were found during the normal testing). Target those features and conduct some exploratory testing.

Exploratory testing may provide good insight into defects which were not found during the normal testing. Always plan for exploratory testing in all agile projects, irrespective of time availability. Plan for exploratory testing as an activity during release and iteration planning. It usually gives good results and more confidence about the system.


Architecting for Continuous Delivery

This short article will provide details about the various architecture specific requirements for good implementation of continuous delivery...