How Test Automation combined with Batch Testing and Defect Analysis at the Database level can optimize your DevOps process.
With full end-to-end testing through the User Interface already established at many organisations, the challenge of “shifting defects left” in legacy applications still exists. Service Virtualisation is an option for testing multi-tier applications with a legacy core, but this level of abstraction can often leave a substantial body of mission-critical application logic unchecked, presenting a significant risk to your project. This article explores several options available to resolve these challenges.
A thought-provoking blog by ARCAD’s Director of Northern European operations, Nick Blamey
The business problem
Time-to-market and the associated benefits of providing rapid feedback to developers are a key part of any DevOps strategy. Standard approaches that catch defects at the UI layer – while important – still fail to eliminate a number of key challenges organisations face, in particular in the following situations:
- Where significant, mission-critical application functionality resides in complex Batch processes
- Where a number of legacy applications are involved, e.g. on the IBM i / Mainframe, which tends to result in a multi-speed development approach and a complex defect triage effort between the new application development teams and the legacy core teams
- Where defects must be isolated directly within the Database, and changes to application programs have a direct effect on Database-level functionality
- Where the obfuscation and anonymisation of Test Data is part of the compliance process
Most organisations are embracing end-to-end UI Test Automation using a combination of the following solutions:
- HCL OneTest
- Tricentis TOSCA
- Worksoft Certify
- Rational Functional Tester
- Mobile Application Testing Tools e.g. Eggplant, Perfecto Mobile etc.
Each Test Automation solution has its own advantages and unique capabilities to support different Testing Approaches, but they all lack the ability to test, locate (triage) and fix complex application errors buried deep in the legacy applications themselves on the IBM i (AS/400).
A different paradigm for testing and locating defects in Legacy Systems
We can analyse the three main challenges and potential bottlenecks which cannot be eliminated by these standard tools as follows:
- Batch Testing: large-scale, mission-critical application functionality tends to be embedded in batch processes
- Complex Database and IBM i Spool file defects: defects buried in the Database and Spool layers result in exceptionally long debug cycles for developers to find and locate them (a sketch of one such check follows this list)
- Test Data anonymisation challenges: including continual Test Environment refreshes to cope with GDPR and other compliance requirements
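To make the second of these challenges concrete: verifying spool output usually reduces to comparing what a batch run actually produced against a known-good reference. Below is a minimal sketch of that comparison, assuming the spool contents have already been exported to plain-text files (for example via CPYSPLF); the file names are hypothetical.

```python
# Minimal sketch: diff a captured spool file against a known-good reference.
import difflib
from pathlib import Path

def diff_spool(expected_path: str, actual_path: str) -> list[str]:
    """Return a unified diff between expected and actual spool output."""
    expected = Path(expected_path).read_text().splitlines()
    actual = Path(actual_path).read_text().splitlines()
    return list(difflib.unified_diff(expected, actual,
                                     fromfile=expected_path,
                                     tofile=actual_path, lineterm=""))

# Hypothetical file names, for illustration only.
for line in diff_spool("invoice_run_expected.txt", "invoice_run_actual.txt"):
    print(line)  # every diff line is a potential defect to triage
```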
From ARCAD’s experience with some of the largest Legacy application estates, these challenges are becoming more and more complex to solve. The rest of this article explains both why these challenges can limit the speed of your DevOps process and how you can solve them using ARCAD solutions.
Why are Batch Testing, Database Level Testing and Defect Triage each a major challenge and potential bottleneck for your development process?
- There are many ways to solve these problems, but each involves manually coding a bespoke solution (e.g. for Batch Testing), with the inherent risks of “roll your own” software: both the specialist knowledge required to build your own Batch Testing solution and the business risk of its ongoing maintenance
- Debugging Database-level defects through the User Interface tends to burn massive development effort, with the triage of defects embedded in multiple components being one of the key limiters on achieving DevOps aspirations
- Message Queue style Spool file errors tend to cause some of the most serious defects in Legacy IBM i applications, which further exacerbates the problem.
The goal of the remainder of this article is to explain in detail both the options at your organisation’s disposal to resolve these problems, and the solutions available from ARCAD, proven by multiple software development teams typically working on a combination of IBM i legacy applications and front-end apps that leverage the business logic contained within the legacy core.
“Unpicking the Spaghetti of Batch processes”: Common Defects from Batch are easily diagnosable in the Spools and Database
The most common defects created by Batch programs tend to stem from simple mistakes in the configuration of Spool and Database commands. However easy it is to “fall foul” of these simple but potentially “showstopping” defects, the effort most teams expend trying to reproduce and fix them can massively limit a development team’s ability to make risk-free changes to applications where Batch, Spool and Database edits form a crucial part of the functionality.
Typical Testing Challenges with Spool, Database and Batch can be classified as follows:
- The requirement to hand-script a batch execution, and then to write complex database code verifying that, on completion of the batch processes, the correct tables have been updated accurately in the test database (see the sketch after this list).
- The need to confirm that application functionality remains the same when changes are made to code which interacts with Spool and Database components. Programmers need to be able to understand, debug, fix and re-test complex functionality that touches these components. A typical example is where a developer needs to check that a large table of data is read in its entirety: if the logic silently processed only a subset of the data, the result could be both functional defects and degraded application performance.
- The creation of an application-specific testing framework, with the inherent challenges of hand-coding error reporting and test data refresh capabilities, and the associated risk that the Test Harness / Framework becomes more complex than the application code itself.
- The creation of error-handling functionality which risks reporting too many additional errors (some of them originating in the Test Framework / Harness itself), creating even more work for developers: defect overload, and, worse, the risk that serious errors are dismissed as Test Harness artefacts and ignored.
- Finally, the developer spending a great deal of time creating end-to-end tests and the associated documentation, when a simple “recording” of how users actually use the application would provide a true and up-to-date view of the end-to-end journeys real users perform in production.
- Further, the need to produce Test Coverage metrics for compliance reasons, to show that all application functionality changed in the latest iteration has been fully tested and that 100% code-change coverage has been achieved. The ability to exploit the functionality of the Rational Developer for i (RDi) IDE will become more and more important as the DevOps cycle accelerates and more application versions are pushed into production.
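As an illustration of the first two points, a hand-scripted verification step often looks something like the minimal sketch below: after the batch run, query the test database, check that the whole table was processed and that key values are plausible, and report any mismatch. This is a sketch only, assuming a pyodbc/ODBC connection to the IBM i; the ORDER_TOTALS table and its columns are hypothetical. Notice how quickly such a harness needs error reporting and data refresh logic of its own, which is exactly the maintenance risk described above.

```python
# Minimal sketch of a hand-coded post-batch check. Assumes pyodbc and an
# IBM i ODBC driver are installed; table and column names are hypothetical.
import pyodbc

def verify_batch_results(dsn: str, expected_rows: int) -> bool:
    conn = pyodbc.connect(f"DSN={dsn}")
    try:
        cur = conn.cursor()
        # Check that the batch processed the table in its entirety,
        # not just a subset of the data.
        cur.execute("SELECT COUNT(*) FROM ORDER_TOTALS")
        actual_rows = cur.fetchone()[0]
        if actual_rows != expected_rows:
            print(f"FAIL: expected {expected_rows} rows, found {actual_rows}")
            return False
        # Spot-check that the update was applied accurately.
        cur.execute("SELECT COUNT(*) FROM ORDER_TOTALS WHERE TOTAL < 0")
        negatives = cur.fetchone()[0]
        if negatives:
            print(f"FAIL: {negatives} rows have an impossible negative total")
            return False
        return True
    finally:
        conn.close()
```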
The challenges above can all be handled seamlessly, and most importantly with zero programming effort, using the ARCAD-Verifier Test Automation solution, described below.
DevOps requires Build Verification Testing (BVT) / Smoke Testing
To ensure that a “testable” application is passed from development to the QA team in rapid iterations, many organisations create a suite of Build Verification Tests (BVT). This process is key to ensuring that defect tracking systems are not filled with “test environment unavailable” or “incorrect data in Test Environment” defects.
Smoke testing covers most of the major functions of the software but none of them in depth. The result of this test is used to decide whether to proceed with further testing. If the smoke test passes, go ahead with further testing. If it fails, halt further tests and ask for a new build with the required fixes. If an application is badly broken, detailed testing might be a waste of time and effort.
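In practice, the gate can be a short script executed after every build: run the broad-but-shallow smoke suite, and only promote the build to QA if it passes. The sketch below is illustrative only; the pytest-based smoke suite and its smoke/ directory are assumptions rather than a prescribed layout.

```python
# Minimal sketch of a BVT / smoke-test gate: run the smoke suite after a
# build and fail fast, so broken builds never reach the QA team.
import subprocess
import sys

def build_verification_gate() -> None:
    # "-x" stops at the first failure: a badly broken build fails fast.
    result = subprocess.run([sys.executable, "-m", "pytest", "smoke/", "-x"])
    if result.returncode != 0:
        # Halt further testing and request a new build with the fixes,
        # instead of raising "test environment" defects downstream.
        sys.exit("Smoke tests failed: build rejected, not promoted to QA")
    print("Smoke tests passed: build promoted to QA")

if __name__ == "__main__":
    build_verification_gate()
```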
The risk to any organisation of NOT having a BVT / Smoke Testing process is a large number of additional defects being created, each of which needs to be analysed, triaged and fixed, even when the defects are actually NOT software problems but rather build verification problems.
The effect of lacking Build Verification Testing is that a typical defect classification report tends to show many defects classified as “cannot reproduce error”, “test environment issue” or “test data issue”, slowing down development and distorting the ability of software development organisations and application stakeholders to make rational decisions about the severity of particular defects.
Problems Associated with Batch Processing
The biggest challenge facing IBM i developers is that a “batch” workload by definition interacts with and impacts many components / programs, and is often critical functionality that runs “headless”, i.e. in an unconstrained mode.
A batch job will always try to complete its workload within the shortest possible “low application usage” window to minimise the impact on application performance. The challenge this creates for IBM i developers is that a batch process will override any other workload running simultaneously, because it always prioritizes execution speed over resource balancing on the hardware it runs on. The only time a Batch process loses priority is when its execution is limited by a system bottleneck, normally the narrowest data-flow point in the application.
Because of the two problems detailed above, many developers schedule batch jobs out of hours, which can limit the functionality of the application while users wait for the “overnight batch jobs” to complete before performing their work.
Modern DevOps development teams on the IBM i have to constantly manage the following challenge: batch jobs need to execute quickly, both because of the mission criticality of their work and because other application functionality relies on their completion.
Batch testing: make sure that batch performance does not cause defects downstream through non-completion
Whilst many Batch applications on legacy systems were written many years ago, they often contain mission-critical functionality constituting a major part of your application’s value. For these reasons the continual enhancement of batch functionality presents a number of challenges:
- There is a need for a DevOps solution to continually maintain the batch processes and associated Test Data and Test Environments.
- Batch processes tend to require a deep understanding of the architecture of the entire application.
- Batch processes tend to run in parallel, meaning a failure in one batch process can result in exceptionally complex defect triage and fixes.
- Batch processes can cause disruption to the performance and functionality of an application if they run incorrectly.
For this reason, many organisations attempt to create a totally seamless process for Batch Process change/commit, and then to make these changes available in the Test LPAR. The effort involved in performing Batch testing can then become punitive, with the maintenance of the batch test process over-consuming development team resources and causing project delays.
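One pragmatic guard against the non-completion problem described above is for every downstream job to verify that its upstream batch actually finished before starting its own work. A minimal sketch of that idea follows, assuming the batch writes a completion marker file when it ends; the marker path, timeout and polling interval are hypothetical values for illustration.

```python
# Minimal sketch: a downstream job waits for an upstream batch completion
# marker rather than assuming the overnight run finished inside its window.
import time
from pathlib import Path

MARKER = Path("/batch/markers/nightly_run.done")  # hypothetical path

def wait_for_upstream(timeout_seconds: int = 3600, poll_seconds: int = 60) -> None:
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        if MARKER.exists():
            return  # upstream batch completed; safe to start downstream work
        time.sleep(poll_seconds)
    # Fail loudly rather than processing incomplete data downstream.
    raise TimeoutError("Upstream batch did not complete within its window")
```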
Why direct your testing effort at the Batch and Database levels?
- Defects in Batch and Database processing are the most difficult to find, the most costly, and the riskiest to release into live production without end-to-end testing
- Batch processes are by definition high-risk: typically the culmination of a number of transactions, each creating value and each with a high potential impact. Manual rework of Batch processes in production tends to be the most difficult and expensive kind of fix.
How severe can Batch defects become?
Because of the complexity of testing Batch processes and their specific data and test environment requirements, batch failures have led to some of the most publicized IT incidents ever witnessed. A particularly high-profile example was the prolonged downtime at RBS in 2015 – and a similar episode in 2012 – caused by batch process failures at the bank, impacting hundreds of thousands of customers.
Batch defects also tend to be neglected by many “end-to-end” test processes. Due to their mission criticality, their requirement to run and complete within a specific application usage window (normally when application usage is more limited), and their potential impact on the usability and performance of the application, certain defects in batch functions can “slip through the net” and reach production with disastrous consequences.
A particular concern with batch testing is performance. If a batch process runs daily at close of business and application response time suffers, then batch performance is a factor to consider when creating a risk-based testing process.
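A simple way to fold batch performance into a risk-based testing process is to time every test run of the batch against its allowed execution window and flag runs that come close to the limit, so a slowdown is caught in test rather than in production. The sketch below illustrates the idea; the 90% warning threshold and the run_batch callable are assumptions, not a prescribed interface.

```python
# Minimal sketch: time a batch run against its allowed execution window.
import time

def run_batch_within_window(run_batch, window_seconds: float) -> float:
    """Run the batch callable, fail if it overruns its window."""
    start = time.monotonic()
    run_batch()  # the actual batch submission / wait goes here
    elapsed = time.monotonic() - start
    if elapsed > window_seconds:
        raise RuntimeError(f"Batch overran its window by {elapsed - window_seconds:.0f}s")
    if elapsed > 0.9 * window_seconds:  # assumed warning threshold
        print(f"WARNING: batch used {elapsed / window_seconds:.0%} of its window")
    return elapsed
```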
Testing a multi-tier application with Dependencies between legacy IBM i and new “front-end” components: what are your options?
The table below details the options available, with the benefits and pitfalls of each, based on ARCAD customer feedback.