My last post described our team's experience implementing automated build verification tests. There's been a lot of talk and blog posting over the pros and cons of automated execution of test scripts during the past couple of weeks. This post in James Bach's blog and this post by Steve Rowe are indicative of where the discussions are heading.
I've found automated test execution to be a significant productivity booster for testers. It relieves testers from the drudgery of re-running tests that were already executed and found to pass.
The current project I'm working on is the first version of a business intelligence suite. Being the first version, there's quite a lot of GUI redesign, and this is a challenge for developing automated tests. However, we've been successful in getting the initial level of tests automated and executed daily for build verification.
We've been fortunate in our project to learn from the mistakes made by other project groups as far as the automation approach is concerned.
We've classified the tests to be automated under the following levels: Level-0, or build verification tests; Level-1, or happy path tests; Level-2, or error conditions; Level-3, or detailed user scenarios; and Level-4, advanced automation strategies.
Level-0 tests check the testability of the product. These are preferably executed on your daily build before it is released for testing. Level-0 tests are often the first to get automated. These tests are also known as smoke tests.
Level-1 tests cover the positive paths of the application, mostly focusing your test goals on the application features exercised from the user interface. For example, in a banking application: 'test for correctness of balance', 'check for transfer from one account to another', 'check for clearing of a check'. These tests check the correctness of your GUI and its integration with the overall system.
Level-2: These are automated tests for checking error conditions, or conditions that do not occur in the normal working of the application. In the banking application, this may be something like 'test for check bouncing', 'test for incorrect signature match', etc.
Level-3 tests are the detailed use cases. For example, create a scenario where a customer opens an account, makes a couple of deposits and withdrawals, and closes the account. A suite of these tests can also be used as user acceptance tests.
Level-4: These are tests that typically cannot be carried out without the aid of a tool. High Volume Test Automation is a good example of this type of test. For a banking application, think of simulating a typical year-end closing procedure or executing hundreds of concurrent accounting transactions.
This classification of tests needs to be looked at within the context of the project in which you are testing. The levels are not a prescription for the order in which the automation project should be implemented. You may very well automate tests from level-0 and level-2, skip tests in level-1 and level-3, and automate tests in level-4.
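As an illustration of how such a level classification can be wired into a test suite, here's a minimal Python sketch. The `level` decorator, the test names and the banking checks are all invented for this post; they are not our actual tests or tooling:

```python
# Sketch: tag tests with an automation level, then pick which levels to run.
# The level numbers mirror the classification above; everything else
# (function names, the banking checks) is illustrative.

TESTS = []  # registry of (level, test function) pairs

def level(n):
    """Decorator that registers a test function under automation level n."""
    def register(fn):
        TESTS.append((n, fn))
        return fn
    return register

@level(0)  # build verification / smoke test
def test_app_launches():
    return True  # stand-in for a real launch check

@level(1)  # happy path
def test_transfer_between_accounts():
    return True  # stand-in for a real transfer check

@level(2)  # error condition
def test_check_bounces_on_insufficient_funds():
    return True  # stand-in for a real error-path check

def run_levels(wanted):
    """Run only the tests whose level is in `wanted`; return names of passes."""
    return [fn.__name__ for lvl, fn in TESTS if lvl in wanted and fn()]
```

Cherry-picking levels, as described above, then becomes a matter of which set you pass to `run_levels` — for example `run_levels({0, 2})` runs the level-0 and level-2 tests while skipping level-1.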
Friday, January 11, 2008
Saturday, December 29, 2007
Lessons Learned Implementing DBVTE
"DBVTE" is an acronym I'd made up about 2 minutes ago when I decided to blog this. It stands for Daily Build Verification Test Executions.
The Background:
My current project has gained much notoriety among the testing team for the poor quality of builds delivered for testing. Early in the project iterations, the issue was that there was no testing build at all. Later on, after much delay, the daily build process finally got booted up. But then it came with its own set of issues. Lack of a coherent integration test ensured the daily builds never worked as a single unit. It'd be a miracle if the build picked on any particular day was of the minimum quality for testers to start their tests. The situation demanded a consistent way of delivering testable builds on a periodic basis.
The Approach:
The way to go here was for the functional testing team to take up the integration testing activity. We started with documenting the bare minimum use cases & scenarios of the product in sufficient detail that a novice computer user could read the document and execute those scenarios. These were written mostly from a GUI perspective, like "Step 7. Click OK button", "Step 11. Right-click, select Edit from menu", "Step 23. Verify graph". These are also called smoke tests or sanity tests.
The important activity here was to identify a dedicated machine for the integration testing. It's very important to have a consistent, reproducible and stable environment on the integration testing machine.
Now that the nightly builds were getting built, we developed a series of simple Windows batch scripts. The purpose of these scripts was to download the build files (Java jar files, code developed in SAS scripts, configuration files...) from the build machine. The scripts did administrative activities such as stopping and restarting the application services, as well as laying down the build files just as the installer would have done. The purpose of the scripts was to eliminate human error in the installation process and to save time in the install process. A typical install of the application we test requires 3 hours of install & configuration time. The automated install scripts do this in 10 minutes flat!
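A rough sketch of what such install scripts do, written here in Python rather than batch for readability. The share path, service names and file layout are invented for illustration; the real scripts also handled SAS code and other configuration files:

```python
# Sketch of automating the nightly install: stop services, lay down the build
# files as the installer would, restart services. Service names and the build
# share layout are hypothetical.
import shutil
import subprocess
from pathlib import Path

SERVICES = ["AppServer", "ReportEngine"]  # hypothetical service names

def stop_services():
    for svc in SERVICES:
        # check=False: ignore failures such as "service not started"
        subprocess.run(["net", "stop", svc], check=False)

def start_services():
    for svc in SERVICES:
        subprocess.run(["net", "start", svc], check=True)

def lay_down_build(build_share: Path, install_dir: Path) -> int:
    """Copy jars and config files where the installer would place them.
    Returns the number of files laid down."""
    lib = install_dir / "lib"
    lib.mkdir(parents=True, exist_ok=True)
    count = 0
    for src in build_share.glob("*.jar"):
        shutil.copy2(src, lib / src.name)
        count += 1
    for src in build_share.glob("*.conf"):
        shutil.copy2(src, install_dir / src.name)
        count += 1
    return count

def install_nightly_build(build_share: Path, install_dir: Path):
    stop_services()
    lay_down_build(build_share, install_dir)
    start_services()
```

Keeping the file-copy step separate from the service start/stop makes the error-prone part (the file layout) easy to test on its own, without touching real services.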
We'd chosen the "SAFS" as our automation tool for automating the smoke tests. Although its not mandatory to automate your smoke tests, we'd taken the approach to automate primarily to ease the drudgery of executing the tests by a human tester. SAFS is an excellent tool and it very neatly took care of incremental changes to the application. Moreover SAFS is developed and supported within our company.
The Process:
Now that we had the tools in place, it was time to work on the process around them. The most important part of any process is to get buy-in from the stakeholders. The good part was that the development and project managers were already aware of the problem with integration; they had just never given enough thought to how to tackle it. In our case, we had the tools in place for integration tests. A quick demo of the automated install scripts and the automated smoke test was very well accepted. Now it was time to present the case for the process and secure the development team's commitment to it.
The process was simple: the testing team will download daily builds from the build machine, execute the automated scripts and send the results of the executions to the entire team. Defects encountered in the automated scripts will be logged and will be part of the execution results. The development team, on its part, should fix the defects immediately or back out the changes in the next build.
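The daily results mail can be as simple as a plain-text summary. A minimal sketch, with invented test names and defect ids; in practice the statuses come from the automated smoke test run:

```python
# Sketch: turn a smoke-test run into the plain-text summary mailed to the team.
# Test names and defect ids below are illustrative.
def summarize_run(build_id, results, defects=()):
    """results: {test_name: 'PASS' or 'FAIL'}; defects: defect ids logged."""
    passed = sum(1 for status in results.values() if status == "PASS")
    lines = [f"Build verification results for build {build_id}",
             f"{passed}/{len(results)} smoke tests passed"]
    lines += [f"  {name}: {status}" for name, status in sorted(results.items())]
    if defects:
        lines.append("Defects logged: " + ", ".join(defects))
    return "\n".join(lines)
```

The same summary works as a mail body or as a line in a tracking sheet; what matters is that the whole team sees the pass/fail state of every daily build.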
The Lessons:
- Developing automation scripts for any GUI tests is hard. We were baffled by SAFS automation tool limitations, for which workarounds were found. Our GUI was developed on the Eclipse framework, and support for it was limited in SAFS.
- Development of automation scripts for a GUI is incremental. Develop scripts for smaller features, test, stabilize and then operationalize.
- Concentrate on the breadth of the functionality, not on the depth, while developing the smoke tests. Check all basic features of the application rather than the complex features.
- Automate the install and configuration process of daily builds. This will avoid human error in the installation as well as speed up the installation process.
- Start with a clean machine for installing the first build on the integration machine. In the case of Windows, format and reinstall the OS. On Unix, this process is expensive.
- Cherish your process. Follow your integration process religiously and advocate it within your entire team.
Sunday, December 16, 2007
Search Power @ Google Reader
This is one feature everyone using Google Reader has longed for from Google. It's come pretty late as a feature to Google Reader, but finally, yes, it's here, and it makes life easier when searching for blog posts.
I'd read this list, "How to Get Things Done - Colin Powell Version", and wanted to share it in my Shared Items. All it took me was a few seconds' search on Colin Powell, and there it was.
Sunday, November 11, 2007
What's your Top Five?
I've found the "Top Five" approach to defect disposition quite effective. Here's how it's done.
Q) What's the "Top Five" approach?
A) Simple: create a list of the top five defects you need urgent fixes for.
Q) Is it different from test holdup defects?
A) Well, to some degree it is and to some degree it is not. Holdup defects are the ones where your testing activity is held up. These need urgent fixes, so naturally the holdup defects should be part of your top five list. However, you may also have defects in the list that aren't holding up your testing.
Q) Why only five defects in the list?
A) From our experience we've found that anything more than five defects in the list becomes unmanageable for the development and test leads, and thereby the list loses its effectiveness. The idea is to have only as many defects in the list as the test and development leads can memorize.
Q) How do you manage this list?
A) You'll require solid development team backing for this approach. Speak with your development lead and manager before you embark on this step. Ask for a commitment from their team that the defects in the Top Five will get their utmost attention and will be dispositioned fast.
Q) Are there tools to manage the list?
A) Reports from your defect system should be sufficient. If your defect system supports tagging of defects, that is the best way of doing it: tag your Top Five defects and create a report. A simple Excel sheet or an email message with a predetermined format will also work.
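Assuming the defect system can export records as simple dicts (the field names here are invented for illustration), generating the report from tags might look like:

```python
# Sketch: build the Top Five report from tagged defect records, enforcing the
# cap of five. Record fields ('id', 'summary', 'tags') are hypothetical.
def top_five_report(defects):
    """Return one report line per defect tagged 'TopFive'."""
    tagged = [d for d in defects if "TopFive" in d.get("tags", [])]
    if len(tagged) > 5:
        raise ValueError("more than five tagged defects -- untag some first")
    return [f"{d['id']}: {d['summary']}" for d in tagged]
```

Raising an error when more than five defects are tagged keeps the list honest: the point of the approach is the cap, not the report.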
Q) What types of defects should not be in this list?
A) Defects that you know will take a long time to get dispositioned should not be in this list. We've defined 'a long time' as over two weeks. Since you know these defects will take longer, keeping them in the list means they take up a permanent slot in your Top Five.
Q) For what types of projects is the Top Five list effective?
A) Typically medium to large projects with a current open defect count of over 150. Smaller projects may not require the overhead of maintaining this list.
Q) When do you create a Top Five list?
A) We typically create a list of five or fewer defects every week (say, Monday) and communicate it to the development manager.
Q) What's the turnaround time for Top Five defects?
A) Typically one week, i.e. when you enter a defect in this list, it typically takes one week to get fixed.
Q) What's the follow-up process?
A) Typically we do a status check with the development manager every 3 to 4 days and discuss the list in our weekly meetings.
Wednesday, September 19, 2007
Looking at afterthoughts
With customer-reported defects in their inbox, testers have this sense of "if only we could have done this too", "why did we miss that?", "I thought we'd tested that". It's a mad scramble to find an explanation when things get ugly.

Often at release time, the opposite is true. The feeling is "I know I didn't get to that, but it's not important", "I know it's not tested to my satisfaction, but it's too late to hold up a release now; I should have raised this earlier", "the test data is not comprehensive; customer data may uncover more issues".

In hindsight, the emotive and tense aspects are overlooked. With the countdown to release time ticking, the challenge is to get the risks covered, and testers often overlook minor aspects of the product. Being at the lowest end of the "software food chain", it's the testing time that most of the other entities in the food chain eat up. Again and again, we hear this around our cubicles and meeting rooms: "it's taking more than expected to complete this feature, but it's critical". The routine response under these circumstances is "let's take a few more days", and the "few more days" come at the cost of testing time. Of course, a delay in the project is not affordable, since we can't miss the market!

In these days of agile & iterative development, a lot is being discussed about the importance of testing early and getting testers involved early in the development process. Easier said than done? Maybe, maybe not!
The projects I've worked on for the past few years have had a structure with a development manager looking after development activities, a test manager over the testing activities, and a project manager managing the project timelines and the external liaison required by the project. The project manager often holds the key to the success of the project, and the effective delivery of the project depends on the effectiveness of the project manager. Project managers do not have anyone reporting to them; their primary responsibility is to maintain the timelines, make sure everyone in the project adheres to them, look out for possible risks and escalate issues. Often, with prior experience in development teams, project managers unfortunately champion the cause of the development team, often at the expense of the testing team. The schedule prepared by the project manager puts undue emphasis on development delivery and ignores the fact that testing resources can be used optimally in an iterative development mode. The schedule prepared by the project manager turns out to be a waterfall method that leaves little time for the testing team.
Of course, there are project managers who consider themselves an independent authority and create the project plan with optimal utilization of all the resources in the project. Project managers stern on the timelines are the ones with successful project deliveries.
Sunday, September 9, 2007
Due share for Test Management?
Over the past several months I've been keeping myself updated with blog activity on software testing. The Yahoo group on software testing has also been a very interesting read.
Overall, the majority of the blogs discuss testing processes & techniques, what I'd term the "last mile" of software testing. Certainly this is the most important aspect of our software testing field.
Very few of the blogs I've been reading discuss the "management" aspects of software testing. Aspects such as "negotiation", "risk management", "analysis of what-if scenarios" and "work allocation" generally have very few posts in the blog world, at least in the ones I keep track of regularly.
In general, these areas come under project management, and several blogs discuss software project management. But there are areas where software development project management differs from the management of testing projects. For example, there is negotiation for time and budget on both development and testing projects; however, testing teams generally find themselves between a rock and a hard place in negotiation situations. Testing risks generally get better attention from higher management than risks from the development team. Development teams have proven estimation techniques; testing teams generally estimate from their prior experience.
Monday, July 30, 2007
Testing for performance
Ben Simo's post on performance testing is an interesting read. These lessons learned more or less map to my limited experience of testing applications for an aspect often overlooked by functional testers.
This subject is sub-categorized into various sections such as performance, scalability, load and volume testing. For the sake of simplicity, I'll refer to these here under the general term performance testing.
Performance is the last thing that appears in many of the requirements documents I've come across. One reason for this could be that requirements are created by business analysts or product managers with limited exposure to the software development process. For them, performance is not a requirement; it's "implied that the system works to the satisfaction of the end users". For the end user, a non-performing system is a defective system. While developing a "custom-made solution" for "a specific client", also known as a consulting implementation, performance requirements can be defined specifically, and these are part of the service level agreements. Here the number of users, data volume and hardware specifications are known before the system is developed. It's often difficult to specify performance requirements when it comes to the development of a "generic solution" that can be configured in complex ways on multiple operating platforms and implemented at clients ranging from big corporates to small and medium enterprises.

I've seen mixed reactions from developers on the performance requirement. Developers fall into many categories based on their interpretation of the "implicit" requirement. Some ignore the requirement and assume their design takes care of performance. Some make implicit assumptions about the performance requirement: "My assumption is that on a typical setup about 100,000 records get uploaded to the transaction_history table by the nightly process", "I feel there may be about 10 users requesting a customer record search". A few others ask the business analyst/product manager about the performance expectations; they may or may not get the answers they are looking for. With the inputs they get, they design the system to take care of the worst case scenario. Others realize their design may not scale to the performance expectations, but time pressure makes them attend to more important tasks.
There are of course developers who go to great lengths to ensure their code meets any performance goals.
Testers have their own view of performance. Unless there is a mandate for performance to be tested, testers keep it low in their priorities. The testers asked to test for performance are functional testers with limited experience in testing for performance. In iterative development cycles, performance can be tested only after the last iteration is delivered for testing, which leaves very little time to test for performance. Creation of sample data for performance testing is also a difficult proposition. Functional testers create their own data for a specific scenario, or may use a 'limited' set of client data that makes it easier to detect functional data integrity issues.
Performance testing needs to be considered a specialized area of testing and not mixed up with functional testing. This again depends on the budget of the project. 'Low end' performance tests can be done by testers huddled in a single room, banging away at the system with the intent that something may go wrong. This type of testing, though very labor intensive, may detect many concurrency issues. Unless there's development support for this type of effort, the defects detected as a result are marked as non-reproducible.
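A toy sketch of that kind of concurrent "banging away": many threads hitting the same operation at once. The in-memory account here is just a stand-in for the application under test:

```python
# Sketch: concurrent hammering of one operation. The Account class is a toy;
# without its lock, a race condition would corrupt the balance -- which is
# exactly the kind of defect this style of testing is meant to surface.
import threading
from concurrent.futures import ThreadPoolExecutor

class Account:
    def __init__(self, balance=0):
        self.balance = balance
        self._lock = threading.Lock()

    def deposit(self, amount):
        with self._lock:  # remove this lock to watch the concurrency defect appear
            self.balance += amount

def hammer(account, workers=20, deposits_per_worker=100):
    """Fire `workers` threads, each making `deposits_per_worker` deposits of 1."""
    def worker():
        for _ in range(deposits_per_worker):
            account.deposit(1)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(workers):
            pool.submit(worker)
    return account.balance
```

With the lock in place, `hammer(Account())` always ends at 20 × 100 = 2000; the value of the exercise is that a missing lock in the system under test makes such a run come up short, reproducibly enough to convince development.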
A cost conscious project can choose from a wide range of open source tools available for performance testing.