Saturday, January 30, 2010
Key Elements of Automated Tests
That raises the question: What makes up a good test? What key characteristics should it have?
Setup, Execute, Validate
The first key ingredient is that a test consists of three parts: The first part sets up the data that is needed for the test. This could mean restoring particular database content, creating a few objects in your programming language, launching a particular user interface (e.g. a browser), and much more. The second part is the actual execution of the test: you invoke the functionality that modifies the data. In the final, third part you validate whether you got the expected outcome, e.g. whether the actual data is equal to the expected data.
Occasionally I've found, though, that people forget about the third step. I don't have data, but I suspect this happens when people come from a background where executing a piece of code without crashing already counts almost as a success. Think early/mid 90s of the last century. C and C++ were still very dominant in the PC industry. Crashes in the midst of running a program were nothing out of the ordinary. (Maybe you are the only one who never experienced them?) However, we can do better. Just because it doesn't crash with a nasty null pointer exception doesn't mean it performs as expected. Therefore, at the end of a test, always validate the outcome! The typical tools for that are the various assertions that come as part of test frameworks.
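To make the three parts concrete, here is a minimal sketch in Java using JUnit; the Project class and its methods are hypothetical stand-ins for whatever your system provides:

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class RenameProjectTest {

        @Test
        public void renamingAProjectChangesItsName() {
            // 1. Setup: create the data the test needs
            Project project = new Project("OldName");

            // 2. Execute: invoke the functionality that modifies the data
            project.rename("NewName");

            // 3. Validate: assert the actual outcome equals the expected one
            assertEquals("NewName", project.getName());
        }
    }

Leave out the assertion in step 3 and the test passes as long as rename() doesn't throw, which is exactly the trap described above.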
Repeatable
Not strictly a requirement, but there are quite a few scenarios where running the same test more than once reveals, and thereafter prevents, certain bootstrapping-type issues. Assume your service implementation does some sort of housekeeping upon startup. The first time you invoke an operation on the service everything is still fine. But then, as you repeat the same test (or set of tests) using operations on that service, things go off track. Maybe connections are not properly closed. Maybe the service cannot handle more than 10 open connections at a time (rightly or wrongly). By repeating the same test over and over again the chances increase that you discover a hidden issue and resolve it before your product is shipped.
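A crude but effective way to probe for such issues is to wrap the scenario in a loop. The sketch below assumes a hypothetical ServiceClient whose connections are supposed to be released after each call:

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class RepeatedInvocationTest {

        @Test
        public void survivesRepeatedInvocations() {
            // Run the same scenario many times to flush out leaks
            // that a single run would never reveal.
            for (int i = 0; i < 100; i++) {
                ServiceClient client = new ServiceClient();
                client.invokeOperation();
                client.close();
            }
            // Hypothetical bookkeeping check: no connection stayed open.
            assertEquals(0, ServiceClient.openConnectionCount());
        }
    }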
Random Order
Tests should not depend on each other. A test should not require a different test to run first. If they do, changes to one test may trigger further changes to other tests in the suite, thus making changes more expensive and time consuming. You don't want to lose time. You want to be fast.
For example, let's assume you are working on a system that has Project as a concept, and the name of a project becomes its unique identifier. If all tests use the same project name, then each test would have to check during setup whether the project already exists, and create it if it doesn't. The alternative is to use a generated name in each test, such as a string with the value "ProjectName" + RandomNumberAsString(). That way you make the tests independent from each other.
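In Java such a helper could look like the following sketch; the RandomNumberAsString() from the text is approximated here with a random UUID, which avoids collisions even across machines:

    import java.util.UUID;

    public final class TestNames {

        private TestNames() {}

        // Returns a project name that is unique per invocation,
        // so no two tests ever compete for the same project.
        public static String uniqueProjectName() {
            return "ProjectName" + UUID.randomUUID().toString().replace("-", "");
        }
    }

Each test then calls TestNames.uniqueProjectName() in its setup and never has to care what other tests have created.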
A corollary to this is that you can run each test in isolation, meaning you can run just that test, focusing on the specific task at hand. You don't have to run other tests first and you don't have to remember the sequence for those other tests. You can - and probably want to - run the entire suite anyway once you have finished the coding part of your task.
Fast Equals Cheap
Why do tests need to be fast? To an engineer time is money. The longer a test or test suite needs to execute, the less likely it is that people will run the suite or portions of it. If a suite takes 1 minute to complete you will probably run it before each commit. If a suite takes 5 hours you won't. So keep tests fast: work with the smallest possible dataset, and avoid or mock costly operations like talking to remote systems, databases, filesystems, anything that requires mechanical parts to move. Use in-memory databases (e.g. SQLite) instead of client-server systems such as Microsoft SQL Server or Oracle.
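As an illustration, here is a sketch of a test against an in-memory SQLite database. It assumes the sqlite-jdbc driver is on the classpath; the table layout is of course made up:

    import static org.junit.Assert.assertEquals;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;
    import org.junit.Test;

    public class InMemoryDatabaseTest {

        @Test
        public void storesAndReadsAProject() throws Exception {
            // ":memory:" keeps the whole database in RAM; nothing
            // mechanical has to move, so the test stays fast.
            try (Connection conn = DriverManager.getConnection("jdbc:sqlite::memory:")) {
                Statement stmt = conn.createStatement();
                stmt.executeUpdate("CREATE TABLE projects (name TEXT)");
                stmt.executeUpdate("INSERT INTO projects VALUES ('ProjectName42')");

                ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM projects");
                rs.next();
                assertEquals(1, rs.getInt(1));
            }
        }
    }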
You may also want to continuously refactor your tests. Keep them lean and mean. Split the test suites into those you definitely want to run each time before you commit and those that are more expensive in terms of duration. Platform tests or scalability tests fall into the latter category.
Automated And Integrated
Tests have value only when they are executed. As long as they are just sitting in your version control system they are useless. Make them work. Make them work hard. Use all those machines that sit idle while the developers using them during daytime are at home enjoying life. Integrate the execution of your automated test suites into your development process. When tests and their execution are automated and integrated, they are executed more frequently, ideally each time a change to the code base is committed.
Are Automated Tests Orphans In Your Team?
Automated tests are first class citizens, equally valuable as the product that you ship. Don't even think for a second that they are just an unavoidable side effect. It's not a tax report that you file because the law says so. Instead, fully integrated automated testing is the mechanism that allows your team to operate at full speed. Depending on your team size, maintaining the testing infrastructure can well turn into a full-time job for a motivated engineer.
Treat your testing infrastructure at least as well as the other parts of your development infrastructure. Make all of it part of an end-to-end integrated development process that starts with a customer suggestion and ends with a deployed system operational for that very same customer. Automated testing is a key ingredient, and the rules are simple to follow. There is no excuse any more for not improving software quality. Happy testing!
Saturday, May 03, 2008
Re: Musings on Software Testing
While I share a lot of the concerns that Wes mentions, and have also seen a few of them materialize in practice, I still get the sense that something is not quite right in his post.
Reading a book on a subject that heavily depends on practical experience doesn't really give you the full experience. I'm sure he'd agree with this.
Overall the post comes across as a mostly theoretical discussion with little practical background, at least on the commercial scale or long-term application of TDD. This surprises me a bit since Wes - at least in 2005 - was a developer on Microsoft's C# compiler team.
I would love to know more about the background and context, e.g. empirical data, practical experience from commercial projects, etc.
To make a specific point: He mentions that the testing ideal is to minimize the cost of bugs. That is certainly a good objective. For TDD, however, there are additional aspects that are important, e.g. trying to find a simpler implementation of the code through refactoring, which only becomes feasible because of the comprehensive test suite that TDD creates in the first place.
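As a small, made-up illustration of that point: the test below pins down the behavior of a hypothetical Invoice class, so its total() method can later be rewritten, say from a hand-rolled loop to something simpler, while the test keeps proving the behavior is unchanged:

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class InvoiceTest {

        @Test
        public void totalIsTheSumOfAllLineItems() {
            Invoice invoice = new Invoice();
            invoice.add(new LineItem("book", 30));
            invoice.add(new LineItem("pen", 5));

            // This assertion holds before and after any refactoring
            // of Invoice.total(), which is what makes the refactoring
            // safe in the first place.
            assertEquals(35, invoice.total());
        }
    }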
I also think that the first diagram in Wes' post is not quite accurate. For instance while refactoring you also run tests to see whether or not your refactoring broke any tests. So you'd go from step 5 to step 2 or 4.
Looking at TDD in isolation doesn't tell the whole story either, in my experience. TDD makes the most sense and provides the most value when it is one element in a system of interdependent and interrelated elements that comprise an agile development approach. So, for instance, there are interdependencies with refactoring, pair programming, and others. The techniques of XP are not just a laundry list of best practices. They support and strengthen each other.
I have been using XP (including TDD) in various projects of different sizes since 1999 when I was introduced to TDD/XP by Kent Beck. I am currently managing a 40+ people commercial software product project for an international audience. One of the key elements is TDD. The results that my teams have produced in this time have been by far superior to anything I have seen developed with a more "traditional" approach (this is certainly limited to the projects I have sufficient information about).
Bottom line: While I like Wes' post very much, since it highlights a number of good points and concerns, at the same time it seems to lack quite some credibility because little empirical information is provided to support at least some of his statements. The post reads, to quite some degree, like a theoretical opinion lacking sufficient practical background. Again, surprising given his (past) role at Microsoft.
One of my past managers liked to put it this way: Without numbers you are just a person with another opinion.
But, hey, maybe that's exactly what his post is: An opinion. And in that sense: Yes, I like it.
Monday, February 04, 2008
Software Quality and Tools
23.8 million hits is a large number. Apparently there is no shortage of information and tools for improving software quality. Why is it, then, that we still have significant bugs in many software products (recent examples: here, here, and here)? Why are many customers still dissatisfied with the quality of software (indicators are mentioned here, here, and here)? Well, sometimes customers might be satisfied only because they have such low expectations, which in turn were caused by poor quality in the past.
So, how to address this problem? I don't know the complete solution. I do know, though, that the longest journey starts with the first step. And here are two suggestions for what those first couple of steps might be for your organization. Introduce the following two rules:
- Continuous automated build. Make sure your software system, component, etc. is built multiple times a day. Automatically. With feedback to all developers working on the code base. And if the build doesn't pass (it's broken), lock down the source control system until the build passes again. This might be quite disruptive at the beginning. But imagine what behavioral changes this rule might cause. For one, you will notice that your people start to be much more adamant about the quality of their change sets. Who wants to be the reason for a broken build? And then it also triggers continuous thinking about how to improve the toolset, the code, the tests, etc. so that it becomes easier for every engineer to produce quality code.
- One automated test per bug. Let's assume you are not happy with the number of bugs in your software. What if for every single bug your engineers fix, (at least) one automated test has to be added to the automated test suite? What if that automated test reproduces the bug, and passes once the bug is fixed?
This rule makes most sense if each automated test is added to a test suite that is run as part of the automated build (see rule 1).
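As a sketch of what such a test might look like, assume a hypothetical bug report #4711 saying that parseAmount("1,000") dropped the thousands separator; the class, method, and bug number are made up for illustration:

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class Bug4711RegressionTest {

        // Reproduces bug #4711: "1,000" was parsed as 1 instead of 1000.
        // This test failed before the fix and passes afterwards, and it
        // now runs with every automated build to keep the bug from
        // coming back.
        @Test
        public void parsesAmountsWithThousandsSeparator() {
            assertEquals(1000, AmountParser.parseAmount("1,000"));
        }
    }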
With the above rules you start building a comprehensive set of automated tests. Some may say, we have always been doing that. That might be correct, but my experience tells me that some organizations simply run the suite of tests only when they test the entire system, after the "development phase" is complete and just before the new version is shipped.
Also, in some cases people claim that rule 2 cannot be used with legacy systems because writing such tests is too expensive. Again, that might be correct. If so, it would be an additional indicator of the (lack of) design quality: the system is too hard to test. Enforcing rule 2 will help drive a refactoring or redesign of the system towards better testability. And lacking organizational discipline (or time), bugs are sometimes simply fixed without writing an automated test and then shipped after a quick manual assessment. This is far from ideal.
By adding one automated test at a time to your automated build - now including an automated test portion - your system will increase in quality over time. A single test won't make a difference. But as your test suite increases in size it will cover more and more of your system.
And here is another observation: When bugs are reported, which areas do they tend to be in? Typically you'll find them in frequently used areas or in areas that are particularly flaky. By adding automated tests in these areas you target exactly those spots where you get the biggest bang for the buck.
Note that the two rules can be used for both new development and legacy code bases. There is no excuse for not even trying to improve the quality. It doesn't require myriads of expensive and complex tools. Simple rules like the above can help improve organizational behavior towards better (software) quality.
Saturday, August 05, 2006
Customers Writing Executable Specifications
Ideally, requirements would be written in a way that makes it easy for the development team to verify whether the system satisfies them. So why not make requirements executable? That way your team members can run them as often as needed, and they can stop immediately once the system passes those tests.
The way to go forward is therefore executable specifications, or story tests. The most prominent thought leader on this is probably Rick Mugridge, on whose web site you can also find further information.
One tool to create and maintain such customer tests is FitNesse. It is available for many different languages.
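To give a flavor of what this looks like, here is the classic division example: the customer writes a table like the one below in the FitNesse wiki, and a developer backs it with a small fixture (this sketch assumes the Java edition of Fit with its ColumnFixture base class):

    |eg.Division|
    |numerator|denominator|quotient?|
    |10|2|5|
    |12.6|3|4.2|

    package eg;

    import fit.ColumnFixture;

    // Each row of the wiki table sets the public fields and then
    // compares quotient() against the expected value in the
    // "quotient?" column.
    public class Division extends ColumnFixture {
        public double numerator;
        public double denominator;

        public double quotient() {
            return numerator / denominator;
        }
    }

The customer edits the table; the fixture stays a thin bridge into the real system.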
The benefits are very compelling. You have a much tighter link between your customer (might also be a product manager) and your development team. Executable specifications are a means for improving communication. You also get a tool that reduces the gap between what your customer wants the system to do, and what the system really does.
From my experience, it's worth playing with this concept. It might turn out to be an excellent addition to your toolbox for agile project management.