Tuesday, March 16, 2010

Improving Estimates

An important capability of an agile team - in fact of any software development team - is estimating. Certainly you can break down any body of work into tiny bits, then estimate each tiny bit and total up the numbers to get an overall estimate of the entire body of work. But how small should you break it down?

When you use stories - or you may call them tasks or something similar - you can use a story as the unit that needs to be estimated. Initially this will be hard on the team. For example, how would you estimate the size of a story if the estimators have different roles such as developer, business analyst, tester, user interface expert, or performance engineer? How can a developer assess the amount of effort required by the user interface expert or the tester?

In reality they don't need to. With the initial set of stories all you need to do is agree on some relative sizing. These sizes will in all likelihood be completely off anyway. That fact of life should make this first step more relaxed. After the iteration is complete you look at how much you completed. Let's say you use NUTS as the unit for relative size. (NUTS = nebulous units of time; credit goes to Darren Rowley, from whom I learned this one.) Then you can look at the completed stories at the end of the iteration and see whether the initial estimate was correct. Was story 'xyz' really twice the size of story 'abc'? It doesn't have to be scientifically perfect. All that matters is that you give it your best shot and record the actuals.

By recording the initial estimates and the actuals you are already on your way to improving your estimates. Please keep in mind that generally estimates are provided by a cross-functional team rather than by individuals. And ideally the estimates are provided by the team that will do the work eventually.

By default you should not sign up for stories that don't fit into an iteration. If they are too large, break them into smaller pieces. At times, however, it can happen that a story is incomplete at the end of one iteration and carries over into the next. One example could be that some capacity was left over towards the end of the iteration and work on an additional story was started.

If a story is incomplete at the end of the iteration (for whatever reason!) then the team should assess whether the size of the story is still good or whether it needs to be updated (either way!). If the estimate is changed then you should record the updated estimate as well. Why? The only reason to record the updated estimate is to allow for proper capacity planning in the new iteration. You need to know the updated/current estimate and how much is left, so that the team doesn't over-commit but signs up for only as much work as they think they can accomplish.

So in effect you are recording three numbers: the initial estimate, the updated estimate (a history of this is not required), and the actual figure once the story is complete. Comparing the initial estimates with the actuals allows you to measure how - as a team - you are becoming better at estimating. The updated estimate is important for understanding how much work your team signed up for in a particular iteration.

If you like, you can use a simple spreadsheet for recording these numbers. Make sure you add some dates for further analysis, e.g. how did the quality of the estimates in quarter one compare to quarter two? A team is getting good at estimating when, over a mix of about 10 to 20 stories, the delta between the total of the initial estimates and the total of the actuals is less than 10%. I have worked with teams that got this figure to less than 5%, which is excellent. Always keep in mind that we are talking about estimates - we are trying to look into the future - and not about predictions.
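The bookkeeping behind this is trivial. As a sketch, here is how that delta could be computed; the story data and the estimate_delta helper are made up for illustration, and the sizes are in NUTS:

```python
def estimate_delta(stories):
    """Percentage delta between the total of the initial estimates
    and the total of the actuals for a batch of completed stories."""
    total_initial = sum(s["initial"] for s in stories)
    total_actual = sum(s["actual"] for s in stories)
    return abs(total_actual - total_initial) / total_initial * 100

# Each completed story records the three numbers from the text:
completed = [
    {"name": "abc", "initial": 3, "updated": 3, "actual": 4},
    {"name": "xyz", "initial": 6, "updated": 8, "actual": 7},
    {"name": "pqr", "initial": 2, "updated": 2, "actual": 2},
]

print(f"delta: {estimate_delta(completed):.1f}%")  # prints: delta: 18.2%
```

With a real batch of 10 to 20 stories you would watch this figure trend below 10% over time.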

A word of caution: Recording these numbers is important as a tool for the team. Stay away from the temptation of using it to measure the individual performance of a team member. Even if unintentional, as soon as any team member even just perceives it as a performance control mechanism, tracking the estimates and actuals is dead. The team must be able to update estimates without fear, whether that fear is induced deliberately or accidentally.

Working with this tool - recording the estimates and actuals - and experimenting with it can make it extremely powerful. It doesn't increase the capacity of the team, but it definitely will lead to much improved estimates and to better predictability of the team's deliverables. And that in turn will lead to higher customer satisfaction. The spreadsheet I mentioned should not drive the team meeting. Instead it should just reflect the outcome for future reference. The team builds a body of completed stories that can serve as reference points for new stories for which a relative size is required.

Therefore: Experiment with this tool until it works for you. If it doesn't feel right, chances are it is not working yet! Be courageous, try something, don't be disappointed if it doesn't work, try something else, then improve.

Good luck and have fun!

Friday, February 12, 2010

What to test in a web application?

Sometimes I'm asked what to test, in particular when I explain that testing the happy day scenarios is not sufficient. For example, in web applications I'd certainly expect that everything that has a link to a different page actually brings you to that other page.

Another example would be any kind of control for entering data, e.g. text boxes, radio buttons, drop-down lists, check boxes, and more. Let's take a text box for a product number. The valid range might be a positive number with exactly 6 digits. In that case you would also want to test whether you can enter fewer or more digits. The system should have a defined behavior. Then try entering spaces, e.g. 2 digits, then a space, then 3 more digits. Test whether you can enter nothing. Test what happens if you enter a mix of digits and characters. I'm sure you can think of more depending on the system you are working on.
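To make this concrete, here is a sketch of those edge cases as automated checks. The is_valid_product_number function and its exact rules are assumptions made up for this example:

```python
def is_valid_product_number(text):
    """A product number is valid if it is exactly 6 digits."""
    return len(text) == 6 and text.isdigit()

# Happy path plus the edge cases from the text:
assert is_valid_product_number("123456")        # exactly 6 digits
assert not is_valid_product_number("12345")     # fewer digits
assert not is_valid_product_number("1234567")   # more digits
assert not is_valid_product_number("12 345")    # embedded space
assert not is_valid_product_number("")          # nothing entered
assert not is_valid_product_number("12a456")    # mix of digits and letters
```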

A less obvious case is routes. A route allows you to enter a link that the system can interpret and translate in a specific way into a URL. Routes allow certain items to be bookmarked. For example, you may want to support a URL such as "http://nirvana.org/Product/246337/View" (of course your domain name would be different). The concept here is that you have the domain class name first ("Product"), the specific instance id next ("246337"), and the method ("View") last. In essence the route is then "http://nirvana.org/Product/{productId}/View". Depending on the technology you use to implement this route, somewhere you will have to extract the product id and create a URL to the page that can handle this request.

The point I want to make is this: a route like this needs to be treated like a method. In essence it is similar to a method, and hence there are quite a few test cases. Some examples of tests you should consider:
  • No product id: "http://nirvana.org/Product//View"
  • Non-numeric product id: "http://nirvana.org/Product/foo/View"
  • Negative product id: "http://nirvana.org/Product/-123456/View"
  • Product id too short: "http://nirvana.org/Product/12345/View"
  • Product id too long: "http://nirvana.org/Product/1234567/View"
And this is just a selection for this very basic example. I'm sure you can think of more tests. The point is, things like this are sometimes easily overlooked, and as a result your system may contain defects that you are not aware of. In the case of a web application it means that if you allow people to bookmark certain pages, be aware that people not only can but will enter invalid URLs! Be prepared!
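The tests above can be sketched against a hypothetical route parser; the regular expression and the extract_product_id helper are illustrative assumptions, not any particular framework's API:

```python
import re

# Route "/Product/{productId}/View" with a 6-digit product id.
ROUTE = re.compile(r"^/Product/(\d{6})/View$")

def extract_product_id(path):
    """Return the 6-digit product id, or None if the path is invalid."""
    match = ROUTE.match(path)
    return match.group(1) if match else None

assert extract_product_id("/Product/246337/View") == "246337"  # valid
assert extract_product_id("/Product//View") is None            # no product id
assert extract_product_id("/Product/foo/View") is None         # non-numeric
assert extract_product_id("/Product/-123456/View") is None     # negative
assert extract_product_id("/Product/12345/View") is None       # too short
assert extract_product_id("/Product/1234567/View") is None     # too long
```

Treating the route as a function like this makes each of the bullet points above a one-line test case.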

Saturday, January 30, 2010

Key Elements of Automated Tests

This time I'm writing about an item that is admittedly very specific to software development. More than once, when I spoke to members of a development team, I was told "yes, we have an automated test suite". And yet, further along in the conversation it turned out that despite a significant test suite the resulting quality wasn't where all the effort put into creating those tests indicated it should be. And in all these cases, when we then took a closer look at the tests themselves, it turned out that at least one key element was missing.

That begs the question: What makes up a good test? What key characteristics should a good test have?

Setup, Execute, Validate

The first key ingredient is that a test consists of three parts: The first part sets up the data that is needed for the test. This could be restoring particular database content, setting up a few objects in your programming language, launching a particular user interface (e.g. a browser), and many more. The second part is the actual execution of the test. This means you invoke functionality that modifies data. In the final third part you validate whether you have the expected outcome, e.g. the actual data is equal to the expected data.
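A minimal sketch of the three parts in Python, using a made-up Account class as the system under test:

```python
class Account:
    """Illustrative domain object for the example."""
    def __init__(self, balance=0):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount

def test_deposit_increases_balance():
    # 1. Setup: create the data the test needs.
    account = Account(balance=100)
    # 2. Execute: invoke the functionality that modifies the data.
    account.deposit(50)
    # 3. Validate: assert the actual outcome equals the expected one.
    assert account.balance == 150

test_deposit_increases_balance()
```

Tests missing the third part would happily pass as long as deposit() doesn't crash, no matter what it does to the balance.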

Occasionally I've found, though, that people forget about the third step. I don't have data, but I suspect that this happens when people come from a background where executing a piece of code without crashing is almost a success. Think early to mid-90s of the last century. C and C++ were still very dominant in the PC industry. Exceptions in the midst of running a program were nothing completely out of the ordinary. (Maybe you are the only one who never experienced them?) However, we can do better. Just because it doesn't crash with a nasty null pointer exception doesn't mean it performs as expected. Therefore, at the end of a test always validate the outcome! The typical tools for that are the various assertions that come as part of test tools.


Repeatability

Not strictly a requirement, but there are quite a few scenarios where running the same test more than once reveals - and thereafter prevents - certain bootstrapping-type issues. Assume your service implementation does some sort of housekeeping upon startup. The first time you invoke an operation on the service everything is still fine. But then, perhaps, as you repeat the same test (or set of tests) using operations on that service, things go off track. Maybe connections are not properly closed. Maybe the service cannot handle more than 10 open connections at a time (rightly or wrongly). By repeating the same test over and over again, chances increase that you discover a hidden issue and resolve it before your product is shipped.

Random Order

Tests should not depend on each other. A test should not require a different test to run first. If they do, changes to one test may trigger further changes to other tests in the suite, thus making changes more expensive and time-consuming. You don't want to lose time. You want to be fast.

For example, let's assume you are working on a system that has Project as a concept, and the name of a project becomes a unique identifier for each project. If all tests use the same project name, then each test would have to check during setup whether the project already exists and, if it doesn't, create it. The alternative is to use a generated name in each test, such as a string with the value "ProjectName" + RandomNumberAsString(). That way you make the tests independent of each other.
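A sketch of that generated-name idea; the unique_project_name helper is illustrative:

```python
import random

def unique_project_name(prefix="ProjectName"):
    """Append a random number so each test works on its own project."""
    return prefix + str(random.randint(0, 10**9))

# Two tests now set up two distinct projects instead of sharing one:
name_a = unique_project_name()
name_b = unique_project_name()
```

If guaranteed uniqueness matters more than readability, a UUID suffix works the same way.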

A corollary to this is that you can run each test in isolation, meaning you can run just that test, focusing on the specific task at hand. You don't have to run other tests first, and you don't have to remember the sequence for those other tests. You can - and probably want to - run the entire suite anyway once you have finished the coding part of your task.

Fast Equals Cheap

Why do tests need to be fast? To an engineer time is money. The longer a test or test suite needs to execute, the less likely it is that people will run the suite or portions of it. If a suite takes 1 minute to complete you will probably run it before each commit. If a suite takes 5 hours you won't. So keep tests fast: for example, work with the smallest possible dataset, and avoid or mock costly operations like talking to remote systems, databases, filesystems - anything that requires mechanical parts to move. Use in-memory databases (e.g. SQLite) instead of client-server systems such as Microsoft SQL Server or Oracle.
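As a sketch of the in-memory database suggestion, Python's standard sqlite3 module can create a throwaway database per test; the products schema here is made up:

```python
import sqlite3

def make_test_db():
    """Fresh in-memory database: fast, isolated, gone when closed."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
    return conn

conn = make_test_db()
conn.execute("INSERT INTO products (id, name) VALUES (?, ?)",
             (246337, "widget"))
row = conn.execute("SELECT name FROM products WHERE id = ?",
                   (246337,)).fetchone()
assert row[0] == "widget"
conn.close()
```

Every test starts from a known-empty database with no server round-trips, which also helps with the independence and repeatability points above.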

You may also want to continuously refactor your tests. Keep them lean and mean. Split the test suites into those you definitely want to run each time before you commit and those that are more expensive in terms of duration. Platform tests or scalability tests fall into the latter category.

Automated And Integrated

Tests have value only when they are executed. As long as they are just sitting in your version control system they are useless. Make them work. Make them work hard. Use all those machines that sit idle while the developers using them during the daytime are at home enjoying life. Integrate the execution of your automated test suites into your development processes. When tests and their execution are automated and integrated, they are executed more frequently - ideally each time a change to the code base is committed.

Are Automated Tests Orphans In Your Team?

Automated tests are first-class citizens and just as valuable as the product that you ship. Don't even think for a second that they are just an unavoidable side effect. It's not a tax report that you do because the law says so. Instead, fully integrated automated testing is the mechanism that allows your team to operate at full speed. Depending on your team size, maintaining the testing infrastructure can well turn into a full-time job for a motivated engineer.

Treat your testing infrastructure at least as well as the other parts of your development infrastructure. Make all of it part of an end-to-end integrated development process that starts with a customer suggestion and ends with a deployed system operational for the very same customer. Automated testing is a key ingredient and the rules are simple to follow. No excuse any more for not improving software quality. Happy testing!