Usability testing – what needs fixing?
Posted April 18th, 2012 by David Hamill
Usability testing is too often used inappropriately, in my opinion. Compared to other research techniques, it’s fairly cheap and easy to organise. But this leads to it being used to answer research questions it often can’t answer. In this post I’ll discuss how I think the UX community is losing touch with some of the basics.
When testing a site with small numbers of participants I believe you should focus on one simple question: What needs fixing?
Testing with small numbers
In most usability studies it’s possible to recruit just a handful of participants and still get useful results. This is because the aim of usability testing is to uncover the usability issues in a design. If you were to test with lots of participants, you’d find most of the issues that you were going to find within the first 5 or 6 participants who come through the door.
The cost of adding more participants increases fairly uniformly while the benefits of adding them tail off quickly. So to get the best bang for your buck, many studies use just a handful of participants.
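The tailing-off can be sketched with a simple model. This is only an illustration: it assumes every participant has the same, independent chance of running into any given issue, and the 0.31 figure is an assumed value for that chance, not something measured in this post.

```python
def share_of_issues_found(participants: int, detection_rate: float = 0.31) -> float:
    """Expected share of issues seen at least once.

    Assumes each participant independently hits any given issue with
    probability `detection_rate` (an illustrative assumption).
    """
    return 1 - (1 - detection_rate) ** participants


previous = 0.0
for n in (1, 2, 3, 5, 8, 12):
    found = share_of_issues_found(n)
    # The gain from each extra batch of participants keeps shrinking.
    print(f"{n:>2} participants: {found:.0%} found (+{found - previous:.0%})")
    previous = found
```

Under these assumptions the curve flattens after the first handful of participants, while each extra session costs about the same as the one before it.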
This thinking is very well established but I don’t think it’s very well understood in the usability/UX community and I believe this lack of understanding is getting worse.
Using small numbers of participants is acceptable because we’re hunting for usability issues. When you try to extend the purpose of your test it’s not possible to rely on such a small number of participants.
Eye tracking data
Once the magic number of 5 has been arrived at (or 6 or 8 or whatever) many people seem to conveniently forget the reasoning behind using a small sample. This can often happen with eye tracking studies. You can’t credibly quote any quantitative data from your eye tracking study with just a handful of participants.
In general, if you were to use eye tracking in a study with 50 people and then randomly split the data from those participants into 10 groups of 5, it’s likely you’d see contrasts in the data between these sets. You’d also be likely to see that no one set of data was very comparable with the total dataset. An example of this is available in PowerPoint on the Real Eyes website.
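You can try the 50-into-10-groups exercise yourself with a few lines of code. The sketch below uses made-up data (20 of 50 hypothetical participants looked at a banner) and is only meant to show the mechanics, not real eye tracking results.

```python
import random


def split_into_groups(observations, group_size, seed=0):
    """Shuffle the observations and cut them into groups of `group_size`."""
    shuffled = list(observations)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i:i + group_size] for i in range(0, len(shuffled), group_size)]


def fixation_rate(group):
    """Share of participants in the group who looked at the element."""
    return sum(group) / len(group)


# Hypothetical data: 1 = looked at the banner, 0 = did not.
participants = [1] * 20 + [0] * 30
overall = fixation_rate(participants)
groups = split_into_groups(participants, group_size=5)
rates = [fixation_rate(group) for group in groups]

print(f"All 50 participants: {overall:.0%}")
print("Groups of 5:", ", ".join(f"{rate:.0%}" for rate in rates))
```

With 5 people per group, a single participant moves the group’s figure by 20 percentage points, which is why the small sets tend to disagree with each other. Change the seed to see different splits.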
If you want to draw any conclusions from eye tracking data then you’ll need to significantly increase the number of participants in the study. In nearly every usability study I’ve carried out I have concluded that eye tracking either was or would have been too much hassle to be worth including. This is not to say that I’m all down on eye tracking. I’m not; I just think it is (or at least was) over-used.
If you’re mostly interested in where people are looking on your design then carry out eye tracking with lots of participants. If you’re mostly interested in finding elements of your design that need to be fixed then carry out usability testing with a handful of participants. Eye tracking with this handful of participants can improve the experience for any observers present but it also increases the likelihood of encountering technical problems.
Perhaps it’s because of the popularity of agile development, but the practice of using usability testing to test propositions rather than designs seems to be on the increase.
Companies attempting to create the next big thing in social media are often tempted to put their design into usability testing with the key aim of finding out what people think of it. Again, if we go back to the reason we’re using just a handful of participants, we can see that this practice often won’t provide very accurate results. Just because 5 people hate your proposition doesn’t mean it stinks.
Many websites are popular because they target the specific needs of very niche audiences. I once met a chap whose website taught people how to fish in World of Warcraft. It would be easy enough for me to find 5 people who played this game and carry out usability testing on his site. If I were to use this study to test the proposition of the site I might find that most of them thought it pointless. This is the opinion of just 5 people. If I were then to return to him and tell him the proposition of his site was poor how would that explain the many hundreds of thousands of unique visitors the site gets each month?
Usability testing can throw up findings regarding usefulness, but opinions about it will vary even among your target audience. So usability testing isn’t a great way to establish the strength of your proposition without significantly increasing your recruitment effort.
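To put a rough number on this, here is a small sketch. It assumes, purely for illustration, that 20% of the people you could recruit would genuinely value the site, and that your participants are picked independently of one another. Neither figure comes from a real study.

```python
from math import comb


def chance_majority_unimpressed(sample_size: int, share_who_value_it: float) -> float:
    """Probability that more than half of the sample do NOT value the proposition.

    Treats each participant as an independent draw (a simplifying assumption).
    """
    p = share_who_value_it
    return sum(
        comb(sample_size, fans) * p ** fans * (1 - p) ** (sample_size - fans)
        for fans in range(sample_size + 1)
        if (sample_size - fans) > sample_size / 2
    )


# Hypothetical: 1 in 5 people in the recruitment pool would value the site.
print(f"{chance_majority_unimpressed(5, 0.20):.0%}")
```

Under those assumptions, most of your 5 participants would think the site pointless in the large majority of studies, even though a fifth of a big audience is plenty to build a popular website on.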
What needs fixing on this website?
When I first meet clients they often want to use testing to find the answers to numerous, often complicated, questions about their designs. They tend to focus on new features, things they think are wrong with the design, or things they worry don’t work. I try to convince them to approach usability testing with one question: What needs fixing? This is the one question that usability testing answers most effectively.
It’s possible to answer other very specific questions but you tend to have to sacrifice the ability to gather other findings in order to get the answer. So you go from answering the “What needs fixing?” question to answering a question that sounds a bit like this…
“When someone has this specific need in this specific circumstance, can they understand how to use the feature we’ve created to meet that need?”
Answering this question will reduce the number of unexpected things you find that you weren’t necessarily looking for. That’s not to say that usability testing can’t answer these specific questions. But unless you’re doing regular testing of your designs then it’s probably not the best use of your resources to approach testing in this way.
Testing for measurement
Usability testing can help you to assess the usability of key tasks on your website. But if the purpose of the test is to provide a measurement then 5 participants is not enough. Remember that 5 participants is a figure that has been arrived at because of the number of issues it throws up and not because of any correlation with the success rates of larger numbers.
If you want your testing to provide a “How easy is this task?” measurement then you need to increase your numbers. It also has implications for how you facilitate the session.
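To see why 5 participants can’t give you a measurement, look at the margin of error around a task success rate. The sketch below uses the Wilson score interval, which is one common choice for small samples and is my choice for this example. The 4-out-of-5 and 80-out-of-100 results are invented.

```python
import math


def wilson_interval(successes: int, participants: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for a success rate (z = 1.96)."""
    if participants <= 0:
        raise ValueError("participants must be positive")
    p = successes / participants
    z2 = z * z
    denominator = 1 + z2 / participants
    centre = (p + z2 / (2 * participants)) / denominator
    half_width = (
        z * math.sqrt(p * (1 - p) / participants + z2 / (4 * participants ** 2))
        / denominator
    )
    return centre - half_width, centre + half_width


# Hypothetical results: the same 80% observed success rate at two sample sizes.
for successes, participants in ((4, 5), (80, 100)):
    low, high = wilson_interval(successes, participants)
    print(f"{successes}/{participants} succeeded: plausibly {low:.0%} to {high:.0%}")
```

With 4 out of 5 the plausible range runs from roughly 38% to 96%, which tells you very little about how easy the task is. With 80 out of 100 the range narrows to something you could reasonably report.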
By explaining to my clients that we’re answering the “What needs fixing?” question I try to convince them to set aside any “How easy is this task?” measurement. This is because the measurement doesn’t actually help you make any design decisions; it’s just interesting to know.
Of course, not all clients can simply set this question aside, often for internal political reasons.
What d’you think?
Am I talking rubbish or do you agree? We can talk about it if you leave a comment.