A review of IntuitionHQ
Posted September 4th, 2012 by David Hamill
I received a nice little email from the makers of IntuitionHQ the other day asking if I’d be interested in trying out their new usability testing software. I’ve had a play with it and written this post to share my impressions of it with you.
What does it do?
IntuitionHQ allows you to upload images of your designs and then test them online by giving participants tasks to complete. The participant clicks the design to show where they would click to progress with each task. It’s possible to test two designs side-by-side with an A/B test option.
It calls itself usability testing, which it kind of is, but it isn’t really a replacement for one-on-one usability testing. With a few improvements I can definitely see a use for it. But we’ll get to that in a bit.
An example task
I created 4 tasks, all of them with A/B options. In each case the A design was the original and the B design was a version with a slight alteration I’d made. Instead of going into detail about all of them, I’ll illustrate with the first task.
The image below shows the product page of the John Lewis website. They have included a “Have you thought about” option that I worried would add a bit of confusion. The Add to basket button is more strongly associated with the toaster than it is with the kettle. I didn’t think many people would click the wrong place but I did worry it might delay things a bit.

If you reduce unnecessary friction in a process you tend to increase conversions on sites with lots of traffic, so these details can be important. Below you can see the B option I made. As you can see the toaster isn’t on it. To be clear, I’m not trying to test which design would be most profitable. I just want to see if the results between the two differ and if so, to what extent.

The results
You can see the results of my test (hopefully this link won’t die) for yourself. The image below shows the visual representation of where people have clicked. If you select the enlarged view of a heatmap you get some numbers and percentages. This approach is a little clunky for an A/B comparison and I feel a better solution should be possible. Ideally you’d want to compare the numbers between the two designs by looking at one screen.

The task was attempted by 39 people on the original design and 41 people on the revised design. On the original design 82% of participants clicked the Add to basket button and the average time taken on the task was 15.5 seconds. On the revised design with the cross-sell removed (pictured above), 93% of participants clicked the Add to basket button and the average time taken was 12.46 seconds.
Limitations
Remote online usability testing has its limitations regardless of the software you use to carry it out. The most significant is that there will always be some participants who misunderstand the task. The more complex the task, the more likely it is to be misunderstood. You can’t see them misunderstanding it so you often can’t identify them. Of course you have the benefit of cheaply using increased numbers in order to smooth out the impact of this.
In this test I wasn’t trying to find out whether people knew to click the Add to basket button. Instead I was trying to assess how much friction the cross-sell for the toaster would introduce to a fairly simple action. This is the type of thing I’d use an online solution for: very simple tasks that allow me to test a theory.
I don’t think I’ve been able to answer the research questions I had using IntuitionHQ but with a few changes this could be resolved.
In this task I wanted to know if there was a significant difference in the time taken to click the Add to basket button because of the presence of a cross-sell option. At first glance it looks as though there is.
The average time taken on the task is 24% longer on the original design. But in order to get an accurate figure I want to identify and remove any outliers. I also want to be able to compare the average times while both including and excluding participants who didn’t click the right button.
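As a quick check of that figure, the arithmetic can be sketched as below. The two averages are the ones reported in the results; the function name `percent_longer` is just an illustrative choice.

```python
# Average task times in seconds, as reported in the results above.
original_mean = 15.5   # original design (with the cross-sell)
revised_mean = 12.46   # revised design (cross-sell removed)


def percent_longer(slower, faster):
    """Return how much longer `slower` is than `faster`, as a percentage."""
    return (slower - faster) / faster * 100


difference = percent_longer(original_mean, revised_mean)
print(f"Original design took {difference:.1f}% longer on average")
```

This only compares two averages. It says nothing about how the individual times are spread, which is why the outliers matter.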
It’s perfectly possible that one participant can influence the average time by leaving the task open for a minute while being distracted by something external (like the phone ringing). In fact on one of the other tasks I saw evidence of this while the results were coming in.
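To illustrate how much a single distracted participant can move an average, here is a minimal sketch. The times are invented for the example and are not data from the IntuitionHQ test, since individual times weren't available.

```python
from statistics import mean, median

# Hypothetical task times in seconds (made up for illustration only).
typical_times = [10, 11, 12, 12, 13, 14, 12, 11, 13, 12]

# One participant leaves the task open while answering the phone.
distracted_time = 75
with_distraction = typical_times + [distracted_time]

mean_before = mean(typical_times)
mean_after = mean(with_distraction)
median_before = median(typical_times)
median_after = median(with_distraction)

print(f"mean:   {mean_before:.2f} -> {mean_after:.2f}")
print(f"median: {median_before:.2f} -> {median_after:.2f}")
```

In this made-up sample the mean jumps by several seconds while the median doesn't move, which is one reason a median (or a mean with outliers removed) can be a safer summary for timing data.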
Conclusion
I like the A/B split feature as I’ve had to hack other tests to do this in the past. But at present the timing data IntuitionHQ provides is too basic to be useful. I could hack the effects of this a bit by running several identical tests side-by-side to highlight abnormal results I suppose.
I asked the company by email if they could give me the individual times for the results but unfortunately they were unable to do so. I was told that the software excludes answers that take longer than 2 minutes. When you’re dealing in the odd second here or there, 2 minutes is a hefty amount of time. That’s not good enough for me.
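One common alternative to a fixed cutoff is to define outliers relative to the other answers, for example with Tukey's interquartile range rule. The sketch below assumes you have the individual times, which IntuitionHQ did not provide; the sample data, the function names and the conventional multiplier of 1.5 are all assumptions for illustration, not anything the product does.

```python
from statistics import quantiles


def tukey_fences(times, k=1.5):
    """Return (low, high) limits based on the interquartile range."""
    q1, _, q3 = quantiles(times, n=4)  # quartile cut points
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def split_outliers(times, k=1.5):
    """Split times into those inside the fences and those outside."""
    low, high = tukey_fences(times, k)
    kept = [t for t in times if low <= t <= high]
    dropped = [t for t in times if t < low or t > high]
    return kept, dropped


# Hypothetical task times in seconds (made up for illustration only).
sample_times = [10, 11, 12, 12, 13, 14, 12, 11, 13, 12, 75]
kept, dropped = split_outliers(sample_times)
print("fences:", tukey_fences(sample_times))
print("kept:", kept)
print("dropped:", dropped)
```

Because the limits come from the spread of the answers themselves, the same rule adapts to a task that normally takes 5 seconds and one that normally takes 30.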
If it were possible to isolate only the results you were interested in and also exclude outliers this could be useful software. But without this it’s impossible to have confidence in the results. I have no doubt that some people will love it.
As it stands I can only see it being useful to see if people will click in the right place. For some people that will be enough though.
What do you think?
Do you agree with me or am I being too harsh? Perhaps you have some experience with this product you’d like to share. Leave a comment and we can get the discussion going.






4 Responses to “A review of IntuitionHQ”
September 4th, 2012 at 9:41 pm
Hi David,
Thanks for giving IntuitionHQ a go. It’s a pretty easy fix to change our current outlier exclusions from 2 mins down to (say) 20 or 30 secs if you think that would solve your issue with outliers.
John, IntuitionHQ
September 4th, 2012 at 10:08 pm
Hi John, thanks for the comment. It’d need to identify an outlier relative to the other answers. For some tasks 20-30 seconds might be a perfectly reasonable time, in which case excluding those answers wouldn’t be appropriate. Now this is where my knowledge of statistics will let me down. But if the outlier exclusion could be dictated by a formula that takes into account the other answers then that would be ideal. Otherwise a feature where you can adjust the maximum time and see what it does to the average would at least allow users to make a judgement on it.
September 5th, 2012 at 9:31 am
I’ve not used IntuitionHQ in the past, but I’ve used a similar system called Verify (verifyapp.com). These kinds of tests are incredibly cheap to run and really fit the current zeitgeist of running lean. Although I don’t think they’re a replacement for traditional, in-person usability tests, they’re clearly a quick and simple way to answer very specific questions about a design. Anyway, here’s my question. Can you chain together (say) 5 screens that represent the 5 screens the user might see on a task? And can you distinguish the people who made every right choice on all 5 screens from people who (say) chose the wrong option on screen 3?
September 5th, 2012 at 11:48 am
Hi David, thanks for the comment. The answers to your questions are yes and no.