Some people think that usability is very costly and complex and that user tests should be reserved for the rare web design project with a huge budget and a lavish time schedule. Not true. Elaborate usability tests are a waste of resources. The best results come from testing no more than 5 users and running as many small tests as you can afford.

In earlier research, Tom Landauer and I showed that the number of usability problems found in a usability test with n users is:

$$N \left(1 - (1 - L)^{n}\right)$$

where N is the total number of usability problems in the design and L is the proportion of usability problems discovered while testing a single user. The typical value of L is 31%, averaged across a large number of projects we studied. Plotting the curve for L = 31% gives the following result:

Increase in proportion of usability problems found by number of test users
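
As a rough illustration, the curve can be tabulated directly from the formula. This is a minimal sketch: the function names `proportion_found` and `problems_found` are made up for this example, and the default L = 0.31 is the typical value quoted above, not a constant that holds for every project.

```python
def proportion_found(n: int, L: float = 0.31) -> float:
    """Share of all usability problems found after testing n users: 1 - (1 - L)^n."""
    return 1.0 - (1.0 - L) ** n


def problems_found(n: int, N: int, L: float = 0.31) -> float:
    """Expected number of problems found, given N problems in the design."""
    return N * proportion_found(n, L)


if __name__ == "__main__":
    # Tabulate the curve for 0 to 15 test users.
    for n in range(16):
        print(f"{n:2d} users: {proportion_found(n):6.1%}")
```

With L = 31%, 5 users come out at roughly 84% of the problems, close to the 85% figure used later in the article.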

The most striking truth of the curve is that zero users give zero insights.

As soon as you collect data from a single test user, your insights shoot up and you have already learned almost a third of all there is to know about the usability of the design. The difference between zero and even a little bit of data is astounding.

When you test the second user, you will discover that this person does some of the same things as the first user, so there is some overlap in what you learn. People are definitely different, so there will also be something new that the second user does that you did not observe with the first user. So the second user adds some amount of new insight, but not nearly as much as the first user did.

The third user will do many things that you already observed with the first user or with the second user and even some things that you have already seen twice. Plus, of course, the third user will generate a small amount of new data, even if not as much as the first and the second user did.

As you add more and more users, you learn less and less because you will keep seeing the same things again and again. There is no real need to keep observing the same thing multiple times, and you will be very motivated to go back to the drawing board and redesign the site to eliminate the usability problems.

After the fifth user, you are wasting your time by observing the same findings repeatedly but not learning much new.
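
The diminishing returns described above can be made concrete by computing what each additional user contributes. A minimal sketch, assuming the same model with L = 31%; `marginal_gain` is a hypothetical helper name, and the expression L(1 - L)^(n-1) is simply the difference between the formula evaluated at n and at n - 1.

```python
def marginal_gain(n: int, L: float = 0.31) -> float:
    """New share of problems contributed by the n-th user: L * (1 - L)^(n - 1)."""
    return L * (1.0 - L) ** (n - 1)


if __name__ == "__main__":
    # Each user adds 69% of what the previous user added.
    for n in range(1, 9):
        print(f"user {n}: +{marginal_gain(n):.1%} of all problems")
```

Under these assumptions the first user contributes 31% of the problems, the fifth about 7%, and the sixth less than 5%.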

Iterative Design

The curve clearly shows that you need to test with at least 15 users to discover all the usability problems in the design. So why do I recommend testing with a much smaller number of users?
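
To see where a figure like 15 users comes from, the formula can be inverted to ask how many users are needed to reach a target share of problems. This is a sketch only: `users_needed` is an illustrative name, and since the model never reaches exactly 100%, "all" has to be read as a target close to 100% (99.5% in the example below is an arbitrary choice).

```python
import math


def users_needed(target: float, L: float = 0.31) -> int:
    """Smallest n with 1 - (1 - L)^n >= target, for 0 < target < 1."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - L))


if __name__ == "__main__":
    for target in (0.84, 0.99, 0.995):
        print(f"{target:.1%} of problems: {users_needed(target)} users")
```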

The main reason is that it is better to distribute your budget for user testing across many small tests instead of blowing everything on a single, elaborate study. Let us say that you do have the funding to recruit 15 representative customers and have them test your design. Great. Spend this budget on 3 studies with 5 users each!

You want to run multiple tests because the real goal of usability engineering is to improve the design and not just to document its weaknesses. After the first study with 5 participants has found 85% of the usability problems, you will want to fix these problems in a redesign.

After creating the new design, you need to test again. Even though I said that the redesign should "fix" the problems found in the first study, the truth is that you think that the new design overcomes the problems. But since nobody can design the perfect user interface, there is no guarantee that the new design does in fact fix the problems. A second test will discover whether the fixes worked or whether they didn't. Also, in introducing a new design, there is always the risk of introducing a new usability problem, even if the old one did get fixed.

Also, the second study with 5 users will discover most of the remaining 15% of the original usability problems that were not found in the first round of testing. (There will still be 2% of the original problems left; they will have to wait until the third study to be identified.)
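
The 15% and 2% figures can be checked against the formula. The arithmetic below is a sketch under one assumption not spelled out in the text: an original problem that survives a round is found in the next round at the same rate L = 31% per user, so the share still undiscovered after n users in total is (1 - L)^n.

```latex
\begin{aligned}
\text{after study 1 } (n = 5):\quad  & (1 - L)^{5}  = 0.69^{5}  \approx 0.156 \\
\text{after study 2 } (n = 10):\quad & (1 - L)^{10} = 0.69^{10} \approx 0.024 \\
\text{after study 3 } (n = 15):\quad & (1 - L)^{15} = 0.69^{15} \approx 0.004
\end{aligned}
```

So roughly 15% of the original problems remain after the first study and roughly 2% after the second, in line with the numbers quoted above.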

Finally, the second study will be able to probe deeper into the usability of the fundamental structure of the site, assessing issues like information architecture, task flow, and match with user needs. These important issues are often obscured in initial studies where the users are stumped by stupid surface-level usability problems that prevent them from really digging into the site.

So the second study will both serve as quality assurance of the outcome of the first study and help provide deeper insights as well. The second study will always lead to a new (but smaller) list of usability problems to fix in a redesign. And the same insight applies to this redesign: not all the fixes will work; some deeper issues will be uncovered after cleaning up the interface. Thus, a third study is needed as well.

The ultimate user experience is improved much more by 3 studies with 5 users each than by a single monster study with 15 users.

Why Not Test With a Single User?

You might think that 15 studies with a single user would be even better than 3 studies with 5 users. The curve does show that we learn much more from the first user than from any subsequent users, so why keep going? Two reasons:

  • There is always a risk of being misled by the spurious behavior of a single person who may perform certain actions by accident or in an unrepresentative manner. Even 3 users are enough to get an idea of the diversity in user behavior and insight into what's unique and what can be generalized.
  • The cost-benefit analysis of user testing provides the optimal ratio around 3 or 5 users, depending on the style of testing. There is always a fixed initial cost associated with planning and running a study: it is better to depreciate this start-up cost across the findings from multiple users.
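
The second point can be illustrated with a toy calculation. This is not the cost-benefit analysis referred to above: the cost figures are invented, in arbitrary units, and "benefit" is simplified to the share of problems found. The sketch only shows how a fixed start-up cost pushes the best ratio away from a single user.

```python
def proportion_found(n: int, L: float = 0.31) -> float:
    """Share of all usability problems found after testing n users."""
    return 1.0 - (1.0 - L) ** n


def benefit_cost_ratio(n: int, fixed_cost: float, cost_per_user: float,
                       L: float = 0.31) -> float:
    """Share of problems found per unit of cost (hypothetical cost model)."""
    return proportion_found(n, L) / (fixed_cost + cost_per_user * n)


def best_study_size(fixed_cost: float, cost_per_user: float,
                    L: float = 0.31, max_users: int = 15) -> int:
    """Number of users (1..max_users) that maximizes the benefit-to-cost ratio."""
    return max(range(1, max_users + 1),
               key=lambda n: benefit_cost_ratio(n, fixed_cost, cost_per_user, L))


if __name__ == "__main__":
    # Made-up costs: a lightweight study vs. one with more planning overhead.
    print(best_study_size(fixed_cost=2, cost_per_user=1))
    print(best_study_size(fixed_cost=8, cost_per_user=1))
```

With these invented numbers the ratio peaks at 3 users when the fixed cost is twice the per-user cost and at 5 users when it is eight times the per-user cost; real studies will have different costs and a richer notion of benefit.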

When To Test More Users

You need to test additional users when a website has several highly distinct groups of users. The formula only holds for comparable users who will be using the site in fairly similar ways.

If, for example, you have a site that will be used by both children and parents, then the two groups of users will have sufficiently different behavior that it becomes necessary to test with people from both groups. The same would be true for a system aimed at connecting purchasing agents with sales staff.

Even when the groups of users are very different, there will still be great similarities between the observations from the two groups. All the users are human, after all. Also, many of the usability problems are related to the fundamental way people interact with the Web and the influence from other sites on user behavior.

In testing multiple groups of disparate users, you don't need to include as many members of each group as you would in a single test of a single group of users. The overlap between observations will ensure a better outcome from testing a smaller number of people in each group. I recommend:

  • 3–4 users from each category if testing 2 groups of users
  • 3 users from each category if testing 3 or more groups of users (you always want at least 3 users to ensure that you have covered the diversity of behavior within the group)

Reference

Nielsen, Jakob, and Landauer, Thomas K.: "A mathematical model of the finding of usability problems," Proceedings of ACM INTERCHI'93 Conference (Amsterdam, The Netherlands, 24-29 April 1993), pp. 206-213.

Follow-Up Articles

  • Newer analysis of the problem discussed in this article: How Many Test Users in a Usability Study?
  • Quantitative Studies (usability metrics): Test 20 Users
  • Card Sorting: Test 15 Users