

Some people think that if you have a small sample size you can’t use statistics. Put simply, this is wrong, but it’s a common misconception. There are appropriate statistical methods to deal with small sample sizes.

Although one researcher’s “small” is another’s large, when I refer to small sample sizes I mean studies that typically have between 5 and 30 users total, a size very common in usability studies.

But user research isn’t the only field that deals with small sample sizes. Studies involving fMRIs, which cost a lot to operate, have limited sample sizes, as do studies using laboratory animals.

While there are equations that allow us to properly handle small “n” studies, it’s important to know that there are limitations to these smaller sample studies: you are limited to seeing big differences or big “effects.”

To put it another way, statistical analysis with small samples is like making astronomical observations with binoculars. You are limited to seeing big things: planets, stars, moons and the occasional comet. But just because you don’t have access to a high-powered telescope doesn’t mean you cannot conduct astronomy. Galileo, in fact, discovered Jupiter’s moons with a telescope with the same power as many of today’s binoculars.

The same holds for statistics: just because you don’t have a large sample size doesn’t mean you cannot use statistics. Again, the key limitation is that you are limited to detecting large differences between designs or measures.

Fortunately, in user-experience research we are often most concerned about these big differences: differences users are likely to notice, such as changes in the navigation structure or the improvement of a search results page.

Here are the procedures which we’ve tested for common, small-sample user research, and we will cover them all at the UX Boot Camp in Denver next month.

If you need to compare completion rates, task times, and rating scale data for two independent groups, there are two procedures you can use for small and large sample sizes. The right one depends on the type of data you have: continuous or discrete-binary.

Comparing Means: If your data is generally continuous (not binary), such as task time or rating scales, use the two sample t-test. It’s been shown to be accurate for small sample sizes.

Comparing Two Proportions: If your data is binary (pass/fail, yes/no), then use the N-1 Two Proportion Test. This is a variation on the better known Chi-Square test (it is algebraically equivalent to the N-1 Chi-Square test). When expected cell counts fall below one, the Fisher Exact Test tends to perform better. The online calculator handles this for you and we discuss the procedure in Chapter 5 of Quantifying the User Experience.

When you want to know what the plausible range is for the user population from a sample of data, you’ll want to generate a confidence interval. While the confidence interval width will be rather wide (usually 20 to 30 percentage points), the upper or lower boundary of the intervals can be very helpful in establishing how often something will occur in the total user population.

For example, if you wanted to know if users would read a sheet that said “Read this first” when installing a printer, and six out of eight users didn’t read the sheet in an installation study, you’d know that at least 40% of all users would likely skip the sheet, a substantial proportion.
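As a rough illustration, two of the small-sample procedures named in this article, the N-1 Two Proportion Test and a confidence interval for a proportion, can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions, not the article’s calculator: the function names are hypothetical, the interval uses the adjusted-Wald method (an assumption, since the article does not name a method, though it does reproduce the roughly 40% lower bound for six out of eight users), and the 9-of-10 versus 4-of-10 completion data is made up for the example.

```python
from math import sqrt
from statistics import NormalDist


def n_minus_1_two_proportion_test(x1, n1, x2, n2):
    """Return (z, two-sided p-value) for two independent binary samples.

    x1, x2 are success counts; n1, n2 are sample sizes.
    """
    total = n1 + n2
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / total
    # Standard error under the null hypothesis of equal proportions.
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        # Everyone passed or everyone failed: no observable difference.
        return 0.0, 1.0
    # The sqrt((N - 1) / N) factor is the "N-1" adjustment to the usual z-test.
    z = (p1 - p2) * sqrt((total - 1) / total) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def adjusted_wald_interval(x, n, confidence=0.95):
    """Return (low, high) for a proportion using the adjusted-Wald method.

    Assumption: the article does not say which interval method it uses.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_adj = n + z ** 2
    p_adj = (x + z ** 2 / 2) / n_adj
    margin = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clamp to the valid range for a proportion.
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


# The article's printer example: 6 of 8 users did not read the sheet.
low, high = adjusted_wald_interval(6, 8)
print(f"6 of 8 skipped the sheet: 95% CI {low:.0%} to {high:.0%}")

# Hypothetical completion rates for two designs (made-up data).
z, p = n_minus_1_two_proportion_test(9, 10, 4, 10)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

Note how wide the interval is with only eight users; the useful part is the lower bound. For continuous data such as task times, a two sample t-test needs the t distribution, which the Python standard library does not provide, so a statistics package would be the more practical route there.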
