Research Spotlight

Too Big to Fail

Professor Galit Shmueli and her co-authors recently had their paper, “Too Big to Fail: Large Samples and the P-Value Problem,” accepted for publication by the top-tier journal, Information Systems Research.

Professor Shmueli presents a summary of the paper.

The Internet has provided researchers with the opportunity to conduct studies with extremely large samples, frequently well over 10,000 observations. There are many advantages to using large samples, but researchers using classic statistical inference must understand the p-value problem associated with them.

What is a p-value?

Every basic course in statistics introduces the “p-value” concept. The p-value is the probability of seeing an effect at least as large as the one observed in a data sample if no such effect exists in the population; it therefore gauges the risk of a false discovery when testing an effect on a sample rather than on the entire population. An effect that has a low p-value is said to be “statistically significant.” Researchers thus rely on p-values to imply statistical significance and in turn reach conclusions regarding scientific discoveries that are generalizable to the entire population.

P-values and statistical significance are functions of the sample size. A small sample might not have sufficient power to detect an effect, while with a larger sample the same effect is statistically significant. For example, an average 10-kg weight difference between two diets might be statistically insignificant (high p-value) if each diet group consists of five people, but statistically significant (low p-value) if each group consists of 500 people.
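The diet example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a two-sample z-test with a known common standard deviation, and the 15-kg standard deviation and the function name two_sample_z_pvalue are hypothetical choices made for this sketch, not values or methods taken from the paper.

```python
import math

def two_sample_z_pvalue(mean_diff, sd, n_per_group):
    """Two-sided p-value for a difference in means between two equal-size groups.

    Assumes a known, common standard deviation (a z-test), which keeps the
    sketch dependency-free; a t-test would suit very small groups better.
    """
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = mean_diff / standard_error
    # P(|Z| >= |z|) for a standard normal variable
    return math.erfc(abs(z) / math.sqrt(2.0))

ASSUMED_SD_KG = 15.0  # hypothetical spread of body weights, not from the paper

p_small = two_sample_z_pvalue(10.0, ASSUMED_SD_KG, 5)    # five people per diet
p_large = two_sample_z_pvalue(10.0, ASSUMED_SD_KG, 500)  # 500 people per diet

print(f"n=5 per group:   p = {p_small:.3f}")
print(f"n=500 per group: p = {p_large:.2e}")
```

Under these assumptions the same 10-kg difference gives a p-value above 0.05 with five people per group and one far below 0.001 with 500 per group.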

Awareness and mitigation of the p-value problem in large samples

The challenge arises in very large samples, where p-values quickly approach zero even for extremely weak effects. With a sufficiently large sample, an average 100-gram weight difference between the two diet groups will have a near-zero p-value. Although the difference is statistically significant, a 100-gram difference is clearly of no practical importance. Hence, relying solely on p-values can lead the researcher to claim support for results of no practical significance.
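A rough sketch of this behavior, again assuming a two-sample z-test with a hypothetical 15-kg standard deviation (neither comes from the paper), holds the 100-gram difference fixed and lets only the sample size grow. The helper name summarize is invented for this illustration. Printing a 95% confidence interval and a standardized effect next to the p-value is one common way to keep the size of the effect in view; it is shown here as an example, not as the paper's prescribed procedure.

```python
import math

ASSUMED_SD_KG = 15.0  # hypothetical spread of body weights, not from the paper
DIFF_KG = 0.1         # the 100-gram difference from the text

def summarize(n_per_group, diff=DIFF_KG, sd=ASSUMED_SD_KG):
    """Return (p_value, (ci_low, ci_high)) for a two-sample z-test."""
    se = sd * math.sqrt(2.0 / n_per_group)
    p_value = math.erfc(abs(diff / se) / math.sqrt(2.0))
    ci = (diff - 1.96 * se, diff + 1.96 * se)  # approximate 95% interval
    return p_value, ci

sample_sizes = (100, 10_000, 1_000_000, 10_000_000)
results = {n: summarize(n) for n in sample_sizes}

for n, (p_value, (low, high)) in results.items():
    print(f"n={n:>10,} per group: p = {p_value:.2e}, "
          f"95% CI = ({low:+.3f}, {high:+.3f}) kg")

# The standardized effect does not depend on the sample size at all.
standardized_effect = DIFF_KG / ASSUMED_SD_KG
print(f"standardized effect = {standardized_effect:.4f}")
```

With these assumed numbers the p-value is well above 0.05 at 100 and 10,000 people per group and drops below 0.001 at one million per group, while the estimated difference stays at 0.1 kg and the standardized effect stays tiny throughout.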

In a survey of large-sample research in the field of Information Systems, we found that a substantial number of papers rely on a low p-value and the sign of a regression coefficient alone to support their hypotheses. This research commentary recommends a series of actions the researcher can take to mitigate the p-value problem in large samples and illustrates them with an example of over 300,000 camera sales on eBay. We believe that addressing the p-value problem will increase the credibility of large-sample research as well as provide more insights for readers.