🏠 Alejandro Barreto — Software Engineer in Austin, TX

5 samples or even 1 sample is enough!

Rule of 5:

The Rule of Five estimates the median (middle point) of a population [of ANY size]. The median is the value that half of the population falls above and half falls below. There is a 93.75% chance that the median of a population is between the smallest and largest values in any random sample of five from that population. It might seem impossible to be 93.75% certain about anything based on a random sample of just five, but it works.
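Where does 93.75% come from? Here is a minimal sketch of the arithmetic, assuming each random draw is equally likely to land above or below the median (ties ignored). The only way the median escapes the sample's range is if all five draws fall on the same side of it. The function name `median_coverage` is mine, not from the source.

```python
from fractions import Fraction

def median_coverage(n: int) -> Fraction:
    """Chance that the population median lies between the min and max of n random draws."""
    # Each draw independently lands above the median with probability 1/2
    # (assumption: ties with the median are negligible).
    all_above = Fraction(1, 2) ** n
    all_below = Fraction(1, 2) ** n
    # The range misses the median only when every draw is on the same side.
    return 1 - all_above - all_below

print(float(median_coverage(5)))  # 0.9375
```

With n = 5, each "all on one side" case has probability 1/32, so the miss chance is 2/32 = 6.25%.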

This article gives a good example, copied below: http://nsfconsulting.com.au/rule-of-five-reduce-uncertainty/

For example, let’s say you want to consider whether your office is in the most convenient location for your employees. You could conduct a full office-wide survey to get a consensus on this question, but it would be time-consuming and expensive and would probably give you more precision than you need.

Suppose, instead, you just randomly pick five people. There are some other issues you would need to consider about ‘randomness’ but for now, let’s say you simply choose five employees at random. Call these people and ask them how long their commute to work typically is. When you get answers from five people, stop. Let’s suppose the values you get are 30, 60, 45, 80 and 60 minutes. Take the highest and lowest values of the sample of five: 30 and 80. There is a 93.75% chance that the median of the entire population of employees is between those two numbers. This, according to Douglas Hubbard, is the Rule of Five. The Rule of Five is simple, it works, and it can be proven to be statistically valid for a wide range of problems. With a sample this small, the range might be very wide, but if it was significantly narrower than your previous range (that is, the range of the unknown), then it counts as a measurement.

As Hubbard adds:

This may seem like a wide range, but that’s not the point. The relevant point is whether this range is narrower than your previous range. Maybe you previously thought that 5 minutes per day or 2.5 hours per day were reasonable given what you knew at the time. These values now would be highly unlikely to be medians for the population. Even with a small measurement of just five people, you significantly narrowed your range of uncertainty. If your uncertainty was that high before, you now have a much better idea.
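To see the commute example in action, here is a rough sketch that applies the rule to the five values quoted above and then checks the 93.75% figure by simulation. The population of 1,001 commute times is entirely made up (a right-skewed random draw), and the helper names `rule_of_five_range` and `coverage_rate` are hypothetical, chosen only for illustration.

```python
import random
import statistics

def rule_of_five_range(sample):
    """Return the (lowest, highest) values of a sample of five."""
    if len(sample) != 5:
        raise ValueError("the Rule of Five expects exactly five values")
    return min(sample), max(sample)

# The five commute times (in minutes) from the example above.
low, high = rule_of_five_range([30, 60, 45, 80, 60])
print(low, high)  # 30 80

def coverage_rate(population, trials, rng):
    """Fraction of random five-person samples whose range contains the population median."""
    true_median = statistics.median(population)
    hits = 0
    for _ in range(trials):
        lo, hi = rule_of_five_range(rng.sample(population, 5))
        if lo <= true_median <= hi:
            hits += 1
    return hits / trials

# Hypothetical population: 1,001 made-up commute times, skewed to the right.
rng = random.Random(0)
population = [rng.lognormvariate(3.8, 0.5) for _ in range(1001)]
rate = coverage_rate(population, 20_000, rng)
print(round(rate, 3))  # should land close to 0.9375
```

The shape of the population does not matter much here: the rule only depends on each draw being about equally likely to fall above or below the median.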

A single sample is enough:

This is called the Single Sample Majority Rule, which, put formally, says: “Given maximum uncertainty about a population proportion – such that you believe the proportion could be anything between 0% and 100% with all values being equally likely – there is a 75% chance that a single randomly selected sample is from the majority of the population [(a population of ANY size)].”
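The 75% can be checked with a quick sketch. If the unknown proportion p is equally likely to be anything from 0 to 1, then a single random draw comes from the larger group with probability max(p, 1 - p), and averaging that over all p gives 0.75. The function name `majority_chance` and the step count are my own choices, and the midpoint rule is just one simple way to do the averaging.

```python
def majority_chance(steps: int = 1000) -> float:
    """Average of max(p, 1 - p) over p uniform on [0, 1], via the midpoint rule."""
    total = 0.0
    for i in range(steps):
        p = (i + 0.5) / steps      # midpoint of the i-th slice of [0, 1]
        total += max(p, 1 - p)     # chance one random draw comes from the larger group
    return total / steps

print(round(majority_chance(), 4))  # 0.75
```

Intuitively, the majority share is always somewhere between 50% and 100%, and under this "know nothing" assumption it averages out to 75%.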

The source of both of these rules: https://hubbardresearch.com/two-ways-you-can-use-small-sample-sizes-to-measure-anything/

That’s why, even though we only have 5 survey responses, we can still use this info!