But how many do you need in a sample?

Andrea Osika
Dec 9, 2020
2 min read

Updated: Mar 25, 2021

When taking a sample to collect data on a population, you need to consider a few things:

population size: Understanding the population is critical so that an apples-to-apples comparison can be made. Knowing how many apples there are helps us assess quantitatively whether our sample size is adequate. However, it's not uncommon to not know the exact value here.
confidence interval: Errors happen; it's nature. The goal is to settle on an appropriate margin of error. You can figure this by comparing the average (the mean) of your population to that of your sample and deciding how large the difference is allowed to be. A typical choice is a 95% confidence level with a margin of error of ~5% in either direction.
standard deviation: If there are known values, this is a calculated value. A low standard deviation means that the values are clustered around the mean, whereas a high standard deviation means they are spread out across a much wider range, with very small and very large outlying figures. In practice, we often do not know the value of the population standard deviation (σ). However, if the sample size is large (n > 30), then the sample standard deviation can be used to estimate the population standard deviation. When surveying, it's quite common to use .5.
Z-score of your confidence interval: There are tables for this. Basically, this number represents how many standard deviations you are away from the mean. The majority of the data is less than 3 standard deviations from the mean in either direction (values of up to +3 or -3). Here's one that SJSU uses for varying confidence levels, and this site is also useful for CI and StdDev info.
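
To make the standard deviation point a bit more concrete, here is a small sketch in Python. The two lists are made-up numbers, not data from any survey, and proportion_std_dev is a hypothetical helper name. The second half shows one common reading of the .5 figure, which is my own gloss rather than something stated above: for a yes/no survey question, the standard deviation is sqrt(p x (1 - p)), and that is largest (exactly .5) when p is .5, so .5 is the cautious choice.

```python
import math
from statistics import mean, pstdev

# Two made-up data sets with the same mean but very different spread.
clustered = [48, 49, 50, 51, 52]
spread = [10, 30, 50, 70, 90]

for name, values in (("clustered", clustered), ("spread", spread)):
    # pstdev treats the list as the whole population.
    print(f"{name}: mean={mean(values)}, std dev={pstdev(values):.2f}")


def proportion_std_dev(p: float) -> float:
    """Standard deviation of a yes/no answer where a share p says yes."""
    return math.sqrt(p * (1 - p))


# Assumption: the survey question is a simple yes/no proportion.
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"p={p}: std dev={proportion_std_dev(p):.3f}")
```

Both lists average 50, but the second one has a far larger standard deviation, and the proportion case peaks at p = .5.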

The most common CIs alluded to, with their corresponding Z-scores, are:

CI     Z-score
90%    1.65
95%    1.96
99%    2.576
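
If you don't have a table handy, the same Z-scores can be derived from the standard normal distribution. This is a minimal sketch using Python's standard library; z_score is a hypothetical helper name, and it assumes a two-sided interval, which is what these table values correspond to.

```python
from statistics import NormalDist


def z_score(confidence: float) -> float:
    """Two-sided Z-score for a confidence level given as a decimal, e.g. 0.95."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    # Split the leftover probability evenly between the two tails.
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}  {z_score(level):.3f}")
# 90%  1.645
# 95%  1.960
# 99%  2.576
```

The 90% value is often rounded to 1.65 in tables.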

Once we've decided this, we can calculate the sample size for general use:
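
Working backwards from the arithmetic in the example that follows, the calculation appears to be the usual sample-size formula, where Z is the Z-score, σ is the standard deviation, and E is the margin of error written as a decimal. The letter E is my own label, not the post's.

```latex
n = \frac{Z^{2}\,\sigma^{2}}{E^{2}}
```

The result is rounded up to the next whole respondent.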

So let's say we want to use the general 95% confidence interval (a Z-score of 1.96), a StdDev of .5 for generalities, and a margin of error of 5% (.05):

n = (1.96 x 1.96) x .5 x .5 / (.05 x .05)

n = 3.8416 x .25 / .0025

n = .9604 / .0025

n = 384.16, which rounds up to 385

Statistically, we can use 385 samples to reflect the population's values with 95% confidence, give or take 5%. What if you don't have access to 385 samples? You can lower your confidence level or widen your margin of error. For example, dropping to 90% confidence (a Z-score of 1.65) brings the same calculation down to about 273. These changes will, unfortunately, increase the chance of errors in your findings, but you won't need as many respondents.
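
For anyone who wants to try different trade-offs, here is a rough sketch of the calculation in Python. The function name sample_size and its parameter names are my own, and it computes exact Z-scores from the standard normal distribution rather than reading rounded values from a table, so results can differ by a respondent or two from hand calculations.

```python
import math
from statistics import NormalDist


def sample_size(confidence: float, std_dev: float, margin_of_error: float) -> int:
    """n = Z^2 * std_dev^2 / margin_of_error^2, rounded up to a whole respondent."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    if margin_of_error <= 0:
        raise ValueError("margin_of_error must be positive")
    # Two-sided Z-score for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z ** 2) * (std_dev ** 2) / (margin_of_error ** 2))


# Compare a few confidence / margin-of-error combinations with StdDev = .5.
for confidence, margin in [(0.95, 0.05), (0.90, 0.05), (0.95, 0.10)]:
    n = sample_size(confidence, 0.5, margin)
    print(f"confidence={confidence:.0%}  margin={margin:.0%}  n={n}")
```

With a StdDev of .5, this gives 385 respondents at 95% confidence and a 5% margin, 271 at 90% and 5%, and 97 at 95% and 10%.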

This simple-ish calculation can be used to validate sample sizes. I hope you found it useful and can apply it whenever you're in doubt.

Additional reading:

Benchmarking Sample Sizes

The majority of the information here was found here.
