Assessment of Required Sample Sizes for Estimating Proportions
Steven T. Garren *
Department of Mathematics and Statistics, James Madison University, Harrisonburg, VA 22807, USA.
Brooke A. Cleathero
Department of Mathematics and Statistics, James Madison University, Harrisonburg, VA 22807, USA.
*Author to whom correspondence should be addressed.
Abstract
When estimating a population proportion p within margin of error m, a preliminary sample of size n is taken to produce a preliminary sample proportion y/n, which is then used to determine the required sample size (y/n)(1 - y/n)(z/m)^2, where z is the critical value for a given level of confidence. The population is assumed to be infinite, so these Bernoulli(p) observations are mutually independent. After a new sample of the required size is taken, the coverage probabilities for p are determined exactly for various values of m, n, p, and z, using a commonly used formula for a confidence interval on p. The coverage probabilities tend to be somewhat smaller than their nominal values, and tend to be substantially smaller when np or n(1 - p) is small, so the resulting confidence intervals are anti-conservative. A secondary conclusion is that, because the given margin of error m is absolute rather than relative to the population proportion p, the required sample size is larger for values of p nearest to 0.5. The mean and standard deviation of the required sample size are also computed exactly to provide perspective on just how large or how small these required sample sizes need to be.
Keywords: Bernoulli distribution, binomial distribution, sample size determination, confidence interval
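The two-stage procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code and rests on several assumptions that the abstract does not state: the required sample size is rounded up to an integer, the second sample is independent of the preliminary sample, the "commonly-used formula" is taken to be the Wald interval (sample proportion plus or minus z times its estimated standard error), and a required sample size of zero (which occurs when y = 0 or y = n) is counted as a failure to cover p. All function names are hypothetical.

```python
import math


def binom_pmf(k, n, p):
    """Binomial(n, p) probability of k successes, computed via logs to avoid overflow."""
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_pmf)


def required_sample_size(y, n, z, m):
    """(y/n)(1 - y/n)(z/m)^2, rounded up. The rounding rule is an assumption."""
    p0 = y / n
    return math.ceil(p0 * (1.0 - p0) * (z / m) ** 2)


def required_size_mean_sd(n, p, z, m):
    """Exact mean and standard deviation of the required size over y ~ Binomial(n, p)."""
    mean = 0.0
    second_moment = 0.0
    for y in range(n + 1):
        weight = binom_pmf(y, n, p)
        size = required_sample_size(y, n, z, m)
        mean += weight * size
        second_moment += weight * size ** 2
    variance = max(second_moment - mean ** 2, 0.0)
    return mean, math.sqrt(variance)


def wald_coverage(size, p, z):
    """Exact coverage of the Wald interval for a fixed sample size."""
    if size == 0:
        return 0.0  # assumption: no second sample means no interval, so no coverage
    total = 0.0
    for x in range(size + 1):
        p_hat = x / size
        half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / size)
        if abs(p_hat - p) <= half_width:
            total += binom_pmf(x, size, p)
    return total


def overall_coverage(n, p, z, m):
    """Coverage averaged over the preliminary sample, assuming an independent second sample."""
    return sum(
        binom_pmf(y, n, p) * wald_coverage(required_sample_size(y, n, z, m), p, z)
        for y in range(n + 1)
    )
```

For example, `required_size_mean_sd(30, 0.3, 1.96, 0.05)` and `overall_coverage(30, 0.3, 1.96, 0.05)` evaluate the sketch for a preliminary sample of size 30, p = 0.3, a nominal 95% confidence level, and margin of error 0.05. Under a different rounding rule or a different treatment of y = 0 and y = n, the numerical results would differ.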