Why Sampling Doesn’t Seem to Work
Edward J. Stanek III
Abstract
The idea of randomly sampling a finite population has broad application in statistics. Response is recorded on a
subset (i.e., a sample) of the population with the goal of estimating the population average response. When the
members of the subset are picked in a random manner, a statistical argument is used to draw conclusions about the
population average based on the sample response. The same conclusions do not follow if the subset is chosen on
purpose. Since the same responses would be observed in either setting, how can the inference from sampling be so
different? This is the ‘magic’ of sampling, which may have motivated Mark Twain to say: ‘There are three kinds
of lies: lies, damned lies, and statistics.’ There have been doubters about the validity of inference from
probability sampling for a long time. Using a geometric framework, we illustrate how the added insight attributed
to probability sampling appears to be false. We suggest that, apart from its use as an approach that lends face
validity to unbiased data collection decisions, probability sampling should yield to other approaches to
statistical inference.