Are sample statistics relevant when comparing census data across time periods?

Question

I've had people tell me on numerous occasions that you can / should do tests normally reserved for sample statistics (e.g., difference of means etc.) on census data when you are comparing figures over time. The claim usually made is that this is because each census is a 'sample in time'.

My understanding is that it is not appropriate to extend sampling theory in this way. Specifically, the explanation below (copied from census statistical techniques) describes my view quite well.

When you have a sample you use inferential stats to generalise to the population. When you have a census you already have data for the whole population, so there is no need to generalise.

For example, if you used sampling, and there is a 3% difference between groups, then you have to use inferential stats to decide whether that 3% difference is real, or just due to random chance when you did the sampling.

But if you did a census, and there is a 3% difference between groups, well, then there's definitely a 3% difference. That 3% difference is not due to random chance in sampling, because you have data for the whole population. However, even with a census you will still need to use your own judgement to think about why there is a 3% difference (for reasons other than random chance in sampling), and whether the 3% difference is large enough to have any practical significance for the work you are doing.

So basically, just use descriptive stats. Correlations are fine, but you only need the r value to show the strength of the correlation, not the p value which is related to random chance in sampling.

A lot of people don't get the difference between sample stats and census stats, and will complain that you didn't do the stats properly. I've had cases where I ended up having to do inferential stats on census data just because people complained so much that there were no p values on anything!

If you have a lot of missing data from a census sometimes you need some fancy inferential stats to fill it in. I doubt this will apply to you, but it does apply to the US population census because (for some bizarre libertarian reason) completing the census survey is not mandatory in the US.

However, my question relates to comparisons across time rather than between groups. The argument I've been given is that when you look across time, each census parameter is actually a single-point sample from all possible parameters across time. I have a couple of problems with this:

differences in census figures across time won't be attributable to chance. They must be attributable to changes in the underlying environment / population and
even if the 'samples in time' argument is correct, you only ever have very few data points sampled from this 'infinite population'--maybe parameters across four or five censuses you want to compare--which means $n$ is so tiny as to be useless for generating sample statistics.
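
To put a rough number on the second point, here is a sketch that assumes, purely for the sake of argument, that the census values were independent, normally distributed draws with sample mean $\bar{x}$ and sample standard deviation $s$ (symbols introduced here for illustration). The 95% confidence interval for the mean and its half-width multipliers for $n = 4$ and $n = 5$ would be:

$$
\bar{x} \pm t_{0.975,\,n-1}\,\frac{s}{\sqrt{n}}, \qquad
\frac{t_{0.975,\,3}}{\sqrt{4}} = \frac{3.182}{2} \approx 1.59, \qquad
\frac{t_{0.975,\,4}}{\sqrt{5}} = \frac{2.776}{2.236} \approx 1.24
$$

Even under those generous assumptions, the interval extends more than one sample standard deviation on either side of the mean with four or five censuses.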

Surely others have addressed this situation more formally, but I've been unable to find material covering this situation. My question therefore is:

Can anyone here point me in the direction of material discussing this issue or offer an explanation of the 'samples in time' argument that provides a more formal foundation for accepting that it is indeed appropriate to use sample statistical tests on census parameters?

Dimiter · Answer 1 · 2015-04-07T16:17:36.833

Sampling variability is only one possible source of error in the estimate of the quantity you are interested in. When you have a census, error due to sampling variability might be null, but there might be other sources of error, like measurement error. So if you compare two estimates from two censuses and the difference turns out to be 3%, this is not necessarily the 'true' difference between the two populations (or the same population at the two periods of time), because of, for example, measurement error.

The tools of statistical inference which have been developed for inference from sample to population might or might not be useful when you want to take into account other sources of error than sampling error. If you can directly model measurement error (and other sources of error that might be relevant), do that.
I know from experience that many people would still conduct (and ask for) hypothesis tests and report p-values and the like with census data, but I can see no justification for such practices. This is not to say that the data should be treated as 'perfect', but the uncertainty is better modeled or discussed directly.
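
As one way of picturing what modeling measurement error directly might look like, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration: the true rates (0.40 and 0.43), the independent Gaussian error model, and its standard deviation of 0.01 are hypothetical and not taken from any census.

```python
import random
import statistics


def simulate_observed_difference(true_a, true_b, error_sd, n_runs=10_000, seed=0):
    """Simulate the recorded difference between two census figures.

    Each census records its true value plus independent Gaussian
    measurement error (a hypothetical error model, chosen for simplicity).
    Returns the mean and standard deviation of the recorded differences.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_runs):
        observed_a = true_a + rng.gauss(0.0, error_sd)
        observed_b = true_b + rng.gauss(0.0, error_sd)
        diffs.append(observed_b - observed_a)
    return statistics.mean(diffs), statistics.stdev(diffs)


# Hypothetical true rates with a true difference of 3 percentage points.
mean_diff, sd_diff = simulate_observed_difference(0.40, 0.43, error_sd=0.01)
print(f"mean recorded difference: {mean_diff:.4f}")
print(f"spread of recorded difference: {sd_diff:.4f}")
```

Under these assumptions the recorded difference centres on the true 3 points, but any single pair of censuses can land noticeably above or below it, with a spread of about $\sqrt{2}$ times the per-census error. No sampling is involved anywhere, so the uncertainty here comes entirely from measurement.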

Adam Ryczkowski · Accepted Answer · 2013-07-05T07:49:22.170

The problem with studying the change of observables over time (such a dataset is formally called a "time series") is that, for each case, observations at different points in time are dependent. This forces us at least to use tests that are valid for paired data (when comparing two data points), or VECM if you want to analyze all time points at once. In particular, the usual significance tests for regressions are invalid in such a setup.

When you treat each time point as a separate independent group, you are likely to greatly exaggerate significance.
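
A rough way to see this is the classic spurious-regression experiment, sketched below. It is only an illustration under stated assumptions, not a method for census data: the series are simulated Gaussian random walks, the series length (50), number of runs (2000), and the approximate 5% critical value (2.01 for 48 degrees of freedom) are choices made here, and all function names are hypothetical.

```python
import math
import random


def slope_t_stat(x, y):
    """t statistic for the slope of an ordinary least squares fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope / std_err


def white_noise(rng, n):
    """Independent observations: no dependence between time points."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def random_walk(rng, n):
    """Strongly dependent observations: each value builds on the previous one."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def rejection_rate(make_series, n_points=50, n_runs=2000, seed=1, critical=2.01):
    """Share of runs where two UNRELATED series show a 'significant' slope."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_runs):
        x = make_series(rng, n_points)
        y = make_series(rng, n_points)
        if abs(slope_t_stat(x, y)) > critical:
            rejections += 1
    return rejections / n_runs


noise_rate = rejection_rate(white_noise)
walk_rate = rejection_rate(random_walk)
print(f"false positive rate, independent points: {noise_rate:.3f}")
print(f"false positive rate, random walks:       {walk_rate:.3f}")
```

In both cases the two series are generated independently of each other, so every rejection is a false positive. With independent time points the rate should sit near the nominal 5%, while for the random walks it should be many times higher, which is the exaggeration of significance described above.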

Whether to use statistical tests on census data depends on your hypothesis. If you want to draw conclusions about this particular group of people at this particular point in time (if the census actually concerns people), then there is indeed no need for significance tests. But very often we implicitly want to make an inference about some bigger population, e.g. the whole of Europe. In that case we need to treat our dataset as a sample. Or we want to make an inference about our population in the future; in this case we would use VECM or a similar technique valid for time series.

edited Jul 05 '13 at 07:49

answered Jul 05 '13 at 07:43

This is not an answer to the question. VECM and other time series techniques like VAR are useful when you want to compare two (or more) separate time series, and not if you want to compare one population at two points of time. ARIMA and related ones can be useful for prediction, but this is not the goal as stated in the question. Furthermore, the question is obviously about the situation when you have a census of the entire population you want to generalize about, not a sample of some 'bigger population'. – Dimiter Jan 18 '15 at 15:26
You are right about VECM, but the OP never mentions that he intends to compare only two time points. He never mentions that he compares many time points either (I assumed he did). Maybe you are right, I just don't want to speculate about what the OP really meant. Go ahead and write your own version of the answer and let him decide. – Adam Ryczkowski Jan 18 '15 at 15:37
I think the crucial difference is whether one is interested in 1) whether the populations differ at two (or more) points in time (for which, see my full answer), or 2) testing for a trend or otherwise extrapolating from these two (or more) observations, in which case the standard statistical inference techniques for time series like ARIMA become relevant, and in which case it makes some sense to consider a census at a point in time as a sample from a super-population. – Dimiter Jan 19 '15 at 21:38
