Which households are grouped into a single stratum in IPUMS-International?

I am looking for clarification on the precise nature of "strata" in most of the IPUMS-International samples. From the page on comparability of this variable across samples, "strata" is described as:

"In most samples, the STRATA variable captures implicit geographic stratification and is created by assigning a unique identifier to groups of between 10 and 19 adjacent households within low level."

My confusion is about what this means when an IPUMS sample is, for instance, a 5% sample of all households. Consider a hypothetical sample that contains 5% of all households, in which the STRATA variable groups 10 households per stratum. Does this mean:

A. The data include 5% of the households in each stratum. In this case, each stratum in reality has 200 households, but data users see only 5% (10) of them, presumably selected randomly.

B. The data include all households in each stratum, but only 5% of all strata. In this case, there are 10 households per stratum, but there are 20 times more strata in reality than we see in the data.
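To make the arithmetic behind the two readings explicit, here is a minimal sketch. The 5% fraction and the 10 households per stratum are the hypothetical figures from this question, not properties of any particular IPUMS sample, and the variable names are illustrative only.

```python
# Hypothetical figures from the question: a 5% sample with
# 10 observed households per stratum.
sampling_fraction = 0.05
observed_per_stratum = 10

# Reading A: every stratum is sampled at 5%, so each observed stratum
# stands for a larger group of households in the population.
true_households_per_stratum_a = round(observed_per_stratum / sampling_fraction)

# Reading B: whole strata are sampled at 5%, so each observed stratum
# is complete but stands for many unobserved strata.
true_households_per_stratum_b = observed_per_stratum
population_strata_per_observed_b = round(1 / sampling_fraction)

print(true_households_per_stratum_a)     # 200
print(population_strata_per_observed_b)  # 20
```

Under reading A the stratum is large and sparsely observed; under reading B the stratum is small and fully observed.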

I am confused because the description calls households in the same stratum "adjacent," which implies, to me, that a stratum includes ten contiguous households rather than 10 households selected out of a contiguous set of 200.

Does anyone have any insight into this question? It is probably not important for most uses, but it is critical for my purposes.

Staff Answer


Jeff Bloem


All IPUMS International samples are stratified. This means the population is divided into strata based on geography or other key characteristics. Then, a sample is drawn from each stratum. So, this aligns most closely with case A in your question. More information about sampling design in IPUMS International can be found via this page.
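As a rough illustration of what case A looks like, the sketch below draws the same fraction of households from every stratum. It assumes simple random sampling within each stratum and a made-up population of three strata with 200 households each; the actual selection procedure for a given sample may differ and is described in the sample design documentation. The function name `sample_within_strata` is hypothetical.

```python
import random

def sample_within_strata(strata, fraction, seed=0):
    """Draw the same fraction of households from every stratum.

    Assumes simple random sampling within each stratum, which is an
    illustrative choice and not necessarily how a real sample was drawn.
    """
    rng = random.Random(seed)
    sample = {}
    for stratum_id, households in strata.items():
        n = round(len(households) * fraction)
        sample[stratum_id] = rng.sample(households, n)
    return sample

# Hypothetical population: 3 strata of 200 households each.
population = {
    s: [f"stratum{s}-hh{i}" for i in range(200)]
    for s in range(1, 4)
}

sampled = sample_within_strata(population, fraction=0.05)
for stratum_id, households in sampled.items():
    # Every stratum appears in the data, with 10 of its 200 households.
    print(stratum_id, len(households))
```

Every stratum is represented in the result, but only a fraction of each stratum's households is visible to the data user.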

I hope this helps. Let us know if you have any additional questions.


