I have found some things that seem inconsistent in the way that household weights and person weights line up.

Dear folks –

I have downloaded a file that contains most of the IPUMS-CPS variables, 1962-2012. I have managed to bite off the top 100 records, and there are some things I am confused about.

  1. The RECTYPE for every record is H.
  2. Whenever NUMPREC is n, there are n consecutive rows with that NUMPREC. This suggests I am looking at person-level records with the household information duplicated into each.
  3. For those same n rows, the HWTSUPP is always duplicated n times. I hope this means that it has been divided by n before being copied into each person record. Otherwise these weights will overweight larger households by a factor of n, correct?
  4. Some of the person weights and the household weights seem to be out of alignment by row. For example, the first two households with NUMPREC=2, in rows 8 and 10, each have the household weight duplicated in the next consecutive row, and the WTSUPP likewise. (These are at the beginning of 1962.) But the first household with NUMPREC=3, SERIAL=13, is in row 17, with the HWTSUPP value duplicated in rows 18 and 19, while the first run of three consecutive identical weights in WTSUPP begins in row 19 and goes to row 21. The first household with NUMPREC=4, SERIAL=29, begins in row 41, again with four consecutive identical HWTSUPP values. However, I was not able to find four consecutive individuals with the same WTSUPP anywhere in the first 150 lines. Am I wrong that two individuals in the same household must have the same sampling weight as one another, equal to the household weight over n?

Sincerely, Andrew Hoerner

Most of what you noticed is a result of what we call "Rectangularization", which means (as you seem to have guessed) that all household-level characteristics are attached to the end of each person-level record within that household. This is why you see NUMPREC=n, a household-level variable, in n consecutive rows. The variable RECTYPE is only useful when working with a "Hierarchical" data structure, which lists household characteristics on a separate row, with household members on the rows directly beneath the household record.
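To make the two layouts concrete, here is a minimal sketch of turning hierarchical records into rectangular ones. The records, their values, and the `rectangularize` helper are all made up for illustration; this is not how IPUMS builds its extracts, only a picture of the idea.

```python
# Hypothetical hierarchical records: each "H" (household) row is followed
# by the "P" (person) rows that belong to it. All values are invented.
hierarchical = [
    {"RECTYPE": "H", "SERIAL": 13, "NUMPREC": 3, "HWTSUPP": 1500.0},
    {"RECTYPE": "P", "PERNUM": 1, "WTSUPP": 1400.0},
    {"RECTYPE": "P", "PERNUM": 2, "WTSUPP": 1450.0},
    {"RECTYPE": "P", "PERNUM": 3, "WTSUPP": 1600.0},
    {"RECTYPE": "H", "SERIAL": 14, "NUMPREC": 1, "HWTSUPP": 2000.0},
    {"RECTYPE": "P", "PERNUM": 1, "WTSUPP": 2000.0},
]


def rectangularize(records):
    """Return one row per person, with household fields attached at the end."""
    rectangular = []
    household = None
    for rec in records:
        # RECTYPE is dropped here as a simplification of this sketch.
        fields = {k: v for k, v in rec.items() if k != "RECTYPE"}
        if rec["RECTYPE"] == "H":
            household = fields  # remember the current household
        else:
            if household is None:
                raise ValueError("person record appears before any household record")
            # Person fields first, household fields appended after them.
            rectangular.append({**fields, **household})
    return rectangular


rectangular = rectangularize(hierarchical)
for row in rectangular:
    print(row)
```

In the output, the household with NUMPREC=3 appears as three consecutive rows, each repeating the same SERIAL, NUMPREC, and HWTSUPP, which is the pattern described in the question.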
HWTSUPP is the total weight (not divided by the number of persons) for the household. If you are doing household-level analyses with a Rectangular data structure and wish to use HWTSUPP, it is important to select only one person per household so that the household is not over-weighted, as you pointed out.
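As a rough sketch of what "one person per household" looks like in practice, the snippet below keeps the first row seen for each household and compares the resulting weight total with the naive total over all person rows. The toy rows are invented, and it assumes that YEAR and SERIAL together identify a household in the extract; that assumption is worth checking against the documentation for your own extract.

```python
# Toy rectangular extract: one row per person, household fields repeated.
# All values are invented for illustration.
rows = [
    {"YEAR": 1962, "SERIAL": 13, "NUMPREC": 3, "HWTSUPP": 1500.0},
    {"YEAR": 1962, "SERIAL": 13, "NUMPREC": 3, "HWTSUPP": 1500.0},
    {"YEAR": 1962, "SERIAL": 13, "NUMPREC": 3, "HWTSUPP": 1500.0},
    {"YEAR": 1962, "SERIAL": 14, "NUMPREC": 2, "HWTSUPP": 2000.0},
    {"YEAR": 1962, "SERIAL": 14, "NUMPREC": 2, "HWTSUPP": 2000.0},
]


def one_row_per_household(rows):
    """Keep the first row encountered for each (YEAR, SERIAL) pair."""
    seen = set()
    kept = []
    for row in rows:
        key = (row["YEAR"], row["SERIAL"])  # assumed household identifier
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


households = one_row_per_household(rows)

# Summing HWTSUPP over every person row counts each household NUMPREC times.
naive_total = sum(r["HWTSUPP"] for r in rows)
# Summing over one row per household counts each household once.
household_total = sum(r["HWTSUPP"] for r in households)

print(naive_total, household_total)
```

With these toy values, the naive sum counts the three-person household three times and the two-person household twice, which is the over-weighting of larger households raised in the question.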
WTSUPP is not dependent upon household, so household members are not expected to have identical weights. You can read a bit more about weights here.
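If you want to see how often person weights differ within a household in your own extract, a check along these lines may help. It is only a sketch: the rows are invented, the helper names are hypothetical, and it again assumes that YEAR and SERIAL together identify a household.

```python
from collections import defaultdict

# Toy person-level rows; all values are invented for illustration.
person_rows = [
    {"YEAR": 1962, "SERIAL": 13, "WTSUPP": 1400.0},
    {"YEAR": 1962, "SERIAL": 13, "WTSUPP": 1450.0},
    {"YEAR": 1962, "SERIAL": 13, "WTSUPP": 1600.0},
    {"YEAR": 1962, "SERIAL": 14, "WTSUPP": 2000.0},
    {"YEAR": 1962, "SERIAL": 14, "WTSUPP": 2000.0},
]


def person_weights_by_household(rows):
    """Group WTSUPP values by the assumed household key (YEAR, SERIAL)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[(row["YEAR"], row["SERIAL"])].append(row["WTSUPP"])
    return dict(grouped)


def households_with_differing_weights(rows):
    """Return household keys whose members do not all share one WTSUPP."""
    grouped = person_weights_by_household(rows)
    return [key for key, weights in grouped.items() if len(set(weights)) > 1]


differing = households_with_differing_weights(person_rows)
print(differing)
```

Households that show up in this list are not a sign of misaligned rows; they simply have members whose person weights differ.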
I hope this helps.