Status:Closed    Asked:Jul 06, 2017 - 02:04 AM

Why is the concatenation of YEAR, SERIAL, and PERNUM not unique?

It is my understanding that these three variables should uniquely identify each person. I concatenated them and found that 98 people have the same values on all three variables as someone else. Why does this happen? I am using all observations from all Voter Supplement samples.


Staff Answer


Jeff Bloem


The duplicates appear to be caused by your software dropping the leading zeros on SERIAL and PERNUM. If you add these back in (SERIAL should be 5 digits and PERNUM should be 2) and redo the concatenation, you should get a unique ID across your samples. Additionally, CPSIDP is a ready-made variable that, when concatenated with YEAR and MONTH, uniquely identifies people across IPUMS CPS samples. In your data set of all Voter Supplements, however, CPSIDP alone uniquely identifies all people because you are only including one month per year.
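To illustrate why dropped leading zeros produce collisions, here is a minimal Python sketch. The widths (5 for SERIAL, 2 for PERNUM) come from the answer above; the helper name `make_person_id` and the two sample records are hypothetical and are not taken from any IPUMS extract.

```python
def make_person_id(year, serial, pernum, serial_width=5, pernum_width=2):
    """Concatenate YEAR, SERIAL, and PERNUM using zero-padded, fixed widths.

    The default widths follow the answer above; adjust them if your
    extract's codebook lists different widths.
    """
    return f"{int(year):04d}{int(serial):0{serial_width}d}{int(pernum):0{pernum_width}d}"


# Hypothetical records as (YEAR, SERIAL, PERNUM), invented for illustration.
rows = [(2016, 11, 1), (2016, 1, 11)]

# Naive concatenation without padding: the two different people collide.
naive = ["".join(str(value) for value in row) for row in rows]

# Padded concatenation: each field keeps its full width, so the IDs differ.
padded = [make_person_id(*row) for row in rows]

print(naive)
print(padded)
```

Without padding, the boundary between SERIAL and PERNUM is ambiguous, so SERIAL 11 with PERNUM 1 looks the same as SERIAL 1 with PERNUM 11. Fixed widths remove that ambiguity. Most statistical packages can do the same thing with a zero-padded string format.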


Jul 06, 2017 - 09:56 AM



That was correct, good catch. Thank you.


Jul 06, 2017 - 12:13 PM


