Question

Status: Closed    Asked: Jul 06, 2017 - 02:04 AM

Why is the concatenation of YEAR, SERIAL, and PERNUM not unique?

It is my understanding that these three variables should uniquely identify each person. I concatenated them and found that 98 people share identical values on all three variables with someone else. Why does this happen? I am looking at all observations from all Voter Supplement data.

 
 

Staff Answer


Jeff Bloem

Staff

The duplicates appear to be caused by the software dropping the leading zeros on SERIAL and PERNUM. If you add these back in (SERIAL should be 5 digits and PERNUM should be 2) and redo the concatenation, you should generate a unique ID across your samples. Additionally, CPSIDP is a ready-made variable that, when concatenated with YEAR and MONTH, uniquely identifies people across IPUMS CPS samples. In your data set of all Voter Supplements, however, CPSIDP uniquely identifies all people because you are only including one month per year.
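To illustrate why dropped leading zeros produce duplicates, here is a minimal Python sketch. It is not IPUMS code: the function names `naive_id` and `padded_id` and the two example records are hypothetical, and the widths (5 for SERIAL, 2 for PERNUM) are simply the ones stated above, so check them against the codebook for your own extract.

```python
def naive_id(year, serial, pernum):
    """Concatenate without padding; this is the form that can collide."""
    return f"{year}{serial}{pernum}"


def padded_id(year, serial, pernum, serial_width=5, pernum_width=2):
    """Concatenate with leading zeros restored to fixed widths.

    The default widths are assumptions taken from the answer above.
    """
    return (
        f"{int(year)}"
        f"{int(serial):0{serial_width}d}"
        f"{int(pernum):0{pernum_width}d}"
    )


# Two hypothetical people in the same year:
# (YEAR, SERIAL, PERNUM) with leading zeros already stripped.
a = (2016, 1, 23)
b = (2016, 12, 3)

# Without padding, the digits run together and the two IDs are identical.
print(naive_id(*a), naive_id(*b))

# With fixed-width padding, the two IDs stay distinct.
print(padded_id(*a), padded_id(*b))
```

The same idea applies in any statistical package: convert SERIAL and PERNUM to fixed-width, zero-padded strings before concatenating, or keep them as separate key columns and check uniqueness on the combination.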

 

Jul 06, 2017 - 09:56 AM


Answers

That was correct, good catch. Thank you.

 

Jul 06, 2017 - 12:13 PM

