Status:Closed    Asked:Nov 27, 2013 - 01:51 PM

What are the identifiers on the IPUMS CPS to merge files?

I'm using the family interrelationship variables to connect children's receipt of public benefits to the entire family. I need to do some merges, but I get an error message saying that I do not have all the identifiers needed for the merge. I'm using year, serial, and pernum. What other identifiers do I need?


Staff Answer




I am not entirely certain I understand what you are attempting to do, so I will give a few brief ideas and if these do not meet your specific question, please respond and elaborate.

The family interrelationship variables take their values from the PERNUM variable, which identifies individuals within a household. There is some sample syntax available in the end notes of the Family Interrelationships page for attaching characteristics from one family member to another.
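As a rough illustration of how a pointer variable can be used (a sketch, not the syntax from the Family Interrelationships page), the Stata 11+ code below attaches each mother's age to her children. It assumes an extract saved under the hypothetical name cps_extract.dta that contains year, serial, pernum, momloc, and age, and that year, serial, and pernum uniquely identify persons in that extract. The name mom_age is made up for the example.

```stata
* Sketch only: cps_extract.dta and mom_age are hypothetical names.
use cps_extract.dta, clear
isid year serial pernum          // stops with an error if these keys are not unique

* Build a lookup file of potential mothers, keyed on the pointer's name
preserve
keep year serial pernum age
rename pernum momloc             // a mother's PERNUM is her child's MOMLOC
rename age mom_age               // avoid a clash with the child's own age
tempfile mothers
save `mothers'
restore

* Several children can point to the same mother, so this is many-to-one
merge m:1 year serial momloc using `mothers', keep(master match) nogenerate
```

Persons whose momloc is 0 find no match and simply keep a missing mom_age. The number of observations in the main file should not change.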

If you are trying to attach the characteristics of one person to the rest of the household (using household as a proxy measure for family, which may be problematic), you can create a data set containing only one person per household with the information you want to attach to the other household members. Be sure to rename the variables so they do not conflict with the other household members' own characteristics (for example, age becomes childs_age), and then merge one-to-many on serial.
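A minimal sketch of that household-level approach in Stata 11+ syntax follows. The file name cps_extract.dta is hypothetical, and choosing the youngest household member as the one person to keep is only an assumption for illustration. The key includes year as well as serial on the assumption that several years are pooled in one file.

```stata
* Sketch only: cps_extract.dta and youngest_age are hypothetical names.
use cps_extract.dta, clear

preserve
* Keep exactly one person per household; here, the youngest member
bysort year serial (age pernum): keep if _n == 1
keep year serial age
rename age youngest_age          // avoid a clash with each member's own age
tempfile oneperhh
save `oneperhh'
restore

* The person file has many rows per household; the lookup file has one
merge m:1 year serial using `oneperhh', assert(match) nogenerate
```

Because the person-level file is the master data set here, Stata calls this merge m:1 rather than 1:m. The assert(match) option makes the command fail if any household goes unmatched.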

If you are attempting to attach information not available in IPUMS-CPS that you downloaded from another data source, I recommend looking at this response. If you are attempting to link persons across March samples, I recommend looking at this response.

I hope this helps. Again, feel free to ask further questions.


Dec 02, 2013 - 12:14 PM




Thanks for your answer. I understand the logic of the family variables. I used the simple syntax found on the family interrelationships technical appendix to guide my analysis, using STATA; however, that syntax is not sufficient as the merge command has changed.

After using "merge year serial pernum using temp.dta"

STATA tells me that I'm using the old merge syntax and that the variables year serial pernum do not uniquely identify observations in my temp data.

What I would like to know is what are the other keys that identify each observation on the IPUMS CPS?

I know my merge is wrong because my main dataset ends up with more observations after the merge, and I should end up with the same number of observations because I'm adding columns to my dataset not rows.

Thanks for your help.


Dec 02, 2013 - 02:38 PM


Based on the brief description of your analysis, it sounds like you are trying to add child information onto parents. The problem may be that multiple children are identified for each parent. If this is the case, you are essentially trying to do a many-to-many merge, which is generally a bad idea. I recommend looking through this PDF from Stata for more information on merging.
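One way to avoid the many-to-many situation is to reduce the child records to a single row per parent before merging. The sketch below assumes, purely for illustration, that temp.dta holds one row per child with a 0/1 indicator named child_benefit and the mother's line number in momloc, and that the main file is saved as main.dta. Those names and that layout are assumptions, not something known about your data.

```stata
* Sketch only: main.dta, child_benefit, any_child_benefit, and n_children
* are hypothetical names.
use temp.dta, clear
duplicates report year serial pernum     // shows how many rows share each key

* Collapse the child rows to one row per mother
keep if momloc > 0
collapse (max) any_child_benefit = child_benefit ///
         (count) n_children = child_benefit, by(year serial momloc)
rename momloc pernum                     // the key now identifies the mother
isid year serial pernum                  // confirm one row per mother
tempfile bymother
save `bymother'

use main.dta, clear
count
merge 1:1 year serial pernum using `bymother', keep(master match) nogenerate
count                                    // should equal the count before the merge
```

Here n_children counts children with a nonmissing child_benefit value. If the two counts differ, or if isid or merge 1:1 stops with an error, the keys are still not unique in one of the files and the duplicates report output is the place to look.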


Dec 02, 2013 - 03:07 PM


