Status:Closed    Asked:Jul 03, 2017 - 06:12 PM

Is it possible to have more observations that fit a certain criterion in a revised extract than in the original?

Last year, I used an extract of data that included the occupation variable "occ2010". The analysis I am conducting keeps only the observations of people who work in the construction sector. Today, I revised the old extract to add two new variables, "EMPSTAT" and "LABFORCE". However, when I condense the revised data set by keeping only the construction occupations, using the exact same commands as before, the condensed data set has more observations than the original one did. I am trying to replicate my previous research while excluding people who were not in the labor force, but I cannot replicate it exactly because the revised data set seems to contain different observations. Is there a way to make the two data sets match exactly?

Staff Answer


Jeff Bloem


Occasionally, IPUMS updates variables to correct known errors. This can cause differences when data extracts are revised and resubmitted. However, that does not appear to be what is happening in your case. I just looked at your extract 30 and the revised extract 41 and found no difference in the number of observations of those who work in the construction sector. Could you send the code you are using to limit your sample to construction workers? Alternatively, if you want to replicate your original analysis exactly, you can merge EMPSTAT and LABFORCE onto your original data set using YEAR, DATANUM, SERIAL, and PERNUM as identifiers.
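
For anyone trying the merge approach, here is a minimal sketch in plain Python, assuming both extracts have already been loaded as lists of dictionaries (for example with `csv.DictReader`). The function names `person_key`, `merge_new_vars`, and `only_in_first` are made up for illustration and are not part of any IPUMS tool. The only details taken from the answer above are the identifier variables and the two new variables. Statistical packages offer an equivalent one-to-one merge on the same keys.

```python
# Illustrative sketch only: the helper names below are hypothetical.
# Rows are plain dicts, one per person record, as csv.DictReader would yield.

KEYS = ("YEAR", "DATANUM", "SERIAL", "PERNUM")  # identifiers named in the answer
NEW_VARS = ("EMPSTAT", "LABFORCE")              # variables added in the revised extract


def person_key(row):
    """Return the composite identifier for one person record."""
    return tuple(row[k] for k in KEYS)


def merge_new_vars(original_rows, revised_rows):
    """Attach NEW_VARS from the revised extract to each original row.

    Returns (merged, unmatched): merged rows keep every original column plus
    the new variables; unmatched lists keys of original rows that have no
    counterpart in the revised extract.
    """
    lookup = {}
    for row in revised_rows:
        key = person_key(row)
        if key in lookup:
            # The identifiers are expected to be unique per person.
            raise ValueError(f"duplicate identifier in revised extract: {key}")
        lookup[key] = row

    merged, unmatched = [], []
    for row in original_rows:
        key = person_key(row)
        match = lookup.get(key)
        if match is None:
            unmatched.append(key)
            continue
        merged.append({**row, **{v: match[v] for v in NEW_VARS}})
    return merged, unmatched


def only_in_first(first_rows, second_rows):
    """List identifiers that appear in first_rows but not in second_rows."""
    second_keys = {person_key(r) for r in second_rows}
    return sorted(person_key(r) for r in first_rows
                  if person_key(r) not in second_keys)
```

Calling `only_in_first(revised_construction, original_construction)` on the two condensed data sets (both variable names are placeholders) would list the person records that appear only in the newer one, which can help pin down where the extra observations come from. Because the merge starts from the original rows, the merged file cannot contain more observations than the original analysis did.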

I hope this helps.


Jul 05, 2017 - 09:48 AM



Thank you for your response. After double-checking the data, I came to the same conclusion as you. I must have changed my data set at some point during my previous analysis without recording what I had done.


Jul 18, 2017 - 03:33 PM
