Question

Status: Closed    Asked: Jul 03, 2017 - 06:12 PM

Is it possible to have more observations that fit certain criteria in a revised extract than in the original?

Last year, I used a data extract that included the occupation variable OCC2010. My analysis keeps only the observations of people who work in the construction sector. Today, I revised the old extract to add two new variables, EMPSTAT and LABFORCE. However, when I restrict the revised data to the construction occupations using exactly the same commands as before, the resulting data set has more observations than the original. I am trying to replicate my previous research while excluding people who were not in the labor force, but I cannot replicate it exactly because the revised data set seems to contain different observations. Is there a way to make the two data sets match exactly?

 
 

Staff Answer


Jeff Bloem

Staff

Occasionally, IPUMS updates variables to correct known errors, which can cause differences when a data extract is revised and resubmitted. However, that does not seem to be what is happening in your case. I just compared your extract 30 with the revised extract 41 and found no difference in the number of observations for people who work in the construction sector. Could you send the code you are using to limit your sample to construction workers? Alternatively, if you want to replicate your original analysis exactly, you can merge EMPSTAT and LABFORCE onto your original data set using YEAR, DATANUM, SERIAL, and PERNUM as identifiers.
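
As a rough illustration of the merge described above, here is a minimal Python sketch that uses only the standard library. It assumes both extracts have already been read into lists of dictionaries, one per person record. The function names `key_of` and `merge_new_vars`, the toy rows, and the code values in them are made up for illustration. The filter at the end assumes that LABFORCE = 2 means "in the labor force", which should be checked against the variable's codebook before use.

```python
# Identifier variables named in the answer above.
KEYS = ("YEAR", "DATANUM", "SERIAL", "PERNUM")


def key_of(row):
    """Build the person-level key for one record."""
    return tuple(row[k] for k in KEYS)


def merge_new_vars(original, revised, new_vars=("EMPSTAT", "LABFORCE")):
    """Copy new_vars from the revised extract onto the original records.

    Returns (merged, unmatched): the original records with the new
    variables attached, and the keys of original records that have no
    counterpart in the revised extract.
    """
    lookup = {}
    for row in revised:
        k = key_of(row)
        if k in lookup:
            # The key should identify one person; stop if it does not.
            raise ValueError(f"duplicate key in revised extract: {k}")
        lookup[k] = row

    merged, unmatched = [], []
    for row in original:
        match = lookup.get(key_of(row))
        if match is None:
            unmatched.append(key_of(row))
            continue
        out = dict(row)  # leave the original record untouched
        for var in new_vars:
            out[var] = match[var]
        merged.append(out)
    return merged, unmatched


# Toy records with made-up values, standing in for the two extracts.
original = [
    {"YEAR": 2015, "DATANUM": 1, "SERIAL": 10, "PERNUM": 1, "OCC2010": 6230},
    {"YEAR": 2015, "DATANUM": 1, "SERIAL": 10, "PERNUM": 2, "OCC2010": 6260},
]
revised = [
    {"YEAR": 2015, "DATANUM": 1, "SERIAL": 10, "PERNUM": 1,
     "OCC2010": 6230, "EMPSTAT": 1, "LABFORCE": 2},
    {"YEAR": 2015, "DATANUM": 1, "SERIAL": 10, "PERNUM": 2,
     "OCC2010": 6260, "EMPSTAT": 3, "LABFORCE": 1},
    {"YEAR": 2015, "DATANUM": 1, "SERIAL": 11, "PERNUM": 1,
     "OCC2010": 6230, "EMPSTAT": 1, "LABFORCE": 2},
]

merged, unmatched = merge_new_vars(original, revised)

# Assumption: LABFORCE == 2 codes "in the labor force"; verify in the codebook.
in_labor_force = [r for r in merged if r["LABFORCE"] == 2]
```

Because the loop runs over the original records, the merged data keeps exactly the observations used in the earlier analysis, and any extra records in the revised extract are ignored. A non-empty `unmatched` list would indicate original records that are missing from the revised extract.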

I hope this helps.

 

Jul 05, 2017 - 09:48 AM


Answers

Thank you for your response. After double-checking the data, I came to the same conclusion as you. I must have changed my data set at some point during my previous analysis without recording what I did.

 

Jul 18, 2017 - 03:33 PM

