Status: Closed    Asked: May 30, 2017 - 11:47 AM

Does a 'revised' extract include everything that the original did?

I used the 'revise' function to expand the geographies covered by my original extract (IPUMS-USA), from just certain cities to those cities plus the metro areas around them. In my revision, I marked the desired metros as "special cases" and made sure all the cities were also still marked as "special cases." But when my revised extract was finished, it was about half the size of the original (81,000 KB vs. 145,000 KB after unzipping the .gz file). I thought the revised file would be bigger, not smaller. Or am I supposed to merge this revised extract with the original extract? Thank you!


Staff Answer




Yes, revising an extract retains all samples and variables from the original. However, when you select cases on both cities and metro areas, your data will include only cases that meet both specifications. Your first extract selected cases that were within certain cities; the revised extract contains only cases that are within those cities AND within the chosen metropolitan areas, which is more restrictive. This is why your revised extract is much smaller.
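To illustrate why combining the two selections shrinks the file, here is a minimal sketch in Python. It assumes the city selection was made on CITY and the metro selection on MET2013 (the thread does not say which variables were used), and the records and codes are invented for illustration.

```python
# Toy records; the CITY and MET2013 codes below are made up for illustration.
records = [
    {"id": 1, "CITY": 100, "MET2013": 9000},  # matches the city and the metro
    {"id": 2, "CITY": 0, "MET2013": 9000},    # matches the metro only
    {"id": 3, "CITY": 100, "MET2013": 0},     # matches the city only (toy case)
    {"id": 4, "CITY": 0, "MET2013": 0},       # matches neither
]
selected_cities = {100}   # hypothetical city code
selected_metros = {9000}  # hypothetical metro code

# Original extract: cases selected on city alone.
city_only = [r["id"] for r in records if r["CITY"] in selected_cities]

# Revised extract: a case must satisfy BOTH selections (logical AND).
city_and_metro = [
    r["id"]
    for r in records
    if r["CITY"] in selected_cities and r["MET2013"] in selected_metros
]

print(city_only)       # cases kept by the city-only selection
print(city_and_metro)  # cases kept when both selections apply
```

In this toy data the city-only selection keeps records 1 and 3, while the combined selection keeps only record 1, so adding a second selection can only reduce the number of cases.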

To limit the initial size of your extract, I recommend selecting cases using STATEFIP. Then, once the data are read into a statistical package, you can narrow them down to a more specific area.
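The narrowing step could be sketched as follows in Python. This assumes the state-level extract was downloaded as a CSV file with variable names as column headers, and that CITY and MET2013 are the geographic variables included; the inline sample data, the codes, and the helper name `filter_city_or_metro` are all hypothetical.

```python
import csv
import io

# Stand-in for a state-level extract; a real one would be read from the
# downloaded CSV file. The CITY and MET2013 codes are placeholders.
SAMPLE_EXTRACT = """STATEFIP,CITY,MET2013
17,100,9000
17,0,9000
17,0,0
"""


def filter_city_or_metro(rows, cities, metros):
    """Keep rows whose CITY is in `cities` OR whose MET2013 is in `metros`."""
    return [
        row
        for row in rows
        if int(row["CITY"]) in cities or int(row["MET2013"]) in metros
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE_EXTRACT)))
kept = filter_city_or_metro(rows, cities={100}, metros={9000})
print(len(rows), len(kept))
```

Filtering with OR after download keeps cases in the cities as well as cases in the surrounding metro areas, which is the broader coverage the question was after.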

Hope this helps!


May 31, 2017 - 12:33 PM



Thanks, that makes sense.


May 31, 2017 - 01:17 PM


