Status:Closed    Asked:Apr 29, 2014 - 07:26 PM

How do I create a variable for the racial composition of the PUMA using IPUMS 2000 5% census data?

I would like to create a variable that indicates the racial composition of each respondent's local area. Ideally, it would be at the neighborhood level, but that isn't possible with the publicly available census data. I know that I need to use the PUMA variable, but I don't know how to do it. The variables that I'd like to create will indicate the percentage of the PUMA that is black, white, etc., so that I can control for the racial composition of the respondent's area. I'd appreciate any direction. Thanks!


Staff Answer




In terms of computer processes, there are a number of different ways you could go about attaching PUMA-level racial composition to individual respondents, depending upon which statistical software package you are using. However, what you will ultimately be doing is summarizing the RACE variable at the PUMA level. Whatever summary statistic you choose (e.g., percent white, percent black, or one variable for each racial group), you will essentially be looking at each PUMA individually, finding its weighted RACE summary, and then creating a new variable that contains that summary value for each person within the PUMA. You could do this by first creating a separate PUMA-level dataset that contains your race summaries and then merging those values onto the individual records based on PUMA. Alternatively, your statistical software may have functions that generate new variables summarized by a grouping variable (PUMA), such as egen in Stata.
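For Stata users, the "separate PUMA-level dataset, then merge" route might look roughly like the sketch below. It is only an illustration under some assumptions: the extract in memory contains statefip, puma, race and perwt; RACE codes 1 and 2 stand for White and Black (check the codebook for your extract); and the names white, black, pct_white, pct_black and puma_race are made up for this example. PUMA codes repeat across states, so the sketch groups on statefip and puma together.

```stata
* Sketch only: run as one block from a do-file so the tempfile macro stays defined.
preserve

* 0/1 indicators for the groups of interest (assumed RACE codes: 1 = White, 2 = Black)
generate byte white = (race == 1)
generate byte black = (race == 2)

* The weighted mean of a 0/1 indicator is the weighted share within each PUMA
collapse (mean) pct_white = white pct_black = black [pweight = perwt], by(statefip puma)

* Convert shares to percentages
replace pct_white = 100 * pct_white
replace pct_black = 100 * pct_black

* Save the PUMA-level file, return to the person records, and attach the summaries
tempfile puma_race
save `puma_race'
restore
merge m:1 statefip puma using `puma_race', nogenerate
```

After the merge, every person record carries the pct_white and pct_black values of their own PUMA. Note that the percentages describe whatever sample is in memory, so any subsetting should happen after this step if the composition is meant to cover the whole PUMA population.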

I hope this helps.


May 05, 2014 - 11:42 AM



Thanks for your response, Joe! This is very helpful. I neglected to mention my software program but I am indeed using Stata so your suggestion about using egen is immensely helpful. Thanks again!


May 05, 2014 - 02:31 PM


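egen's mean() function will not work here because egen does not accept weights, and the data need to be weighted by the household or person weights. Instead, create two weighted totals with egen total() and take their ratio. Because PUMA codes are unique only within a state, group by statefip and puma together:

* total person weight in each PUMA
egen puma_total_wgt = total(perwt), by(statefip puma)

* person weight of respondents with race == 1 (White in the IPUMS RACE coding)
egen puma_total_white = total((race == 1) * perwt), by(statefip puma)

gen puma_frac_white = puma_total_white / puma_total_wgt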
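
To cover several groups at once, the same two-totals idea can be wrapped in a loop. This is a minimal sketch, not a definitive recipe: it assumes the extract contains statefip, puma, race and perwt, that RACE codes 1 and 2 are White and Black (check your codebook), and the variable names puma_wgt_all, puma_wgt_race1, puma_pct_race1, and so on are hypothetical.

```stata
* Denominator: total person weight in each state-PUMA combination
egen double puma_wgt_all = total(perwt), by(statefip puma)

* One percentage variable per RACE code listed in the loop (assumed: 1 = White, 2 = Black)
foreach g in 1 2 {
    * Numerator: person weight of respondents in group `g' within the PUMA
    egen double puma_wgt_race`g' = total((race == `g') * perwt), by(statefip puma)
    generate double puma_pct_race`g' = 100 * puma_wgt_race`g' / puma_wgt_all
    drop puma_wgt_race`g'
}
```

Adding further RACE codes to the list after "in" produces one more percentage variable per code.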


May 28, 2014 - 06:13 PM


