STRATA Variable in 2022 IPUMS USA

Jamie_Fleishman · March 3, 2024, 7:22pm

ACS 2022 5-year data does not have a strata variable. How would one use survey weights and declare a survey design for the 5-year 2022 ACS in Stata?

Isabel_Pastoor · March 5, 2024, 3:29pm

The lowest geographic unit identified in the ACS public use microdata sample (PUMS) file is the PUMA, an area containing at least 100,000 people. IPUMS geographers infer other geographic units (e.g., cities, counties) where possible. The variable STRATA is created by the IPUMS team using PUMA. Beginning with the 2022 ACS, PUMA boundaries are based on the 2020 decennial census. The 2022 5-year ACS sample therefore combines data that use the 2010 PUMA definitions (2018-2021) with data that use the 2020 PUMA definitions (2022). Our initial release of the 2022 5-year ACS PUMS does not include geographic identifiers for areas smaller than states (including STRATA), because they require special handling of these different PUMA definitions. We plan to release more detailed geographic units throughout the spring, and aim to provide the most popular variables sometime this week. Check the revision history page (or your email) for an announcement of the new variables.

Fortunately, you don’t need STRATA to set your weights or account for sample design in analyses of ACS microdata. The Census Bureau recommends using replicate weights to obtain empirically derived standard errors of these data, and the replicate weights are available for the 2018-2022 5-year ACS PUMS files via IPUMS.
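For reference, the replicate-weight standard error described in the Census Bureau's PUMS accuracy documentation can be written as follows. The symbols are notation chosen here for illustration: \hat{\theta} is the estimate computed with the full-sample weight (PERWT or HHWT), and \hat{\theta}_r is the same estimate recomputed with the r-th of the 80 replicate weights.

```latex
\mathrm{SE}(\hat{\theta}) = \sqrt{\frac{4}{80} \sum_{r=1}^{80} \left(\hat{\theta}_r - \hat{\theta}\right)^2}
```

In Stata terms, BRR with a Fay adjustment of 0.5 scales the sum of squared deviations by 1/(80 × (1 − 0.5)²) = 4/80, and the mse option measures deviations from the full-sample estimate, which matches this formula.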

Jamie_Fleishman · March 11, 2024, 6:58pm

Thank you Isabel Pastoor!
I have another question: I’m having a hard time using replicate weights. It is taking an extremely long time to run svy: tab multyear after
svyset [pweight=perwt], vce(brr) brrweight(repwtp1-repwtp80) fay(.5) mse
Do you know if this is a common issue?

Thanks again!

Isabel_Pastoor · March 13, 2024, 5:02pm

In my personal experience using Stata, commands run with the svy prefix tend to take a while. I don’t see anything wrong with your code. If you are working with a very large dataset, on a computer with limited processing power, or with an older or less powerful version of Stata, you can expect these commands to take a while.

Here are two suggestions that may help:

Run your analysis on single years of ACS data, one at a time
Run your code without replicate weights first to determine if the code is efficient/fast and error-free before adding the replicate weights
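The second suggestion could look something like the sketch below. It assumes the extract includes PERWT, MULTYEAR, and the person-level replicate weights (REPWTP); the timer calls are only there to compare run times. Note that the standard errors from step 1 do not reflect the ACS sample design, so step 1 is a check on the code, not a final result.

```stata
* Step 1: declare the person weight only, so svy uses its default
* linearized variance estimator and the command runs quickly.
svyset [pweight=perwt]
timer clear
timer on 1
svy: tab multyear
timer off 1

* Step 2: once the command runs cleanly, add the 80 replicate weights.
svyset [pweight=perwt], vce(brr) brrweight(repwtp1-repwtp80) fay(.5) mse
timer on 2
svy: tab multyear
timer off 2

* Report elapsed seconds for each step.
timer list
```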

If running your analysis is still untenably slow, you may want to perform a check with STRATA and compare the standard errors you get using STRATA versus using replicate weights. The two methods sometimes, but not always, produce substantially different standard error estimates, depending on the particulars of the analysis.
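A rough sketch of that comparison in Stata is below. It assumes the extract includes the IPUMS variables STRATA and CLUSTER (which, as noted above, were not in the initial release of the 2022 5-year sample), and it uses age purely as a placeholder analysis variable.

```stata
* Replicate-weight standard errors
svyset [pweight=perwt], vce(brr) brrweight(repwtp1-repwtp80) fay(.5) mse
svy: mean age
estimates store se_repwt

* Linearized standard errors using CLUSTER as the sampling unit and STRATA as strata
svyset cluster [pweight=perwt], strata(strata)
svy: mean age
estimates store se_strata

* Show both sets of point estimates and standard errors side by side
estimates table se_repwt se_strata, se
```

The point estimates should be identical across the two columns, since both use PERWT; only the standard errors can differ.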
