Status: Closed    Asked: May 09, 2017 - 06:48 PM

Know of any tutorials for using the R survey package with ACS PUMS? Having problems weighting data.

Here's the syntax of my query in R:

> <- svydesign(id=~serial, strata=~strata, data=workers1, weights=perwt)

And here's the result:

Error in inherits(weights, "formula") : object 'perwt' not found

"Perwt" is a variable in the data frame workers1, so the error message doesn't make sense. Serial and strata (a combo of statefip and puma) also are variables in the data frame. I'm trying to create svydesign, a more-or-less mandatory element in the survey package, so that I can run some statistical procedures on the data.


Voted Best Answer

To answer your immediate question: in my data set, perwt is capitalized and R is case-sensitive. "PERWT" works for me -- you also need to be using parenthesis around the weights variable because it's not an object, it's a column name.
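A note on the error itself: writing weights=perwt asks R to look for an object named perwt in the workspace rather than a column in workers1, which is why it reports "object 'perwt' not found". The survey package documentation describes the weights argument as a formula or a vector, so the formula form (~perwt, or ~PERWT if that is how the column is spelled in your extract) is one way to point at the column. A minimal sketch, assuming a toy data frame with invented values; the incwage column and the object name design are hypothetical stand-ins, not part of the original question:

```r
library(survey)

# Toy stand-in for the workers1 extract. All values are invented for illustration.
workers1 <- data.frame(
  serial  = c(1, 1, 2, 3, 4, 5, 6, 6),          # household id (cluster)
  strata  = c("A", "A", "A", "A", "B", "B", "B", "B"),
  perwt   = c(20, 20, 35, 15, 40, 25, 30, 30),  # person weight
  incwage = c(30000, 18000, 52000, 41000, 28000, 61000, 35000, 22000)  # hypothetical
)

# Check the exact spelling and case of the column names first.
names(workers1)

# Pass ids, strata and weights as one-sided formulas so they are
# looked up as columns of `data`, not as objects in the workspace.
design <- svydesign(id = ~serial, strata = ~strata, weights = ~perwt,
                    data = workers1)

# Weighted mean with a design-based standard error.
svymean(~incwage, design)
```

If the column in your data frame is PERWT rather than perwt, the formula has to match that spelling exactly.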

I haven't found a good way to use weights in R, but I have been using data.table, which aggregates and summarizes quite nicely with weighted data. It may be an option you choose to explore. For my purposes, the survey package isn't quite right.
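For anyone curious what the data.table route might look like, here is a minimal sketch of a weighted count and weighted mean by group. The data values, the incwage column and the object names dt and est are invented for illustration. Note that this produces point estimates only; it does not give design-based standard errors.

```r
library(data.table)

# Toy person-level records. All values are invented for illustration.
dt <- data.table(
  statefip = c(27, 27, 27, 55, 55),
  incwage  = c(30000, 52000, 41000, 28000, 61000),  # hypothetical variable
  PERWT    = c(20, 35, 15, 40, 25)                  # person weight
)

# Weighted population count and weighted mean income per state.
est <- dt[, .(
  persons  = sum(PERWT),
  mean_inc = weighted.mean(incwage, PERWT)
), by = statefip]

print(est)
```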




May 09, 2017 - 09:06 PM



What do you mean that library(survey) isn't quite right? If you are not using it, you are most likely getting your standard errors wrong. (Well, maybe you simply don't care about standard errors, which would be a different story.)


May 10, 2017 - 07:56 AM


Thanks, Dillon. I'm finding the survey package more than a little frustrating. I hadn't realized that data.table was good for weighting data. I'll look into that solution.


May 10, 2017 - 12:13 PM


I meant quotation marks, not parentheses! Ack.

You're right, it messes up the standard error. I've had to recalculate -- again using data.table. The survey package has a standard set of statistics that it calculates, and I find it very difficult to analyze data with it.


May 10, 2017 - 03:50 PM


