Question

Status: Closed    Asked: Oct 02, 2018 - 03:25 PM

How can I use replicate weights to create standard errors in R?

I am using R to analyze CPS data on household income and would like to use the replicate weights to create standard errors.


I am aware that such code exists for Stata and other statistical software, but I am having trouble translating it to R.

 
 

Answers

Note that the CPS weighting system changed slightly this year, and not all of our documentation has been updated yet. There used to be a single weight variable name across all supplements (WTSUPP), but the variable name now depends on which supplement you are using. The examples below use ASEC data, and therefore ASECWT; see the chart at this link to find the variable you should use:

https://cps.ipums.org/cps/weights_ren...


Here are some examples, first using the survey package and then the srvyr package (which is built on survey but uses dplyr-style syntax).



library(ipumsr)
library(dplyr)

# Read the data and do some light formatting
data <- read_ipums_micro("cps_00021.xml")
#> Use of data from IPUMS-CPS is subject to conditions including that users should
#> cite the data appropriately. Use command `ipums_conditions()` for more details.

data <- data %>%
  mutate(
    AGE = as.numeric(AGE),
    SEX = as_factor(SEX),
    # Recode the top INCTOT codes (N.I.U./missing) to NA before converting
    INCTOT = as.numeric(lbl_na_if(INCTOT, ~.val >= 99999990))
  )
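
Before building the design, it can help to check which columns the repweights pattern used below will pick up, since svrepdesign treats a character string there as a regular expression matched against the column names. Here is a minimal base R sketch of that check. The column names are made up for illustration, and the bare REPWTP flag column is an assumption about what an extract might contain, not something taken from this extract.

```r
# Made-up column names standing in for names(data); REPWTP (no digits) is a
# hypothetical flag column included to show what the pattern skips.
toy_names <- c("YEAR", "AGE", "SEX", "INCTOT", "ASECWT", "REPWTP",
               paste0("REPWTP", 1:160))

# Same pattern as the repweights argument in the examples
rep_cols <- grep("REPWTP[0-9]+", toy_names, value = TRUE)

length(rep_cols)               # 160 replicate weight columns matched
head(rep_cols, 3)              # first few matched names
setdiff(toy_names, rep_cols)   # columns the pattern leaves alone
```

On a real extract, swap toy_names for names(data); the count should equal the length of the rscales vector (160 in these examples).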


## R (survey package) -----

# If not installed already: install.packages("survey")
library(survey)

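# Note: Census documentation for the 160 CPS replicate weights gives a
# variance factor of 4/160; check `scale` against it for your extract.
# The outputs shown below were produced with scale = 4/60.
svy <- svrepdesign(
  data = data,
  weights = ~ASECWT,
  repweights = "REPWTP[0-9]+",
  type = "JK1",
  scale = 4/60,
  rscales = rep(1, 160),
  mse = TRUE
)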


# Calculate the mean of INCTOT
svymean(~INCTOT, svy, na.rm = TRUE)
#>         mean     SE
#> INCTOT 42526 383.64

# Calculate the mean of INCTOT on the subset of people aged 25-64
svy_subset <- subset(svy, AGE >= 25 & AGE < 65)
svymean(~INCTOT, svy_subset, na.rm = TRUE)
#>         mean     SE
#> INCTOT 51407 496.95

# Calculate the mean of INCTOT by SEX
svyby(~INCTOT, ~SEX, svy, svymean, na.rm = TRUE)
#>           SEX   INCTOT       se
#> Male     Male 53196.41 637.2199
#> Female Female 32456.95 325.3275




## R (srvyr package - uses dplyr-like syntax) -----

# If not installed already: install.packages("srvyr")
library(srvyr)

svy <- as_survey(
  data,
  weights = ASECWT,
  repweights = matches("REPWTP[0-9]+"),
  type = "JK1",
  scale = 4/60,
  rscales = rep(1, 160),
  mse = TRUE
)


# Calculate the mean of INCTOT
svy %>%
  summarize(mn = survey_mean(INCTOT, na.rm = TRUE))
#> # A tibble: 1 x 2
#>       mn mn_se
#>    <dbl> <dbl>
#> 1 42526.  384.

# Calculate the mean of INCTOT on the subset of people aged 25-64
svy %>%
  filter(AGE >= 25 & AGE < 65) %>%
  summarize(mn = survey_mean(INCTOT, na.rm = TRUE))
#> # A tibble: 1 x 2
#>       mn mn_se
#>    <dbl> <dbl>
#> 1 51407.  497.

# Calculate the mean of INCTOT by SEX
svy %>%
  group_by(SEX) %>%
  summarize(mn = survey_mean(INCTOT, na.rm = TRUE))
#> # A tibble: 2 x 3
#>   SEX        mn mn_se
#>   <fct>   <dbl> <dbl>
#> 1 Male   53196.  637.
#> 2 Female 32457.  325.
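
If it helps to see what these functions do with the replicate weights, here is a minimal base R sketch of a replicate-weight standard error for a weighted mean: compute the estimate with the full-sample weight, recompute it once per replicate weight, then scale the sum of squared differences. Differences are taken from the full-sample estimate, which is what mse = TRUE requests in the examples above. rep_weight_se is a hypothetical helper, not part of survey or srvyr, and the toy data is simulated rather than taken from CPS. The scale argument is left to the caller, so the 4 / n_rep used on the toy data is only an illustrative choice.

```r
# Hypothetical helper (not part of survey or srvyr): replicate-weight
# standard error of a weighted mean, centered on the full-sample estimate.
rep_weight_se <- function(x, full_wt, rep_wts, scale) {
  # Drop missing values of x, as na.rm = TRUE does in the examples
  keep <- !is.na(x)
  x <- x[keep]
  full_wt <- full_wt[keep]
  rep_wts <- rep_wts[keep, , drop = FALSE]

  # Estimate using the full-sample weight
  theta0 <- weighted.mean(x, full_wt)

  # Same estimate recomputed once per replicate weight column
  theta_r <- apply(rep_wts, 2, function(w) weighted.mean(x, w))

  # Scaled sum of squared deviations from the full-sample estimate
  c(mean = theta0, se = sqrt(scale * sum((theta_r - theta0)^2)))
}

# Simulated toy data standing in for a CPS extract (not real CPS values)
set.seed(1)
n <- 200
n_rep <- 160
toy_income <- rlnorm(n, meanlog = 10, sdlog = 1)
toy_wt <- runif(n, 500, 1500)
toy_rep_wts <- sapply(seq_len(n_rep), function(i) toy_wt * runif(n, 0.5, 1.5))

res <- rep_weight_se(toy_income, toy_wt, toy_rep_wts, scale = 4 / n_rep)
print(res)
```

On the real extract, the inputs would be data$INCTOT, data$ASECWT, and the matrix of REPWTP columns, with scale set to the factor given in the documentation for your replicate weights. For anything beyond a quick check, stick with survey or srvyr, which also handle subsets, groups, and other statistics.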


 

Oct 05, 2018 - 09:09 AM

