Status:Closed    Asked:Aug 05, 2015 - 05:47 PM

Is there an issue with the WTFINL variable in the CPS monthly data from 2000-2004?

When I calculate the proportion of males by state and fiscal-year quarter for people older than 17, I see much more variation in the estimates for 2000-2004 than for 2005-2014. I am aware of the issue with the WTFINL variable in April and June 2001, but I am not sure that is the only issue affecting my results. The code below reproduces what I am seeing. It needs AGE, SEX, HISPAN, YEAR, MONTH, STATEFIP, and WTFINL.

* setwd

* load cps data

* FY Quarter model

* Quarter 1 is Oct, Nov, and Dec

gen quarter = 0

* Quarter 2 is Jan, Feb, and Mar

replace quarter = 1/4 if month == 1 | month == 2 | month == 3

* Quarter 3 is Apr, May, and June

replace quarter = 2/4 if month == 4 | month == 5 | month == 6

* Quarter 4 is July, Aug, and Sep

replace quarter = 3/4 if month == 7 | month == 8 | month == 9

gen year_quarter = year + quarter

replace year_quarter = year_quarter+1 if quarter ==0

drop if age <= 17

gen population = 1

gen male = sex==2

gen hispanic = hispan != 0

keep male hispanic population wtfinl statefip year_quarter

collapse (mean) male hispanic (count) population [pweight = wtfinl], by(statefip year_quarter)

* Most prominent in proportion of males

twoway (line male year_quarter), by(statefip)

* To a lesser, but still noticeable, extent in proportion Hispanic

twoway (line hispanic year_quarter), by(statefip)

* Sharp dip in population circa 2001. Most noticeable in California

twoway (line population year_quarter), by(statefip)


Staff Answer




I was able to replicate the additional variability before 2005; however, it does not appear to be caused by the WTFINL variable. While the CPS can be used for state-level analyses, researchers should proceed with caution and be aware of the large standard errors that come with small sample sizes (especially for subpopulations such as males or Hispanic respondents). Additionally, the Census Bureau implemented improvements to the second-stage weighting and composite weighting procedures. These improvements are known to increase the stability of state demographic estimates over time, so they could also be responsible for the lower variance you are seeing in the more recent samples.
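To get a rough sense of how small the state-by-quarter cells are, something like the following sketch may help. It assumes it is run on the person-level data from the question, before the collapse step, with statefip and year_quarter already defined. The names cell_n, cell_tag, and pre2005 are made up for illustration.

```stata
* Unweighted number of respondents in each state-by-quarter cell
bysort statefip year_quarter: gen long cell_n = _N

* Flag one observation per cell so each cell is summarized once
egen byte cell_tag = tag(statefip year_quarter)

* Compare cell sizes before and from fiscal year 2005 onward
gen byte pre2005 = year_quarter < 2005
tabstat cell_n if cell_tag, by(pre2005) statistics(min p25 median max)
```

Cells with few respondents will produce noisy proportions regardless of how the weights were constructed.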

Also note that males correspond to SEX=1. It appears you mistakenly coded males as respondents with SEX=2 in your sample code.
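A minimal sketch of the corrected indicator, assuming the extract is in memory and the variable is named sex as in the code from the question:

```stata
* Males are SEX == 1 in IPUMS CPS (SEX == 2 is female)
gen byte male = (sex == 1) if !missing(sex)

* Cross-check the new indicator against the original variable
tabulate sex male, missing
```

The if !missing(sex) condition leaves male missing rather than 0 for any record without a valid value of sex.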

Hope this helps.


Aug 14, 2015 - 01:04 PM
