Question

Status: Closed    Asked: Mar 10, 2017 - 04:05 PM

Contrary to the documentation, does EARNWEEK have two implied decimal places?

I have extracted EARNWEEK in a sample of data from 2012-2016. I am using R to read in the data; my read_fwf command is as follows:


library(readr)  # provides read_fwf() and fwf_widths()

# `input` is the directory that holds the extract file
raw <- read_fwf(
  file = paste0(input, "/cps_00005.dat"),
  fwf_widths(
    # column widths, in file order
    c(4, 5, 10, 14, 2, 2, 1, 1, 4, 5, 4, 2, 2, 14, 10, 10, 1, 4,
      2, 1, 3, 1, 3, 2, 2, 4, 4, 2, 3, 3, 4, 1, 7, 8, 2, 1, 1, 1),
    # column names, in file order
    c("year", "serial", "hwtsupp", "cpsid", "region", "statefip",
      "asecflag", "hflag", "metarea", "county", "cpi99", "month",
      "pernum", "cpsidp", "wtsupp", "earnwt", "nchild", "relate",
      "age", "sex", "race", "marst", "hispan", "educ99",
      "empstat", "occ", "ind", "classwkr", "uhrsworkt", "uhrswork1",
      "hourwage", "union", "incwage", "earnweek", "wkstat",
      "qempstat", "qocc", "qearnwee")
  ),
  # one letter per column: i = integer, c = character
  col_types = "iciciiiicciicciiiiiiiiciiiiiiiiiiiiiii"
)


Documentation for EARNWEEK reads as follows:


"The values in EARNWEEK are in dollars, with no implied decimal places; a value of 500 means that the respondent earned five hundred dollars per week before deduction."


However, the missing-value code is listed with two decimal places:


Codes

9999.99 = N.I.U. (Not in Universe).

Top codes:

1990-1997: 1923 (Weekly earnings of $1923 or more).
1998-onward: 2885 (Weekly earnings of $2885 or more: ASEC samples only). 2884.61 for non-ASEC samples.

Within R, I find that the minimum value of raw$earnweek is 100 and the maximum is 999999. The second-highest value appears to be the top code with two implied decimal places: 288500. Am I missing something, or does the documentation need to be corrected? Thanks for your time and all that you do!
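For readers who want to reproduce this kind of check, here is a minimal sketch in base R. The vector `earnweek_raw` is a hypothetical stand-in for raw$earnweek, built from made-up values rather than real CPS records.

```r
# Toy values standing in for raw$earnweek (hypothetical, not real CPS data)
earnweek_raw <- c(100L, 45000L, 288500L, 288500L, 999999L, 999999L)

# Smallest and largest values as read
range(earnweek_raw)

# Two largest distinct values; the second is the candidate top code
top_two <- head(sort(unique(earnweek_raw), decreasing = TRUE), 2)
top_two

# The same values if the last two digits are treated as cents
top_two / 100
```

Under that reading, 999999 lines up with the documented N.I.U. code 9999.99 and 288500 with the documented top code 2885.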

Best,

Lowell

 
 

Staff Answer


Jeff Bloem

Staff

Although there are no implied decimal places in the EARNWEEK variable, there are actual decimals in the data. For most of these codes (i.e., all but the NIU code), the decimal places are filled with zeros. Perhaps the documentation should be updated to make this clearer. It seems that when these data are read into R, the decimal point is somehow dropped. So records with the code 288500 are really identifying the top code of $2885 or more per week.
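One possible way to handle this after import, sketched in base R under the assumption that the values arrive as integers with the last two digits representing cents (as the 288500 and 999999 values in the question suggest). The vector `earnweek_raw` is a hypothetical stand-in for raw$earnweek, not real CPS data.

```r
# Hypothetical integer values as read with an "i" column type
earnweek_raw <- c(100L, 50000L, 288500L, 999999L)

# Restore the two decimal places that were dropped on import
earnweek <- earnweek_raw / 100

# Recode the N.I.U. code (9999.99 in the documentation) as missing.
# Comparing on the integer scale avoids floating-point equality issues.
earnweek[earnweek_raw == 999999L] <- NA

earnweek
```

After this step, 288500 becomes 2885, which matches the documented top code for ASEC samples.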

I hope this helps.

 

Mar 22, 2017 - 09:27 AM

