Status:Closed    Asked:Mar 10, 2017 - 04:05 PM

Contrary to the documentation, does EARNWEEK have two implied decimal places?

I have extracted EARNWEEK in a sample of data from 2012-2016. I am using R to read in the data; my read_fwf command is as follows:

library(readr)  # provides read_fwf() and fwf_widths()

raw <- read_fwf(
  file = paste0(input, "/cps_00005.dat"),
  # column widths, followed by the matching column names
  fwf_widths(c(4, 5, 10, 14, 2, 2, 1, 1, 4, 5, 4, 2, 2, 14, 10, 10, 1, 4,
               2, 1, 3, 1, 3, 2, 2, 4, 4, 2, 3, 3, 4, 1, 7, 8, 2, 1, 1, 1),
             c("year", "serial", "hwtsupp", "cpsid", "region", "statefip",
               "asecflag", "hflag", "metarea", "county", "cpi99", "month",
               "pernum", "cpsidp", "wtsupp", "earnwt", "nchild", "relate",
               "age", "sex", "race", "marst", "hispan", "educ99",
               "empstat", "occ", "ind", "classwkr", "uhrsworkt", "uhrswork1",
               "hourwage", "union", "incwage", "earnweek", "wkstat",
               "qempstat", "qocc", "qearnwee")),
  # one type code per column: i = integer, c = character
  col_types = "iciciiiicciicciiiiiiiiciiiiiiiiiiiiiii")

Documentation for EARNWEEK reads as follows:

"The values in EARNWEEK are in dollars, with no implied decimal places; a value of 500 means that the respondent earned five hundred dollars per week before deduction."

However, missing values are coded with two decimal places:


9999.99 = N.I.U. (Not in Universe).

Top codes:

1990-1997: 1923 (Weekly earnings of $1923 or more).
1998-onward: 2885 (Weekly earnings of $2885 or more: ASEC samples only). 2884.61 for non-ASEC samples.

In R, I find that the minimum value of raw$earnweek is 100 and the maximum is 999999. The second-highest value appears to be the top code with two implied decimals: 288500. Am I missing something, or does the documentation need to be corrected? Thanks for your time and all that you do!
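One way to check the two-decimal reading is to look at the last two digits of the raw integers. The following is only a rough sketch: earnweek_raw is a hypothetical stand-in for raw$earnweek, built from the codes quoted above plus one made-up ordinary value (50000).

```r
# Hypothetical sample of raw integer values like those described above
earnweek_raw <- c(100L, 50000L, 288461L, 288500L, 999999L)

# Last two digits of each value; under a two-implied-decimal
# reading these would be the cents
cents <- earnweek_raw %% 100L

# Candidate dollar values under the two-implied-decimal reading
dollars <- earnweek_raw / 100

print(data.frame(earnweek_raw, cents, dollars))
```

If the non-zero "cents" show up only on codes such as 288461 and 999999, that is consistent with the documented values 2884.61 and 9999.99.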


Staff Answer


Jeff Bloem


Although the EARNWEEK variable has no implied decimal places, the data do contain actual decimals. For most of these codes (i.e., all but the NIU code), the decimal places are filled with zeros. Perhaps the documentation should be updated to make this clearer. It seems that the decimal point is somehow dropped when the data are read into R, so records with the code 288500 are really identifying the top code of $2885 or more per week.
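As a minimal sketch of how the integer values could be put back on a dollar scale in R: this assumes the values arrive with the decimal point dropped, as described above, and that 999999 corresponds to the N.I.U. code 9999.99. The helper rescale_earnweek and the sample vector earnweek_raw are hypothetical names used only for illustration.

```r
# Hypothetical helper: convert raw integer EARNWEEK values to dollars.
# Assumes two dropped decimal digits (e.g. 288500 means 2885.00) and
# that 999999 is the N.I.U. code (9999.99 in the documentation).
rescale_earnweek <- function(x, niu_raw = 999999L) {
  dollars <- x / 100
  # which() skips any NA comparisons, so missing inputs stay NA
  dollars[which(x == niu_raw)] <- NA_real_
  dollars
}

# Hypothetical stand-in for raw$earnweek
earnweek_raw <- c(100L, 288461L, 288500L, 999999L)
earnweek <- rescale_earnweek(earnweek_raw)
print(earnweek)
```

Setting N.I.U. to NA is a design choice for this sketch; keeping it as a separate flag may be preferable if the universe matters for the analysis.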

I hope this helps.


Mar 22, 2017 - 09:27 AM
