Question

Status:Closed    Asked:Jul 25, 2018 - 10:29 PM

Top-coded variables

When the salary variable in the Higher Ed dataset is described as top-coded, what happens to values above the top-code? Is the value (1) anonymized and marked as missing, (2) removed from the dataset along with its observation, or (3) rounded down to the value at which it is top-coded?


Thank you!

 
 

Staff Answer


Jeff Bloem

Staff

Your case number 3 is the most accurate. The SALARY variable in IPUMS Higher Ed is top-coded at 150,000 US dollars (except in the SESTAT-NSCG and NSRCG surveys, where these values are top-coded at 100,000 US dollars). This means that any SALARY value reported above the top-code is replaced with the top-code value. For example, if someone reports a salary of 170,000 US dollars, this value will show up as 150,000 in the data.
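As a rough illustration of the replacement rule described above, here is a minimal sketch in Python (not Stata, and not IPUMS code). The function name `top_code` and its `cap` parameter are hypothetical, introduced only for this example; the default of 150,000 comes from the answer above.

```python
def top_code(reported, cap=150_000):
    """Return the value as it would appear in top-coded data.

    Values above the cap are replaced with the cap itself;
    values at or below the cap are left unchanged.
    `cap` is an illustrative parameter, not an IPUMS setting.
    """
    return min(reported, cap)


# A reported salary of 170,000 shows up as 150,000.
print(top_code(170_000))
# A reported salary below the top-code is unchanged.
print(top_code(90_000))
```

The observation stays in the data and is not marked as missing; only the value is capped, so a salary of exactly 150,000 cannot be told apart from a higher one.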

 

Jul 26, 2018 - 09:28 AM


Answers

Thanks for that, Jeff, I really appreciate it! One follow-up question, if you wouldn't mind. As you mentioned, SALARY in the SESTAT-NSCG and NSRCG surveys is top-coded at $100,000. Examining the data in Stata for, say, 1995 and 2013, I see observations from these two surveys reaching up to $150,000 in salary. Do you know why this is?


Thank you!

 

Jul 26, 2018 - 05:49 PM


You are right about this! The documentation is misleading in this case. It should say that SALARY is top-coded at 100,000 US dollars in the SESTAT-NSRCG surveys and at 150,000 US dollars in all other surveys. Sorry for the confusion here. We will update the documentation appropriately.

 

Jul 27, 2018 - 08:57 AM


Apologies, but even in the SESTAT-NSRCG surveys, observations seem to be top-coded at $150,000? All surveys appear to include observations up to $150,000.



 

Jul 27, 2018 - 09:18 AM


I don't think so. After cleaning out observations with special codes for skips and missing values, I find the following:

. by surid, sort : summarize salary

-----------------------------------------------------------------------
-> surid = NSCG

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
      salary |     457043    65154.34     36686.89         0     150000

-----------------------------------------------------------------------
-> surid = NSRCG

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
      salary |      90087    36660.75      20529.4         0     100000

Perhaps I am missing something?
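For anyone who wants to repeat this kind of check outside Stata, the sketch below shows the same idea in plain Python: drop the special codes first, then take the maximum salary within each survey. The special-code values in `SPECIAL_CODES` and the sample records are made up for illustration; the actual codes should be taken from the SALARY variable documentation.

```python
# Hypothetical special codes for skips and missing values (an assumption;
# look up the real codes in the SALARY documentation).
SPECIAL_CODES = {9999998, 9999999}


def max_salary_by_survey(records):
    """Return the largest non-special salary for each survey.

    `records` is an iterable of (surid, salary) pairs.
    """
    maxima = {}
    for surid, salary in records:
        if salary in SPECIAL_CODES:
            continue  # skip special codes so they do not look like salaries
        maxima[surid] = max(salary, maxima.get(surid, salary))
    return maxima


# Made-up records, not real survey data.
sample = [
    ("NSCG", 150000),
    ("NSCG", 9999998),
    ("NSRCG", 100000),
    ("NSRCG", 36000),
]
print(max_salary_by_survey(sample))
```

If the special codes are left in, they can show up as the maximum and hide the top-code.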

 

Jul 27, 2018 - 09:33 AM


Apologies, you're right, sorry!

 

Jul 27, 2018 - 09:51 AM

