Status:Closed    Asked:Jul 24, 2016 - 04:34 PM

Are the variable names in the Stata .dta file upper case or lower case?

Dear Folks--

The IPUMS-CPS variable names as they appear on the IPUMS-CPS web pages, in the .dat fixed-width files, in the .csv files, in the codebook, and in the xtml documentation are upper case. The variable names as they appear in the Stata .do file are lower case.

The R package "haven" converts .dta files to R data frames. R is case sensitive: HHINCOME does not equal hhincome. The variable names in the data frame as it emerges from haven are lower case. Is that because haven is converting them to lower case? (It shouldn't.) Or is it because you are sending the .dta files out with a different case than the other files? And if the latter, was that intentional (and if so, why?) or unintentional (and if so, will you be changing it any time soon)?

Warmest regards, Andrew


Staff Answer




The lowercase variable names in Stata extracts and syntax files are intentional. The Stata convention is to name variables in all lowercase characters. Although Stata itself does not enforce this, it is common practice among most users, so it is what we support.
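
If you would rather have the names in R match the upper-case names in the documentation, one option is to rename the columns after import. The following is only a minimal sketch: the data frame `cps` and its values are made up for illustration, and the file name in the comment is hypothetical. It uses base R's `toupper()` on the column names and assumes that the data frame haven returns can be renamed the same way as any other data frame.

```r
# Stand-in for a data frame read from a Stata extract. With haven this would
# be something like the line below (the file name is hypothetical):
# cps <- haven::read_dta("cps_extract.dta")
cps <- data.frame(year = c(2015, 2016), hhincome = c(52000, 61000))

# Convert every column name to upper case so that the names match the
# upper-case variable names used in the documentation and codebook.
names(cps) <- toupper(names(cps))

print(names(cps))
```

After this step, columns are referenced by their upper-case names, for example `cps$HHINCOME`.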

I hope this helps.


Jul 25, 2016 - 02:10 PM
