Are the variable names in the Stata .dta file upper case or lower case?

Dear Folks–

The IPUMS-CPS variable names as they appear on the IPUMS-CPS web pages, in the .dat fixed-width files, in the .csv files, in the codebook, and in the XML documentation are upper case. The variable names as they appear in the Stata .do file are lower case.

The R package “haven” converts .dta files to R data frames. R is case sensitive: HHINCOME does not equal hhincome. The variable names in the data frame as it emerges from haven are lower case. Is that because haven is converting them to lower case? (It shouldn’t.) Or is it because you are sending the .dta files out with a different case than the other files? And if the latter, was that intentional (and if so, why?) or unintentional (and if so, will you be changing it any time soon?)

Warmest regards, Andrew

The lowercase variable names in Stata extracts and syntax files are intentional. The Stata convention is to name variables using all lowercase characters. Although Stata itself does not enforce this, it is common practice among most users, so this is what we support.
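
If you would rather work with the upper-case names used in the other formats, one option is to convert them after import. Here is a minimal sketch in R, assuming the extract has already been read with haven. The file name in the comment is hypothetical, and a small stand-in data frame with made-up values is used so the example runs on its own:

```r
# In practice the data frame would come from haven, for example:
#   cps <- haven::read_dta("cps_extract.dta")  # hypothetical file name
# Stand-in with lower-case names, as they appear in the Stata extract
# (the values are made up for illustration):
cps <- data.frame(year = c(2019, 2020), hhincome = c(52000, 61000))

# Convert every variable name to upper case to match the codebook
names(cps) <- toupper(names(cps))

# Inspect the renamed variables
names(cps)
```

The same approach works in the other direction with tolower() if you read one of the upper-case formats and want names that match the Stata files.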

I hope this helps.