I am trying to estimate the number of foreign language speakers by county

I am trying to estimate the number of foreign language speakers by county for the State of Pennsylvania. Some counties do not appear in the IPUMS files (I had assumed this was because of sample size or privacy issues). Cities such as Allentown do appear in the 5-year files, but the county for these cities is coded “0”. How can I account for these counties in my analyses?

From 1950 onwards, the public-use census files do not contain county information. As a result, IPUMS must identify counties using the lowest level of geography available. For the ACS samples, this is the PUMA. Counties are only identifiable in IPUMS if the borders of the county align with the borders of a single PUMA or if multiple PUMAs are completely contained within the borders of a single county.

You will notice in the Pennsylvania PUMA map that much of rural PA has PUMAs that extend into multiple counties. Since PUMA is the lowest identifiable level of geography, it is not possible in these situations to determine the county in which a respondent resides. As for CITY, Allentown happens to perfectly match the borders of PUMA 03600 and can therefore be completely identified.
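To make the identification rule described above concrete, here is a minimal sketch in Python. It assumes a hypothetical, complete list of PUMA-to-county overlaps; the PUMA labels, county names, and the function name `identifiable_counties` are made up for illustration and do not come from an actual IPUMS or Census crosswalk file.

```python
from collections import defaultdict

# Hypothetical overlap list: each pair says "this PUMA covers part of this county".
# Labels are invented for illustration, not real Pennsylvania geography.
overlaps = [
    ("A", "County 1"),  # PUMA A matches County 1 exactly
    ("B", "County 2"),  # PUMAs B and C both sit entirely inside County 2
    ("C", "County 2"),
    ("D", "County 3"),  # PUMA D spans County 3 and County 4
    ("D", "County 4"),
]

def identifiable_counties(overlaps):
    """Return counties whose PUMAs all lie entirely within that one county."""
    counties_by_puma = defaultdict(set)
    pumas_by_county = defaultdict(set)
    for puma, county in overlaps:
        counties_by_puma[puma].add(county)
        pumas_by_county[county].add(puma)
    # A county can be identified only if no PUMA touching it also touches
    # another county.
    return sorted(
        county
        for county, pumas in pumas_by_county.items()
        if all(counties_by_puma[p] == {county} for p in pumas)
    )

print(identifiable_counties(overlaps))
```

In this made-up example, County 1 and County 2 can be identified, while County 3 and County 4 cannot, because the one PUMA that covers them also crosses the county line.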

If you need individual-level microdata, then it will not be possible to identify all counties in PA using IPUMS data. However, if you simply need aggregated county-level information, then you have a couple of options. First, the NHGIS project, which is also housed in the Minnesota Population Center, offers many of the official Census tables going back to 1790 at several geographic levels. While NHGIS is primarily focused on mapping data, the summary tables can be extracted independently of the shape files as .csv files with optional descriptive headers. Second, the Census Bureau's American FactFinder site is another source for Census tables at the county level.
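If you do work with the microdata, the following rough sketch shows why the unidentified counties cannot be separated in a weighted tabulation. The rows, the field names (`county`, `perwt`, `foreign_language`), and the county codes are all invented for illustration; an actual extract would use the variable names and codes from your own IPUMS download.

```python
from collections import defaultdict

# Made-up person records. "county" 0 stands for "county not identifiable",
# "perwt" is a person weight, and "foreign_language" is a hypothetical flag
# you would derive from the language variable in your extract.
rows = [
    {"county": 0, "perwt": 120.0, "foreign_language": True},
    {"county": 0, "perwt": 80.0, "foreign_language": False},
    {"county": 1, "perwt": 95.0, "foreign_language": True},
    {"county": 1, "perwt": 105.0, "foreign_language": True},
    {"county": 3, "perwt": 60.0, "foreign_language": False},
]

def weighted_speakers_by_county(rows):
    """Sum person weights of foreign language speakers within each county code."""
    totals = defaultdict(float)
    for row in rows:
        if row["foreign_language"]:
            totals[row["county"]] += row["perwt"]
    return dict(totals)

print(weighted_speakers_by_county(rows))
```

Everyone living in an unidentified county lands in the single code 0 group, so that total describes all such counties combined and cannot be split back into individual counties.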

Hope this helps.