Extracting multiple IHIS years at once, and separating by year, or extracting multiple years individually?

Hi,

I am working with IHIS data from 1997-2013, constructing separate, individual data files by year (e.g. one file for 1997, one for 1998, etc. through 2013). All of these files will contain the same set of variables, though certain variables will be unavailable in a given year depending on whether or not the question was asked or how the responses were coded (e.g. total family income categories change for 2006 onward).

Is it safe to assume that I can extract the full set of data for 1997-2013 and then use Stata to create these separate files by year, using syntax like "keep if year==1997"? The alternative would be extracting data for each year, one at a time. I just wanted to check whether there is any substantive difference between these two methods that I might be overlooking, especially with respect to the process by which IHIS pools and harmonizes data.

Thanks in advance for your time!

Yes, your assumption is safe. Both approaches produce identical data: there should be no substantive or meaningful difference between extracting multiple samples at once in one extract and extracting each year individually in separate extracts.
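If it helps, here is a minimal sketch of how the by-year split might look in Stata, building on the "keep if year==1997" idea from your question. The file names (ihis_1997_2013.dta and ihis_1997.dta through ihis_2013.dta) are hypothetical placeholders, and the sketch assumes the year variable in your extract is named year in lowercase, as in your example.

```stata
* Sketch only: all file names here are hypothetical placeholders.
* Load the pooled extract covering 1997-2013.
use "ihis_1997_2013.dta", clear

* Loop over survey years and write one file per year.
forvalues y = 1997/2013 {
    preserve                      // keep a copy of the pooled data in memory
    keep if year == `y'           // restrict to a single survey year
    save "ihis_`y'.dta", replace  // e.g. ihis_1997.dta
    restore                       // bring the pooled data back for the next year
}
```

Using preserve and restore means the pooled file is read from disk only once. Variables that are unavailable in a given year will still be present in that year's file but will not carry usable values, so you may want to drop them from each file afterwards.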

Thanks so much!