Status:Closed    Asked:Feb 13, 2016 - 12:39 PM

Stata svyset - General Household Survey 2006-2010 in Nigeria

I want to run a household-level regression on this dataset, but I am unsure how to declare the survey design correctly in Stata with svyset. Is the following correct?

1. Use the household weights (HHWT) as pweights.

2. Use the state variable (GEO1_NG) to identify strata.

3. What should I use as the PSU? The sample description states that the PSU is the enumerated unit, but there is no variable that identifies it.


Staff Answer




I first recommend that you read the IPUMS-I User Note on "Sampling Error and Variance Estimation" and refer to the bottom of the Sample Design Summary page. The Nigeria samples are drawn by complex stratification with geographic clustering and household clustering. As a result, you should use the household weight (HHWT) as your pweight, use Enumeration Area as your strata, and cluster by household (SERIAL).
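As a rough sketch, that declaration might look like the following in Stata. The name ea_code is an assumption standing in for whatever variable identifies the enumeration area in your extract; hhwt and serial are the lowercase forms of HHWT and SERIAL.

```stata
* Sketch only: ea_code is an assumed name for the enumeration area variable.
* Households (serial) are the clusters, hhwt is the probability weight,
* and enumeration areas are the strata.
svyset serial [pweight=hhwt], strata(ea_code)

* Inspect the declared design before estimating anything.
svydescribe
```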

As you note, Enumeration Area is not available in all of the Nigerian samples. Not accounting for geographic clustering in these samples will lead to underestimating standard errors, which means you should use caution in interpreting statistical significance at the margins. You might also consider investigating the effect on your standard errors of using State as your strata in the samples without Enumeration Area provided.
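One possible way to run that comparison is sketched below. The names y and x are hypothetical placeholders for your outcome and regressor, and GEO1_NG is assumed to be the State variable, as in the question.

```stata
* Sketch only: y and x are hypothetical variable names.
* Design 1: State as strata, households as clusters.
svyset serial [pweight=hhwt], strata(geo1_ng)
svy: regress y x

* Design 2: weights and household clustering only, no strata.
svyset, clear
svyset serial [pweight=hhwt]
svy: regress y x
```

Comparing the standard errors from the two runs gives a sense of how sensitive your inference is to the choice of strata.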

Hope this helps.


Feb 18, 2016 - 11:26 AM



I have noticed that among the unharmonized variables there is one identifying the enumeration area. Given that, would this be correct:

svyset ea_code [pw=hhwt]

However, for some years, e.g. 2006 and 2010, there is no such variable. I assume that in those cases I should use only hhwt when performing the regression analysis?

Thank you in advance.


Feb 17, 2016 - 10:01 AM


