"Backed up" message/error!

I'm using the following command: probit ownershp age marst race educ inctot

Here “inctot” (total income) is coded as 9999999 when it is N/A.

Because of this, when I run probit, the 9999999 values in inctot skew the results.

To remove them, I used the following command: drop if inctot == 9999999

But as soon as I run that command and then rerun the probit command (probit ownershp age marst race educ inctot), I get a “backed up” message:

Iteration 0: log likelihood = -50470.888

Iteration 1: log likelihood = -43215.584

Iteration 2: log likelihood = -40111.231

Iteration 3: log likelihood = -39706.314

Iteration 4: log likelihood = -39685.648

Iteration 5: log likelihood = -39684.552

Iteration 6: log likelihood = -39684.523

Iteration 7: log likelihood = -39684.523 (backed up)

Iteration 8: log likelihood = -39684.523 (backed up)

Iteration 9: log likelihood = -39684.523 (backed up)

Iteration 10: log likelihood = -39684.523 (backed up)

As long as I keep Stata open, the iterations continue with the “backed up” message. The probit command works before dropping the observations with inctot == 9999999, but as soon as I run the drop command, the “backed up” message starts appearing. Any help is greatly appreciated.

I believe your estimation has two issues.

First, most of your variables are categorical, so they should be treated as such (by putting i. before the variable name).

Second, because they are categorical and some categories may be very small, you can run into quasi-perfect prediction, which is common with maximum likelihood. For instance, if you combine all the subgroups formed by marst, race and educ, then all observations in some subgroups may have the same value for ownershp, or some subgroups may be empty. This tends to prevent the maximum likelihood algorithm from converging smoothly, because there is not enough variation in the data for what maximum likelihood is trying to estimate.

I guess the issue does not arise when you keep the 9999999 values of inctot simply because that leaves you with more observations in all your subgroups.

So in conclusion, the issue here is not inctot itself. Rather, you need to be very careful to have reasonably balanced categories for all of your categorical variables (including ownershp).
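One quick way to look for empty or very small cells is to cross-tabulate each categorical predictor against the outcome. This is only a sketch using the variable names from the question; how small is "too small" is a judgment call.

```stata
* Drop the N/A income code first, so the tables reflect the estimation sample.
drop if inctot == 9999999

* Distribution of the outcome on its own.
tabulate ownershp

* Each categorical predictor against the outcome:
* look for cells that are empty or contain only a handful of observations.
tabulate marst ownershp
tabulate race ownershp
tabulate educ ownershp
```

Categories that turn out to be very sparse can be merged with a neighbouring category before re-estimating.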

A probit regression in Stata treats a value of zero in the dependent variable as the “negative” outcome and all other non-missing values as the “positive” outcome. For your dependent variable, OWNERSHP, values of zero are defined as “Not In Universe” (i.e. group quarters), so as it stands the model compares group quarters with everyone else. You should recode OWNERSHP so that either Rent or Owned is equal to zero. Also, as Geoffrey stated, you should identify categorical variables with “i.”:

probit ownershp age inctot i.marst i.race i.educ

After dropping values of INCTOT=9999999, recoding OWNERSHP so that Owned equals zero, and identifying categorical variables, the probit command converges on a solution for your most recent data extract.
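The steps above could be sketched roughly as follows. The coding of OWNERSHP used here (0 = not in universe, 1 = owned, 2 = rented) is an assumption, so please check it against the codebook for your extract; the variable name rents is just a hypothetical name chosen for illustration.

```stata
* Drop observations where total income is the N/A code.
drop if inctot == 9999999

* ASSUMPTION: ownershp is coded 0 = not in universe, 1 = owned, 2 = rented.
* Build a 0/1 outcome with Owned = 0 and Rent = 1.
* Observations with any other value (including 0) are left missing,
* so probit excludes them from the estimation sample.
generate byte rents = (ownershp == 2) if inlist(ownershp, 1, 2)

* Confirm the recode before estimating.
tabulate ownershp rents, missing

* Continuous predictors as-is, categorical predictors with i.
probit rents age inctot i.marst i.race i.educ
```

Leaving the “Not In Universe” cases missing, rather than folding them into one of the two outcomes, keeps them out of the model entirely.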

Hope this helps.