[R] Multinomial logistic regression under R and Stata

Mon Apr 21 17:58:59 CEST 2003

Dear Colleagues

I posted to the R-help and Stata lists a little while ago concerning 
some disagreement in results I obtained from using the multinom function 
in R and the mlogit command in Stata.

Many thanks to colleagues for your comments and ideas. I have checked 
out some of your suggestions, and here is a report. The disagreements I 
reported are of two types: (1) parameter estimates for the intercepts 
and (2) standard errors of a quadratic term of a quantitative variable.

Regarding (1): yes, as some of you suggested, this is due to the coding 
of another covariate in the model. Thanks!

As for (2), it turns out that the problem has to do with the scale of 
the quadratic term.

In my original model, I have, out of habit, scaled down the quadratic 
term by 100, so as to make its scale comparable to the linear term. I.e. 
I did the following.

varsq <- var*var/100

This is in fact unnecessary in the present case, given the scale of the 
linear term. But anyway, with the division, R and Stata disagree:

        beta        s.e.
R:      5.939880  2.920165
Stata:  5.939747  5.455495

R:      11.228705 2.191625
Stata:  11.22761  4.630293

However, if I don't divide the quadratic term by 100, then R and Stata agree.

R:       0.05939645  0.05455490
Stata:   0.0593975   0.0545549

R:       0.11227038  0.04630296
Stata:   0.1122761   0.0463029

So it apprears that R might have some precision problem in calculating 
the Hessian when the scale of a variable is very small. I talked to a 
colleague, David Firth, about this, and he suggests

 > Possibly it would be worth implementing an algebraic vcov method for 
multinomial logit models [in R]?

Once again, many thanks to colleagues for your time and help.

Cheers.  Wing

-- 
Department of Sociology, University of Oxford,
Littlegate House, St Ebbes, Oxford OX1 1PT, UK
tel: +44 (1865) 286176, fax: +44 (1865) 286171
http://users.ox.ac.uk/~sfos0006