[R] Factors and Multinomial Logistic Regression

William Dunlap wdunlap at tibco.com
Wed May 1 23:20:39 CEST 2013


Look at the output of str(mydata):
   'data.frame':   200 obs. of  11 variables:
    $ id     : num  70 121 86 141 172 113 50 11 84 48 ...
    $ female : Factor w/ 2 levels "male","female": 1 2 1 1 1 1 1 1 1 1 ...
    $ race   : Factor w/ 4 levels "hispanic","asian",..: 4 4 4 4 4 4 3 1 4 3 ...
    $ ses    : Factor w/ 3 levels "low","middle",..: 1 2 3 3 2 2 2 2 2 2 ...

mydata$female is a factor with levels "male" and "female", in that order.

You constructed mydata$sex so that it has values 0 for female and 1 for male,
opposite the order for mydata$female.
   > with(mydata, table(female, sex))
           sex
   female     0   1
     male     0  91
     female 109   0
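
To see what that does to the model matrix, here is a small sketch with a
made-up six-row factor (the toy values are an assumption for illustration,
not the hsb2 data). With the default treatment contrasts the first level,
"male", is the baseline, so the dummy column R builds for female is the
mirror image of your sex variable:

```r
## Toy data (made up for illustration): same level order as mydata$female
female <- factor(c("male", "female", "female", "male", "female", "male"),
                 levels = c("male", "female"))

## Same coding as in the script: 1 for male, 0 for female
sex <- as.numeric(female == "male")

## Default treatment contrasts: first level ("male") is the baseline,
## so the dummy column 'femalefemale' is 1 for females
X <- model.matrix(~ female)
print(colnames(X))

## The dummy column is exactly 1 - sex
dummy <- unname(X[, "femalefemale"])
print(cbind(sex = sex, femalefemale = dummy))
print(all(dummy == 1 - sex))
```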

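Thus the coefficient for that term ought to differ (opposite sign, with the
intercept shifted to compensate), but the fitted values ought to be the same.
You can make the coefficients the same by reversing the order of the levels of
female, e.g. with relevel(), or by using contr.SAS as the contrast function for
that variable, since contr.SAS takes the last level as the baseline.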
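
A rough sketch of both fixes, using simulated data in place of hsb2 (the
data frame d, its columns, the seed and the tightened reltol/maxit settings
are all assumptions made for illustration; nnet must be installed):

```r
library(nnet)   # provides multinom()

## Simulated stand-in for hsb2 (made up; not the real data)
set.seed(1)
n <- 300
d <- data.frame(
  science = rnorm(n),   # pretend standardized score
  female  = factor(sample(c("male", "female"), n, replace = TRUE),
                   levels = c("male", "female")),
  ses     = factor(sample(c("low", "middle", "high"), n, replace = TRUE),
                   levels = c("low", "middle", "high"))
)
d$sex <- as.numeric(d$female == "male")   # 1 = male, 0 = female

## Numeric 0/1 coding versus the factor with "male" as baseline
m_sex <- multinom(ses ~ science + sex, data = d,
                  trace = FALSE, maxit = 500, reltol = 1e-10)
m_fem <- multinom(ses ~ science + female, data = d,
                  trace = FALSE, maxit = 500, reltol = 1e-10)

## Fix 1: make "female" the baseline level
d$female2 <- relevel(d$female, ref = "female")
m_rel <- multinom(ses ~ science + female2, data = d,
                  trace = FALSE, maxit = 500, reltol = 1e-10)

## Fix 2: keep the level order, but use the last level as baseline
m_sas <- multinom(ses ~ science + female, data = d,
                  contrasts = list(female = "contr.SAS"),
                  trace = FALSE, maxit = 500, reltol = 1e-10)

## Coefficients: sign flips for the factor fit, agreement after either fix
print(coef(m_sex))
print(coef(m_fem))
print(coef(m_rel))
print(coef(m_sas))

## Fitted probabilities agree up to optimizer tolerance
print(max(abs(fitted(m_sex) - fitted(m_fem))))
```

Only the names of the coefficients differ between m_sex, m_rel and m_sas;
the columns of the model matrix are the same in all three.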

(I would have expected the variable "female" to be a logical, or maybe a numeric 0/1,
and the equivalent "sex" to be the factor.)

Bill Dunlap
Spotfire, TIBCO Software
wdunlap at tibco.com


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Lorenzo Isella
> Sent: Wednesday, May 01, 2013 1:40 PM
> To: peter dalgaard
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] Factors and Multinomial Logistic Regression
> 
> 
> >
> > (A) The example doesn't run for me. library(ares) is not available on
> > current R versions, but even where it is available, it doesn't provide a
> > multinom() function?
> 
> 
> Apologies, ares is not needed at all. Please find the correct script at
> the end of the email.
> 
> >
> > (B) If I insert library(nnet), to get a multinom(), I get exactly the
> > same result as Stata does!
> >
> > Did you by any chance diddle with options(contrasts=...)?
> >
> > -pd
> 
> No, I did not. The point is that if I use the variable female, a factor
> with two levels, then I do not reproduce the Stata results for that
> variable only.
> If instead I define a variable "sex" which takes the numerical values
> 0/1, then I reproduce the Stata results entirely.
> 
> Hope this helps.
> 
> Lorenzo
> 
> 
> ##################################################################
> 
> 
> library(foreign)
> 
> library(nnet)  ## needed for multinom()
> 
> ## See the Stata example at http://bit.ly/11VG4ha
> 
> mydata <- read.dta("http://www.ats.ucla.edu/stat/data/hsb2.dta")
> 
> 
> sex <- rep(0, dim(mydata)[1])
> 
> sel <- which(mydata$female=="male")
> 
> sex[sel] <- 1
> 
> mydata$sex <- sex
> 
> ## IMPORTANT: redefine the base line!!!
> 
> mydata$ses2 <- relevel(mydata$ses, ref = "middle")
> 
> 
> ## NB: for some reason, if I use female (a factor with two levels)
> ## I do not reproduce the results of the example.
> ## I need to use a variable which is numeric and takes two values
> ## (that is why I introduced the variable sex)
> 
> ## mymodel <- multinom(ses2 ~ science+ socst+ sex, data=mydata)
> 
> 
> mymodel <- multinom(ses2 ~ science+ socst+ female, data=mydata)
> 
> 
> 
> 
> print(summary(mymodel))
> 
> print("The relative risk ratios (RRR) are:")
> 
> print(exp(coef(mymodel)))
> 


