[R] Expressing a multinomial GLM as a series of binomial GLMs

John Fox jfox at mcmaster.ca
Tue Jul 22 22:20:57 CEST 2014


Dear Christoph,

If I understand correctly what you've done, the two approaches are not equivalent and should not in general produce the same fitted probabilities.

Letting {a, b} represent logit(a vs. b) = log(Pr(a)/Pr(b)) and {ab, cd} represent logit(a or b vs. c or d), and numbering the response levels 1, 2, 3, 4, then the multinomial logit model fits the logits {2, 1}, {3, 1}, {4, 1}, while your binary logit models are for the logits {12, 34}, {23, 14}, {34, 12}. Note that the first and third are complementary, but even if you had used three distinct logits of this kind, the combined models (which BTW wouldn't be independent) would not be equivalent to the multinomial logit model.

I hope that this helps (and that I've not misconstrued what you did).

Best,
 John

------------------------------------------------
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/
	
On Tue, 22 Jul 2014 16:47:17 +0200
 "Scherber, Christoph" <cscherb1 at gwdg.de> wrote:
> Dear all,
> 
> I am trying to express a multinomial GLM (using nnet) as a series of GLM models.
> 
> However, when I compare the multinom() predictions to those from GLM, I see differences that I can´t
> explain. Can anyone help me out here?
> 
> Here comes a reproducible example:
> 
> ##
> # set up data: (don´t care what they are, just for playing)
> set.seed(0)
> cats=c("oligolectic","polylectic","specialist","generalist")
> explan1=c("natural","managed")
> explan2=c("meadow","meadow","pasture","pasture")
> multicats=factor(sample(cats,replace=T,100,prob=c(0.5,0.2,0.1,0.5)))
> multiplan1=factor(rep(explan1,50))
> multiplan2=factor(rep(explan2,25))
> 
> ########################
> library(nnet)
> m2=multinom(multicats~multiplan1)
> 
> # predictions from multinomial model
> predict(m2,type="probs")
> 
> ########################
> # now set up contrasts for response variable "multicats" (which has 4 levels):
> 
> ii=as.numeric(multicats)
> 
> g1=glm(I(ii%in%c(1,2)) ~ multiplan1, family = "binomial")
> g2=glm(I(ii%in%c(2,3)) ~ multiplan1, family = "binomial")
> g3=glm(I(ii%in%c(3,4)) ~ multiplan1, family = "binomial")
> 
> r1=predict(g1,type="response")
> r2=predict(g2,type="response")
> r3=predict(g3,type="response")
> 
> # calculate predictions (based on Chapter 8.3 in Dobson 2002, Introduction to GLMs)
> ee0=1/(1+r1+r2+r3)
> ee1=r1/(1+r1)
> ee2=r2/(1+r1+r2)
> ee3=r3/(1+r1+r2+r3)
> 
> # compare predictions between GLM and multinom fits:
> apply(cbind(ee0,ee1,ee2,ee3),2,mean)
> apply(predict(m2,type="probs"),2,mean)
> 
> 
> #################
> [using R 3.1.1 on Windows 7 32-bit]
> 
> 
> 
> 
> 
> -- 
> PD Dr. Christoph Scherber
> Senior Lecturer
> DNPW, Agroecology
> University of Goettingen
> Grisebachstrasse 6
> 37077 Goettingen
> Germany
> telephone +49 551 39 8807
> facsimile +49 551 39 8806
> www.gwdg.de/~cscherb1
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list