[R-sig-ME] Multilevel logistic regression

Sun Mar 22 05:20:59 CET 2009

Thierry et al:

I think I solved my problem, and it provides some insight into the way lmer handles binomial data, so I will share the findings here. First, I have made two datasets available on the web: a long-format and a wide-format version of the "metaresp" data of Joop Hox.  They can be found here:

http://www.duke.edu/~bi6/

Hox has a draft of the chapter of interest that discusses the metaresp dataset and the modeling process/problem that I am trying to solve.  Note that the results that I am trying to reproduce are in Tables 6.3 and 6.4.  The chapter can be found at:

http://www.geocities.com/joophox/papers/chap6.pdf

Now here are the models that I fit with lmer.  Assume that the wide version of the data is called "wide" and the long version "long".
-----------------------------

y <- cbind(wide$SUCCESS, wide$FAIL)

f1 <- lmer(y ~ RESPISRR + (1 | SOURCE), family=binomial, data=wide)
summary(f1)

f2 <- lmer(y ~ RESPISRR + TELDUM + MAILDUM + (1 | SOURCE), family=binomial, data=wide)
summary(f2)

f3 <- lmer(y ~ RESPISRR + TELDUM + MAILDUM + (1 + TELDUM + MAILDUM | SOURCE),
                family=binomial, data=wide)
summary(f3)

f4 <- lmer(SUCCESS ~ RESPISRR + TELDUM + MAILDUM + (1 + TELDUM + MAILDUM | SOURCE), 
		family=binomial, data=long)
summary(f4)

-------------------------------

Models f1, f2, and f4 work and reproduce the results of Hox.  Model f4 takes a hell of a long time to compute, but it seems to give the expected results.  Model f3, which I assumed (wrongly) would be the same as f4, does not work.  Instead, when it is run, you get the error message:

> Error in mer_finalize(ans) : q = 240 > n = 105
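
As far as I understand the message, q is the total number of random effects (levels of the grouping factor times the number of random-effect terms per level) and n is the number of rows in the data. A minimal sketch of that count, using made-up toy data rather than the metaresp file; the data frame `toy` and the helper `count_q` are hypothetical and not part of lme4:

```r
# Toy stand-in for the wide data (made up): 4 studies, at most 3 rows each.
toy <- data.frame(
  SOURCE  = factor(c(1, 1, 1, 2, 2, 3, 4, 4)),
  TELDUM  = c(0, 1, 0, 0, 1, 0, 1, 0),
  MAILDUM = c(0, 0, 1, 1, 0, 0, 0, 1)
)

# Hypothetical helper: random effects per group times number of groups.
count_q <- function(group, n_terms) nlevels(group) * n_terms

# (1 + TELDUM + MAILDUM | SOURCE) gives 3 random effects per study.
q_toy <- count_q(toy$SOURCE, n_terms = 3)
n_toy <- nrow(toy)

q_toy > n_toy  # more random effects than rows, the situation the error reports
```

With a random intercept only, `count_q(toy$SOURCE, n_terms = 1)` stays below the row count, which would be consistent with f1 and f2 fitting without complaint.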

I guess the question that I have now is: did I do something wrong with model f3, or is lmer doing something unusual?  My assumption that models f3 and f4 were the same comes from MASS4 (p. 190), where Venables and Ripley describe the glm function for logistic regression.
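
To make the glm equivalence concrete, here is a small sketch with made-up grouped data (the names `wide_toy` and `long_toy` are hypothetical, and this uses plain glm, not lmer). It expands each grouped row into one 0/1 row per individual and checks that the two fits give the same coefficients, while the number of rows differs a lot:

```r
# Toy grouped data (made up): successes and failures per row.
wide_toy <- data.frame(
  x       = c(0, 0, 1, 1),
  SUCCESS = c(3, 5, 8, 6),
  FAIL    = c(7, 5, 2, 4)
)

# Expand to one row per individual: first all successes, then all failures.
long_toy <- data.frame(
  x       = rep(rep(wide_toy$x, 2), times = c(wide_toy$SUCCESS, wide_toy$FAIL)),
  SUCCESS = rep(c(1, 0), times = c(sum(wide_toy$SUCCESS), sum(wide_toy$FAIL)))
)

# Grouped (two-column) response versus 0/1 response.
fit_wide <- glm(cbind(SUCCESS, FAIL) ~ x, family = binomial, data = wide_toy)
fit_long <- glm(SUCCESS ~ x, family = binomial, data = long_toy)

# Same fixed-effect estimates, up to numerical tolerance.
all.equal(coef(fit_wide), coef(fit_long), tolerance = 1e-6)

# Very different row counts: one row per group versus one row per individual.
c(wide = nrow(wide_toy), long = nrow(long_toy))
```

If the check behind the error compares q with the number of rows, this difference in row counts would explain why the long-format f4 gets past it while the wide-format f3 does not, even though the two describe the same data.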

I very much appreciate any insight.

Brant

#####################################################################################

On Friday, March 20, 2009, at 04:32AM, "ONKELINX, Thierry" <Thierry.ONKELINX at inbo.be> wrote:
>Dear Brant,
>
>The model is too complex. You have maximum three observations for each
>level of the random effect. Allowing for a random intercept and two
>random slopes does not make much sense then. Does it?
>
>HTH,
>
>Thierry
>
>
>----------------------------------------------------------------------------
>ir. Thierry Onkelinx
>Instituut voor natuur- en bosonderzoek / Research Institute for Nature
>and Forest
>Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
>methodology and quality assurance
>Gaverstraat 4
>9500 Geraardsbergen
>Belgium 
>tel. + 32 54/436 185
>Thierry.Onkelinx at inbo.be 
>www.inbo.be 
>
>To call in the statistician after the experiment is done may be no more
>than asking him to perform a post-mortem examination: he may be able to
>say what the experiment died of.
>~ Sir Ronald Aylmer Fisher
>
>The plural of anecdote is not data.
>~ Roger Brinner
>
>The combination of some data and an aching desire for an answer does not
>ensure that a reasonable answer can be extracted from a given body of
>data.
>~ John Tukey
>
>-----Original Message-----
>From: r-sig-mixed-models-bounces at r-project.org
>[mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Brant Inman
>Sent: Thursday, 19 March 2009 3:11
>To: r-sig-mixed-models at r-project.org
>Subject: [R-sig-ME] Multilevel logistic regression
>
>
>lmer Experts:
>
>I am trying to use lmer to duplicate the results found in Joop Hox's  
>book "Multilevel Analysis: Techniques and Applications" (2002).  In  
>chapter 6 of his book he shows an example of multilevel logistic  
>regression for a meta-analysis of survey response rates.  The data are  
>available in the file "metaresp.xls" at his website:
>
><http://www.geocities.com/joophox/mlbook/excelxls.zip>
>
>The dataset includes the following variables of interest:
>
>Individual level (Level 1) variables:
>TELDUM	 = telephone questioning
>MAILDUM  = mail questioning
>RESPONSE = the outcome of interest, the study response rate
>DENOM    = the number of people questioned
>
>Study/group level (Level 2) variables:
>SOURCE 	 = the study identifier
>YEAR	 = year of study
>SALIENCY = how salient the questionnaire was (0 to 2)
>RESPISRR = the way the response rate was calculated
>
>
>The null model (Table 6.2) proposed by Joop is easy to fit:
>
>SUCCESS <- as.integer(RESPONSE*DENOM)
>y       <- cbind(SUCCESS, DENOM-SUCCESS)
>
>f1 <- lmer(y ~ RESPISRR + (1 | SOURCE), family=binomial(link=logit))
>
>
>Joop then adds a couple Level 1 variables (Table 6.3):
>
>f2 <- lmer(y ~ RESPISRR + TELDUM + MAILDUM + (1 | SOURCE),
>           family=binomial(link=logit))
>
>
>He then says that these two Level 1 variables should be allowed to  
>vary across studies (varying slopes).  When I try to fit what I  
>believe to be the correct model, I get an error
>
>
>f3 <- lmer(y ~ RESPISRR + TELDUM + MAILDUM + (TELDUM | SOURCE) +
>           (MAILDUM | SOURCE) + (1 | SOURCE),
>           family=binomial(link=logit))
>
>Error in mer_finalize(ans) : q = 240 > n = 105
>
>
>Can anyone tell me what I am doing wrong here?  Thanks so much in  
>advance.
>
>Brant Inman
>Duke University Medical Center
>
>
>_______________________________________________
>R-sig-mixed-models at r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
>The views expressed in this message 
>and any annex are purely those of the writer and may not be regarded as stating 
>an official position of INBO, as long as the message is not confirmed by a duly 
>signed document.
>
>