[R-sig-ME] Getting out of using a mixed model if "nested" data points are not actually correlated?

M.S.Muller m.s.muller at rug.nl
Tue Mar 22 13:50:58 CET 2011


Dear all, 

I am trying to analyze some behavioural data that have a zero-inflated Poisson distribution. I want to use the “zeroinfl” function from the “pscl” package, which lets me model a Poisson process (generating the count portion of the data) and a binomial process (generating the excess zeros) within the same model. The only problem is that I have a paired design: 32 individuals split into 16 sibling pairs (each sib in a pair received a different treatment; treatment is the predictor of main interest, although I have a couple of other covariates). Ideally, I would include “sibling pair” as a random factor, but unfortunately random factors cannot be included in the “zeroinfl” function. 

Is there a way for me to demonstrate that siblings are statistically independent (e.g. by showing that their residuals are uncorrelated) and thereby justify treating them as independent data points in the model? 
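[Editor's note: a minimal sketch of the kind of residual check asked about here. The residuals are simulated; in a real analysis they would come from `residuals(fit, type = "pearson")` on the `zeroinfl()` fit, and the column names (`pair`, `sib`, `resid`) are hypothetical.]

```r
## Sketch of a within-pair residual-correlation check (simulated
## residuals stand in for residuals(fit, type = "pearson") from a
## zeroinfl() fit; column names are hypothetical)
set.seed(1)
dat <- data.frame(pair  = rep(1:16, each = 2),
                  sib   = rep(1:2, times = 16),
                  resid = rnorm(32))

## One row per pair, the two siblings' residuals side by side
wide <- reshape(dat, idvar = "pair", timevar = "sib",
                direction = "wide")

## Within-pair correlation; a clearly non-zero estimate would argue
## against treating siblings as independent
cor.test(wide$resid.1, wide$resid.2)
```

Bear in mind that with only 16 pairs a non-significant correlation is weak evidence of independence, so at best this justifies the simplification tentatively.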

Thanks in advance for your help.

Martina








On 22-03-11, r-sig-mixed-models-request at r-project.org wrote:
> 
> Send R-sig-mixed-models mailing list submissions to
> 	r-sig-mixed-models at r-project.org
> 
> To subscribe or unsubscribe via the World Wide Web, visit
> 	https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> or, via email, send a message with subject or body 'help' to
> 	r-sig-mixed-models-request at r-project.org
> 
> You can reach the person managing the list at
> 	r-sig-mixed-models-owner at r-project.org
> 
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of R-sig-mixed-models digest..."
> 
> 
> Today's Topics:
> 
>    1. Re: Offtopic: Why I am receiving duplicate emails? (Douglas Bates)
>    2. Re: Satterthwaite's df in lme (Douglas Bates)
>    3. hlm (software) exploratory analysis (Sebastián Daza)
>    4. Re: Observation-level random effect to model	overdispersion
>       (John Maindonald)
>    5. Library (lattice) (Shankar Lanke)
>    6. Re: Library (lattice) (Joshua Wiley)
>    7. Re: generalized mixed linear models, glmmPQL and GLMER give
>       very different results that both do not fit the data well...
>       (Franssens, Samuel)
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Mon, 21 Mar 2011 13:14:33 -0500
> From: Douglas Bates <bates at stat.wisc.edu>
> To: Manuel Ramón Fernández <m.ramon.fernandez at gmail.com>
> Cc: r-sig-mixed-models at r-project.org
> Subject: Re: [R-sig-ME] Offtopic: Why I am receiving duplicate emails?
> Message-ID:
> 	<AANLkTim4A_u+uXnRi-m+W5SrNou+j1JGg_kuTprcHi26 at mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
> 
> 2011/3/13 Manuel Ramón Fernández <m.ramon.fernandez at gmail.com>:
> > For a couple of weeks I have been receiving duplicate emails from the list. Is this
> > happening to anyone? How can I fix it?
> 
> I don't know why you are receiving duplicates.  Your address above is
> subscribed.  Is it possible that you have another address subscribed
> and mail to the second address is forwarded to your gmail.com account?
> 
> 
> 
> ------------------------------
> 
> Message: 2
> Date: Mon, 21 Mar 2011 15:13:34 -0500
> From: Douglas Bates <bates at stat.wisc.edu>
> To: "Flores-de-Gracia, Eric" <eef201 at exeter.ac.uk>
> Cc: "r-sig-mixed-models at r-project.org"
> 	<r-sig-mixed-models at r-project.org>
> Subject: Re: [R-sig-ME] Satterthwaite's df in lme
> Message-ID:
> 	<AANLkTineiSA2Xu2PaOzoU4AJdfSCZHaXfeXJM8JoHTDF at mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
> 
> On Thu, Feb 24, 2011 at 12:05 PM, Flores-de-Gracia, Eric
> <eef201 at exeter.ac.uk> wrote:
> 
> > Hi all.
> 
> > I am just running an lme model and found some criticism regarding the fact that the DF appear just as integers.
> 
> > What are the implications of this, in terms of reporting those DF and the reliability of my model, when other packages like SPSS that do mixed models report DF based on Satterthwaite's correction?
> > My data set is unbalanced, so should I expect this to affect the calculation of DF using lme?
> 
> There are many formulas for approximating degrees of freedom in
> linear mixed models, but they are all approximations.   That is, the
> ratio of the estimate and the approximate standard error isn't
> distributed as a T, so the question of which degrees of freedom works
> best is a bit vague to begin with.
> 
> Typically the exact number of degrees of freedom chosen only makes a
> difference when the sample sizes are small or the number of levels for
> one or more of the grouping factors for the random effects is small.
> In practice the difference between a T distribution with 40 degrees of
> freedom and a T distribution with 400 degrees of freedom is
> negligible.
> 
> If the degrees of freedom provided by the summary of an lme fit is
> reasonably large then the particular approximation used is not
> important.
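[Editor's note: the negligible-difference point is easy to check numerically; this base-R snippet is not part of the original message.]

```r
## Two-sided 5% critical values: t with 40 df vs 400 df vs the normal
## limit -- the differences are in the second decimal place
qt(0.975, df = 40)
qt(0.975, df = 400)
qnorm(0.975)
```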
> 
> 
> 
> ------------------------------
> 
> Message: 3
> Date: Mon, 21 Mar 2011 17:20:19 -0500
> From: Sebastián Daza <sebastian.daza at gmail.com>
> To: "r-sig-mixed-models at r-project.org"
> 	<r-sig-mixed-models at r-project.org>
> Subject: [R-sig-ME] hlm (software) exploratory analysis
> Message-ID: <4D87CF23.7010108 at gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> 
> Hi everyone,
> Does anyone know if there is a procedure to conduct an exploratory 
> analysis like the one HLM performs? Exploratory analysis is defined as "estimated 
> level-2 coefficients and their standard errors obtained by regressing EB 
> residuals on level-2 predictors selected for possible inclusion in 
> subsequent HLM runs".
> 
> Thank you in advance.
> -- 
> Sebastián Daza
> sebastian.daza at gmail.com
> 
> 
> 
> ------------------------------
> 
> Message: 4
> Date: Tue, 22 Mar 2011 09:23:07 +1100
> From: John Maindonald <john.maindonald at anu.edu.au>
> To: "M.S.Muller" <m.s.muller at rug.nl>
> Cc: r-sig-mixed-models at r-project.org
> Subject: Re: [R-sig-ME] Observation-level random effect to model
> 	overdispersion
> Message-ID: <E50E2806-9231-41BC-91E3-5FE147A2379B at anu.edu.au>
> Content-Type: text/plain; charset=us-ascii
> 
> One point, additional to other responses.  There is just one 
> "miniature variance component" (by the way, not miniature 
> in the sense that it has to be small).  As I understand it, a 
> normal distribution with this variance generates one random 
> effect, on the scale of the linear predictor, for each observation.  
> On the scale of the response, well . . .
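[Editor's note: a simulated illustration of the mechanism described above, not part of the original message. One normal deviate per observation on the scale of the linear predictor produces a Poisson-lognormal mixture whose response-scale variance far exceeds its mean.]

```r
## One N(0, sigma^2) deviate per observation, added on the scale of
## the linear predictor (here, the log scale of a Poisson model)
set.seed(42)
n   <- 10000
eta <- log(5)                    # linear predictor without the random effect
e   <- rnorm(n, sd = 1)          # the per-observation "variance component"
y   <- rpois(n, exp(eta + e))    # Poisson-lognormal mixture on the response scale

## For a pure Poisson the mean and variance would agree; here the
## variance is much larger, i.e. the overdispersion has been modelled
c(mean = mean(y), var = var(y))
```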
> 
> John Maindonald             email: john.maindonald at anu.edu.au
> phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
> Centre for Mathematics & Its Applications, Room 1194,
> John Dedman Mathematical Sciences Building (Building 27)
> Australian National University, Canberra ACT 0200.
> http://www.maths.anu.edu.au/~johnm
> 
> On 21/03/2011, at 10:51 PM, M.S.Muller wrote:
> 
> > Dear all,
> > 
> > I'm trying to analyze some strongly overdispersed Poisson-distributed data using R's mixed effects model function "lmer". Recently, several people have suggested incorporating an observation-level random effect, which would model the excess variation and solve the problem of underestimated standard errors that arises with overdispersed data. It seems to be working, but I feel uneasy using this method because I don't actually understand conceptually what it is doing. Does it package up the extra, non-Poisson variation into a miniature variance component for each data point? But then I don't understand how one ends up with non-zero residuals and why one can't just do this for any analyses (even with normally-distributed data) in which one would like to reduce noise.
> > 
> > I may be way off base here, but does this approach model some kind of mixture distribution that's a combination of Poisson and whatever distribution the extra variation is? I've read that people often use a negative binomial distribution (aka Poisson-gamma) to model overdispersed count data, in which they assume that the process is Poisson (so they use a log link) but the extra variation is a gamma distribution (in which variance is proportional to the square of the mean). The frequently cited paper by Elston et al. (2001) describes modeling a Poisson-lognormal distribution in which overdispersion arises from errors taking on a lognormal distribution. Is the approach of using the observation-level random effect doing something similar, and simply assuming some kind of Poisson-normal mixed distribution? Does this approach therefore assume that the observation-level variance is normally distributed?
> > 
> > If anyone could give me any guidance on this, I would appreciate it very much.
> > 
> > Martina Muller
> > 
> > _______________________________________________
> > R-sig-mixed-models at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 
> 
> 
> ------------------------------
> 
> Message: 5
> Date: Mon, 21 Mar 2011 23:16:53 -0400
> From: Shankar Lanke <shankarlanke at gmail.com>
> To: r-sig-mixed-models at r-project.org
> Subject: [R-sig-ME] Library (lattice)
> Message-ID:
> 	<AANLkTim-Xz+T9oE2WYHQyUw9pgwLt-XgDRp+gm2+e_fu at mail.gmail.com>
> Content-Type: text/plain
> 
> >
> > Dear All,
> >
> 
> 
> > I am looking for code to plot an X versus multiple-Y scatter plot (table
> > below). I used the following R code, however I am unable to figure out how
> > to draw a line plot for Y1 and show the ID number on top of each panel.
> >
> 
> > library("lattice")
> > xyplot(log(Y1)+log(Y2)~X|ID,layout=c(0,4))
> 
> >
> > ID    X    Y1  Y2
> >  1   0.5  0.4   1
> >  1   0.2  0.1   2
> >  1   1    2     3
> >  1   4    5     4
> >  2   2.5  3     5
> >  2   3    4     6
> >  2   0.5  0.4   7
> >  2   0.2  0.1   8
> >  2   1    2     9
> >  2   4    5     8
> >  2   2.5  3     7
> >  2   3    4     6
> >  3   0.5  0.4   5
> >  3   0.2  0.1   4
> >  3   1    2
> >  3   4    5
> >  3   2.5  3
> >  3   3    4
> >
> > Thank you very much for your suggestions.
> >
> 
> Regards,
> Shankar Lanke Ph.D.
> University at Buffalo
> Office # 716-645-4853
> Fax # 716-645-2886
> Cell # 678-232-3567
> 
> 	[[alternative HTML version deleted]]
> 
> 
> 
> ------------------------------
> 
> Message: 6
> Date: Mon, 21 Mar 2011 21:02:54 -0700
> From: Joshua Wiley <jwiley.psych at gmail.com>
> To: Shankar Lanke <shankarlanke at gmail.com>
> Cc: r-sig-mixed-models at r-project.org
> Subject: Re: [R-sig-ME] Library (lattice)
> Message-ID:
> 	<AANLkTimQ+QOTKqksosyRO7cY0Ah52gyL0DS6HOxRt05A at mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
> 
> Dear Shankar,
> 
> It seems like this is more of a question for R-help than mixed models,
> but in any case, I think this does what you want.  See comments in the
> code for explanation.
> 
> 
> HTH,
> 
> Josh
> 
> ############################
> ## Only using part of your data
> dat <- read.table(textConnection("
> ID  X    Y1  Y2
> 1  0.5  0.4   1
> 1  0.2  0.1   2
> 1  1      2    3
> 1  4      5    4
> 2  2.5   3    5
> 2  3      4    6
> 2  0.5  0.4   7
> 2  0.2  0.1   8
> 2   1      2   9
> 2   4      5   8
> 2  2.5   3    7
> 2  3      4    6
> 3  0.5  0.4  5
> 3  0.2  0.1  4"), header = TRUE)
> closeAllConnections()
> 
> ## Sorting will make the lines prettier
> dat <- dat[order(dat$ID, dat$X), ]
> ## Convert ID to a factor to show its labels
> dat$ID <- factor(dat$ID)
> 
> require(lattice)
> ## type = "l" specifies lines rather than points
> xyplot(log(Y1) + log(Y2) ~ X | ID, data = dat, type = "l",
>   layout=c(0,4))
> ############################
> 
> On Mon, Mar 21, 2011 at 8:16 PM, Shankar Lanke <shankarlanke at gmail.com> wrote:
> >>
> >> Dear All,
> >>
> >
> >
> >> I am looking for a code to plot X and Multiple Y scatter plot.(Table
> >> below). I used the following R-Code ,however I am unable to figure out how
> >> to plot lines plot for Y1 and show the ID number on top of each plot.
> >>
> >
> >> library("lattice")
> >> xyplot(log(Y1)+log(Y2)~X|ID,layout=c(0,4))
> >
> >>
> >> ID    X    Y1  Y2
> >>  1   0.5  0.4   1
> >>  1   0.2  0.1   2
> >>  1   1    2     3
> >>  1   4    5     4
> >>  2   2.5  3     5
> >>  2   3    4     6
> >>  2   0.5  0.4   7
> >>  2   0.2  0.1   8
> >>  2   1    2     9
> >>  2   4    5     8
> >>  2   2.5  3     7
> >>  2   3    4     6
> >>  3   0.5  0.4   5
> >>  3   0.2  0.1   4
> >>  3   1    2
> >>  3   4    5
> >>  3   2.5  3
> >>  3   3    4
> >>
> >> Thank you very much for your suggestions.
> >>
> >
> > Regards,
> > Shankar Lanke Ph.D.
> > University at Buffalo
> > Office # 716-645-4853
> > Fax # 716-645-2886
> > Cell # 678-232-3567
> >
> >        [[alternative HTML version deleted]]
> >
> > _______________________________________________
> > R-sig-mixed-models at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >
> 
> 
> 
> -- 
> Joshua Wiley
> Ph.D. Student, Health Psychology
> University of California, Los Angeles
> http://www.joshuawiley.com/
> 
> 
> 
> ------------------------------
> 
> Message: 7
> Date: Tue, 22 Mar 2011 08:24:51 +0000
> From: "Franssens, Samuel" <Samuel.Franssens at econ.kuleuven.be>
> To: "r-sig-mixed-models at r-project.org"
> 	<r-sig-mixed-models at r-project.org>
> Subject: Re: [R-sig-ME] generalized mixed linear models, glmmPQL and
> 	GLMER give very different results that both do not fit the data
> 	well...
> Message-ID:
> 	<B12069DAD987FC4FB5B3B08B752F8E430144873A at ECONMBX2A.econ.kuleuven.ac.be>
> 	
> Content-Type: text/plain; charset="iso-8859-1"
> 
> Thanks a lot for taking the time to help. At least now I know I am not doing something silly. I re-analyzed data from an earlier experiment, where we did a wrong analysis (treating proportions of correct responses per subject as a continuous variable, and then doing a repeated measures anova, without correction; http://ppw.kuleuven.be/reason/wim/data/COGNIT_2010_correctedproofs.pdf , exp1). In this experiment, we used the same design, but there was no power manipulation (so you have the same 8 reasoning problems per person, two types of problem, conflict & no conflict, but no between-subjects variable). Here, glmer gives very nice results. 
> 
> I look forward to reading a chapter in your book on GLMMs, as this is a very common design in (social) psychology (and also very often the wrong analysis is done, cf. above   :-)   )
> 
> Sam.
> 
> -----Original Message-----
> From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf Of Douglas Bates
> Sent: Monday 21 March 2011 6:36 PM
> To: Franssens, Samuel
> Cc: r-sig-mixed-models at r-project.org
> Subject: Re: [R-sig-ME] generalized mixed linear models, glmmPQL and GLMER give very different results that both do not fit the data well...
> 
> On Sat, Mar 19, 2011 at 6:17 AM, Franssens, Samuel <Samuel.Franssens at econ.kuleuven.be> wrote:
> > Thanks for the insightful reply! I've studied linear mixed models at undergraduate level, but I was unfamiliar with GLMMs. It was kind of hard at first to find out how to analyze these data, but I'm glad I found this group. I'm also trying to read the book on ecology by Zuur (2009).
> 
> > My data come from an experiment in which people have to solve two types of reasoning problems, 4 of each kind. The noconflict problems are filler problems which are very easy, the conflict problems are harder, some people will get all 4 wrong, some will get all 4 right. I could leave the no conflict problems out of the analysis, but I want to conclude that the power manipulation had no effect on performance on noconflict problems, but it did on the conflict problems (to rule out alternative explanations such as decreased motivation/attention). I think an analysis on the full data set is preferred over 2 separate analyses per problem type.
> 
> Unfortunately the results don't indicate an interaction between the power manipulation and the conflict/noconflict nature of the question.
> 
> I would agree that in general we prefer to do an "omnibus" analysis incorporating all the observed data.  Unfortunately with a binary response there can be circumstances where you can't really get that much out of the data, precisely because it is so homogeneous.
> Paradoxically the difficult cases to fit with generalized linear models are those where the "signal" is very strong - the so-called complete separation cases.  In this case having almost all of the noconflict questions answered correctly means that you get very little information from those responses.
> 
> > I made the histograms and this confirms your expectations: for the no conflict problems, in all three groups nobody scores less than 3/4, which is also evident from the means in these cells. For the conflict problems, the data have a similar distribution in all three groups, most mass is on 4/4, with 0/4, 1/4, 2/4, 3/4 getting almost equal mass. So indeed, variability is large.
> 
> > I tried to fit some models with effect of type also varying over subjects, because I thought this would make sense intuitively (some people will be very good at noconflict, but very bad at conflict; others will be good at both), but then I consistently get correlations between random effects of -1, which I thought was an indication of convergence problems, so I left this term out.
> 
> > I have read about the different estimation approaches used in GLMMs and I've noticed experts recommend not using PQL. So I will now focus on Laplace and GHQ (I knew this was possible in glmer, but did not know how large nAGQ should be, therefore I also used glmmML). nAGQ = 7 and nAGQ = 9 give almost exactly the same results, but they differ a bit from Laplace:
> 
> > Generalized linear mixed model fit by the adaptive Gaussian Hermite approximation
> > Formula: accuracy ~ f_power * f_type + (1 | f_subject)
> >    Data: syllogisms
> >    AIC   BIC logLik deviance
> >  441.5 473.2 -213.7    427.5
> > Random effects:
> >  Groups    Name        Variance Std.Dev.
> >  f_subject (Intercept) 4.7286   2.1745
> > Number of obs: 688, groups: f_subject, 86
> >
> > Fixed effects:
> >                             Estimate Std. Error z value Pr(>|z|)
> > (Intercept)                  1.470526   0.493530   2.980  0.00289 **
> > f_powerhp                    0.124439   0.690869   0.180  0.85706
> > f_powerlow                   2.020807   0.833203   2.425  0.01529 *
> > f_typenoconflict             3.267835   0.643001   5.082 3.73e-07 ***
> > f_powerhp:f_typenoconflict   0.219783   0.927490   0.237  0.81268
> > f_powerlow:f_typenoconflict -0.005484   1.455748  -0.004  0.99699
> 
> > and
> 
> > Generalized linear mixed model fit by the Laplace approximation
> > Formula: accuracy ~ f_power * f_type + (1 | f_subject)
> >    Data: syllogisms
> >   AIC   BIC logLik deviance
> >   406 437.7   -196      392
> > Random effects:
> >  Groups    Name        Variance Std.Dev.
> >  f_subject (Intercept) 4.9968   2.2353
> > Number of obs: 688, groups: f_subject, 86
> >
> > Fixed effects:
> >                             Estimate Std. Error z value Pr(>|z|)
> > (Intercept)                  1.50745    0.50507   2.985  0.00284 **
> > f_powerhp                    0.13083    0.70719   0.185  0.85323
> > f_powerlow                   2.04121    0.85308   2.393  0.01672 *
> > f_typenoconflict             3.28715    0.64673   5.083 3.72e-07 ***
> > f_powerhp:f_typenoconflict   0.21680    0.93165   0.233  0.81599
> > f_powerlow:f_typenoconflict -0.01199    1.45807  -0.008  0.99344
> 
> In these cases the fits using adaptive Gauss-Hermite quadrature are considered more reliable.  It's actually a good thing that the results from aGHQ and the Laplace approximation are similar.  They should be.
> 
> > I don't really know what criterion I should use to choose between models. Are any of the two actually useful, or are my data not fit for this kind of analysis? Is there another kind of analysis I could try?
> 
> I would be concerned about the very high estimated standard deviation of the random effects for subject.  You may be able to draw conclusions about the power treatment and the type of question but I wouldn't be terribly confident of the results.  Others reading the list may have better suggestions than I can provide.
> 
> > Thanks a lot for your time. I also think I understand the comment on 
> > the random effects :)
> >
> > -----Original Message-----
> > From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf Of 
> > Douglas Bates
> > Sent: Saturday 19 March 2011 10:02 AM
> > To: David Duffy
> > Cc: Franssens, Samuel; r-sig-mixed-models at r-project.org
> > Subject: Re: [R-sig-ME] generalized mixed linear models, glmmPQL and GLMER give very different results that both do not fit the data well...
> >
> > On Fri, Mar 18, 2011 at 9:54 PM, David Duffy <davidD at qimr.edu.au> wrote:
> >> On Fri, 18 Mar 2011, Franssens, Samuel wrote:
> >>
> >>> Hi,
> >>>
> >>> I have the following type of data: 86 subjects in three independent 
> >>> groups (high power vs low power vs control). Each subject solves 8 
> >>> reasoning problems of two kinds: conflict problems and noconflict 
> >>> problems. I measure accuracy in solving the reasoning problems. To
> >>> summarize: binary response, 1 within subject var (TYPE), 1 between subject var (POWER).
> >>>
> >>> I wanted to fit the following model: for problem i, person j:
> >>> logodds ( Y_ij ) = b_0j + b_1j TYPE_ij with b_0j = b_00 + b_01 
> >>> POWER_j + u_0j and b_1j = b_10 + b_11 POWER_j
> >>>
> >>> I think it makes sense, but I'm not sure.
> >>> Here are the observed cell means:
> >>>             conflict   noconflict
> >>> control   0.6896552   0.9568966
> >>> high      0.6935484   0.9677419
> >>> low       0.8846154   0.9903846
> >>>
> >>> GLMER gives me:
> >>> Formula: accuracy ~ f_power * f_type + (1 | subject)
> >>> ?Data: syllogisms
> >>> Random effects:
> >>>  Groups  Name        Variance Std.Dev.
> >>>  subject (Intercept) 4.9968   2.2353
> >>> Number of obs: 688, groups: subject, 86
> >>>
> >>> Fixed effects:
> >>>                             Estimate Std. Error z value Pr(>|z|)
> >>> (Intercept)                  1.50745    0.50507   2.985  0.00284 **
> >>> f_powerhp                    0.13083    0.70719   0.185  0.85323
> >>> f_powerlow                   2.04121    0.85308   2.393  0.01672 *
> >>> f_typenoconflict             3.28715    0.64673   5.083 3.72e-07 ***
> >>> f_powerhp:f_typenoconflict   0.21680    0.93165   0.233  0.81599
> >>> f_powerlow:f_typenoconflict -0.01199    1.45807  -0.008  0.99344
> >>> ---
> >>>
> >>> Strange thing is that when you convert the estimates to 
> >>> probabilities, they are quite far off. For control, no conflict 
> >>> (intercept), the estimation from glmer is 1.5 -> 81% and for glmmPQL 
> >>> is 1.14 -> 75%, whereas the observed is: 68%.
> >>>
> >>> Am I doing something wrong?
> >>
> >> You are forgetting that your model includes a random intercept for 
> >> each subject.
> >
> > To follow up on David's comment a bit, you have a very large estimated standard deviation for the random effects for subject.  An estimated standard deviation of 2.23 in the glmer fit is huge.  On the scale of the linear predictor the random effects would swamp the contribution of the fixed effects - different subjects' random effects could easily differ by 8 or more (+/- 2 s.d.).  Transformed to the probability scale that takes you from virtually impossible to virtually certain.  To get that large an estimated standard deviation you would need different subjects exposed to the same experimental conditions who get completely different results (all wrong vs all right).  The estimate of the fixed effect for type is also very large, considering that this is a binary covariate.
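[Editor's note: a back-of-the-envelope version of that transformation, not part of the original message; the intercept of about 1.5 and the s.d. of 2.2353 are taken from the glmer fit quoted in this thread.]

```r
## A +/- 2 s.d. swing in a subject's random effect around the
## control/conflict intercept (~1.5), mapped to the probability scale
plogis(1.5 + c(-2, 0, 2) * 2.2353)
## roughly 0.05, 0.82, 0.997 -- near-impossible to near-certain
```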
> >
> > I would suggest plotting the data, say by first grouping the subjects into groups according to power then by subject then by type.  You should see a lot of variability in responses according to subject within power.
> >
> > Regarding the difference between the glmer and the glmmPQL results, bear in mind that glmmPQL is based on successive approximation of the GLMM by an LMM.  This case is rather extreme and the approximation is likely to break down.  In fact, for the glmer fit, I would recommend using nAGQ=7 or nAGQ=9 to see if that changes the estimates substantially.  The difference between the default evaluation of the deviance by the Laplace approximation and the more accurate adaptive Gauss-Hermite quadrature may be important in this case.
> >
> > I just saw your other message about the properties of the random 
> > effects.  The values returned are the conditional modes of the random 
> > effects given the observed data.  The mean 0 and standard deviation of
> > 2.3 apply to the marginal or unconditional distribution of the random effects, not the conditional distribution.
> >
> 
> 
> 
> ------------------------------
> 
> _______________________________________________
> R-sig-mixed-models mailing list
> R-sig-mixed-models at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 
> 
> End of R-sig-mixed-models Digest, Vol 51, Issue 54
> **************************************************
>



