[R-sig-ME] Getting out of using a mixed model if "nested" data points are not actually correlated?

Ben Bolker bbolker at gmail.com
Tue Mar 22 14:12:38 CET 2011


On 03/22/2011 08:50 AM, M.S.Muller wrote:
> Dear all,
> 
> I am trying to analyze some behavioural data that have a
> zero-inflated Poisson distribution. I want to use the “zeroinfl”
> function from the “pscl” package that allows me to analyze a Poisson
> process (that generates the count portion of the data) and a binomial
> process (that generates the excess zeros) all within the same model.
> The only problem is that I have a paired design: 32 individuals split
> into sixteen sibling pairs (each sib in a pair received a different
> treatment; treatment is the predictor of main interest, although I
> have a couple of other covariates). Ideally, I would include “sibling
> pair” as a random factor, but unfortunately inclusion of random
> factors is not possible in the “zeroinfl” function.
> 
> Is there a way for me to demonstrate that siblings are statistically
> independent (e.g. by showing no correlations in their residuals or
> something) and therefore justify treating them as independent data
> points in the model?
> 
> Thanks in advance for your help.
> 
> Martina
> 

  Three possibilities: (1) use glmmADMB with zeroInflation=TRUE (at
the moment it will only handle a single random effect, but it seems
that your problem doesn't require more); (2) use MCMCglmm (see the
course notes for an example); (3) as you suggest, fit the model and
then show a lack of statistically significant correlation in the
residuals within sibling pairs.

  I would probably try #1 first.
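  A rough sketch of what (1) and (3) could look like, assuming a
hypothetical data frame `dat` with columns `count`, `treatment`, and
`pair` (not names from your data); the zero-inflation argument is
spelled `zeroInflation` in the glmmADMB versions I have seen, so check
yours:

```r
## (1) zero-inflated Poisson with a random effect for sibling pair
library(glmmADMB)
fit1 <- glmmadmb(count ~ treatment + (1 | pair), data = dat,
                 family = "poisson", zeroInflation = TRUE)

## (3) fit zeroinfl() ignoring the pairing, then test whether the
## two siblings' residuals are correlated within pairs
library(pscl)
fit0 <- zeroinfl(count ~ treatment, data = dat)
r  <- residuals(fit0, type = "pearson")
rw <- matrix(r[order(dat$pair)], ncol = 2, byrow = TRUE)  # one row per pair
cor.test(rw[, 1], rw[, 2])
```

  Bear in mind that a non-significant correlation in (3) is weak
evidence with only 16 pairs, which is another reason to try (1) first.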



> On 22-03-11, r-sig-mixed-models-request at r-project.org wrote:
>> 
>> 
>> 
>> Today's Topics:
>> 
>>   1. Re: Offtopic: Why I am receiving duplicate emails? (Douglas Bates)
>>   2. Re: Satterthwaite's df in lme (Douglas Bates)
>>   3. hlm (software) exploratory analysis (Sebastián Daza)
>>   4. Re: Observation-level random effect to model overdispersion (John Maindonald)
>>   5. Library (lattice) (Shankar Lanke)
>>   6. Re: Library (lattice) (Joshua Wiley)
>>   7. Re: generalized mixed linear models, glmmPQL and GLMER give very different results that both do not fit the data well... (Franssens, Samuel)
>> 
>> 
>> ----------------------------------------------------------------------
>>
>>
>> 
>> Message: 1
>> Date: Mon, 21 Mar 2011 13:14:33 -0500
>> From: Douglas Bates <bates at stat.wisc.edu>
>> To: Manuel Ramón Fernández <m.ramon.fernandez at gmail.com>
>> Cc: r-sig-mixed-models at r-project.org
>> Subject: Re: [R-sig-ME] Offtopic: Why I am receiving duplicate emails?
>> Message-ID: <AANLkTim4A_u+uXnRi-m+W5SrNou+j1JGg_kuTprcHi26 at mail.gmail.com>
>> Content-Type: text/plain; charset=ISO-8859-1
>> 
>> 2011/3/13 Manuel Ramón Fernández <m.ramon.fernandez at gmail.com>:
>>> For a couple of weeks I have been receiving duplicate emails from
>>> the list. Is this happening to anyone else? How can I fix it?
>> 
>> I don't know why you are receiving duplicates.  Your address above
>> is subscribed.  Is it possible that you have another address
>> subscribed and that mail to the second address is forwarded to your
>> gmail.com account?
>> 
>> 
>> 
>> ------------------------------
>> 
>> Message: 2
>> Date: Mon, 21 Mar 2011 15:13:34 -0500
>> From: Douglas Bates <bates at stat.wisc.edu>
>> To: "Flores-de-Gracia, Eric" <eef201 at exeter.ac.uk>
>> Cc: "r-sig-mixed-models at r-project.org" <r-sig-mixed-models at r-project.org>
>> Subject: Re: [R-sig-ME] Satterthwaite's df in lme
>> Message-ID: <AANLkTineiSA2Xu2PaOzoU4AJdfSCZHaXfeXJM8JoHTDF at mail.gmail.com>
>> Content-Type: text/plain; charset=ISO-8859-1
>> 
>> On Thu, Feb 24, 2011 at 12:05 PM, Flores-de-Gracia, Eric 
>> <eef201 at exeter.ac.uk> wrote:
>> 
>>> Hi all.
>> 
>>> I am running an lme model and have encountered some criticism of
>>> the fact that the DF appear just as integers.
>> 
>>> What are the implications of this, in terms of reporting those DF
>>> and the reliability of my model, when other packages like SPSS
>>> that do mixed models report DF based on Satterthwaite's
>>> correction? My data set is unbalanced, so should I expect this to
>>> affect the calculation of DF using lme?
>> 
>> There are many formulas for approximating degrees of freedom in
>> linear mixed models, but they are all approximations.  That is,
>> the ratio of the estimate to the approximate standard error isn't
>> actually distributed as a t, so the question of which degrees of
>> freedom works best is a bit vague to begin with.
>> 
>> Typically the exact number of degrees of freedom chosen only makes
>> a difference when the sample sizes are small or the number of
>> levels for one or more of the grouping factors for the random
>> effects is small.  In practice the difference between a t
>> distribution with 40 degrees of freedom and a t distribution with
>> 400 degrees of freedom is negligible.
>> 
>> If the degrees of freedom provided by the summary of an lme fit is 
>> reasonably large then the particular approximation used is not 
>> important.
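A quick numeric check of this point, using only base R:

```r
## two-sided 5% critical values of the t distribution
qt(0.975, df = 40)    # about 2.021
qt(0.975, df = 400)   # about 1.966
qt(0.975, df = Inf)   # about 1.960, the normal limit
## once the df are reasonably large, the choice barely moves the cutoff
```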
>> 
>> 
>> 
>> ------------------------------
>> 
>> Message: 3
>> Date: Mon, 21 Mar 2011 17:20:19 -0500
>> From: Sebastián Daza <sebastian.daza at gmail.com>
>> To: "r-sig-mixed-models at r-project.org" <r-sig-mixed-models at r-project.org>
>> Subject: [R-sig-ME] hlm (software) exploratory analysis
>> Message-ID: <4D87CF23.7010108 at gmail.com>
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>> 
>> Hi everyone,
>> 
>> Does anyone know if there is a procedure to conduct an exploratory
>> analysis as HLM performs? Exploratory analysis is defined as
>> "estimated level-2 coefficients and their standard errors obtained
>> by regressing EB residuals on level-2 predictors selected for
>> possible inclusion in subsequent HLM runs".
>> 
>> Thank you in advance.
>> 
>> --
>> Sebastián Daza
>> sebastian.daza at gmail.com
>> 
>> 
>> 
>> ------------------------------
>> 
>> Message: 4
>> Date: Tue, 22 Mar 2011 09:23:07 +1100
>> From: John Maindonald <john.maindonald at anu.edu.au>
>> To: "M.S.Muller" <m.s.muller at rug.nl>
>> Cc: r-sig-mixed-models at r-project.org
>> Subject: Re: [R-sig-ME] Observation-level random effect to model overdispersion
>> Message-ID: <E50E2806-9231-41BC-91E3-5FE147A2379B at anu.edu.au>
>> Content-Type: text/plain; charset=us-ascii
>> 
>> One point, additional to other responses.  There is just one 
>> "miniature variance component" (by the way, not miniature in the
>> sense that it has to be small).  As I understand it, a normal
>> distribution with this variance generates one random effect, on the
>> scale of the linear predictor, for each observation. On the scale
>> of the response, well . . .
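In lme4-style syntax the device is just an extra random intercept whose
grouping factor has one level per row (a sketch with placeholder names
`y`, `x`, `grp`, and `dat`, none of them from the original posts):

```r
library(lme4)
## one factor level per observation
dat$obs <- factor(seq_len(nrow(dat)))
fit <- glmer(y ~ x + (1 | grp) + (1 | obs),
             family = poisson, data = dat)
## each row gets its own normal deviate on the linear-predictor (log)
## scale, so the counts follow a Poisson-lognormal mixture
```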
>> 
>> John Maindonald             email: john.maindonald at anu.edu.au
>> phone : +61 2 (6125)3473    fax : +61 2 (6125)5549
>> Centre for Mathematics & Its Applications, Room 1194,
>> John Dedman Mathematical Sciences Building (Building 27)
>> Australian National University, Canberra ACT 0200.
>> http://www.maths.anu.edu.au/~johnm
>> 
>> On 21/03/2011, at 10:51 PM, M.S.Muller wrote:
>> 
>>> Dear all,
>>> 
>>> I'm trying to analyze some strongly overdispersed
>>> Poisson-distributed data using R's mixed effects model function
>>> "lmer". Recently, several people have suggested incorporating an
>>> observation-level random effect, which would model the excess
>>> variation and solve the problem of underestimated standard errors
>>> that arises with overdispersed data. It seems to be working, but
>>> I feel uneasy using this method because I don't actually
>>> understand conceptually what it is doing. Does it package up the
>>> extra, non-Poisson variation into a miniature variance component
>>> for each data point? But then I don't understand how one ends up
>>> with non-zero residuals and why one can't just do this for any
>>> analyses (even with normally-distributed data) in which one would
>>> like to reduce noise.
>>> 
>>> I may be way off base here, but does this approach model some
>>> kind of mixture distribution that's a combination of Poisson and
>>> whatever distribution the extra variation is? I've read that
>>> people often use a negative binomial distribution (aka
>>> Poisson-gamma) to model overdispersed count data in which they
>>> assume that the process is Poisson (so they use a log link) but
>>> the extra variation is a gamma distribution (in which variance is
>>> proportional to the square of the mean). The frequently cited
>>> paper by Elston et al. (2001) describes modeling a
>>> Poisson-lognormal distribution in which overdispersion arises
>>> from errors taking on a lognormal distribution. Is the approach
>>> of using the observation-level random effect doing something
>>> similar, simply assuming some kind of Poisson-normal mixture
>>> distribution? Does this approach therefore assume that the
>>> observation-level random effects are normally distributed?
>>> 
>>> If anyone could give me any guidance on this, I would appreciate
>>> it very much.
>>> 
>>> Martina Muller
>>> 
>>> _______________________________________________ 
>>> R-sig-mixed-models at r-project.org mailing list 
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>> 
>> 
>> 
>> ------------------------------
>> 
>> Message: 5
>> Date: Mon, 21 Mar 2011 23:16:53 -0400
>> From: Shankar Lanke <shankarlanke at gmail.com>
>> To: r-sig-mixed-models at r-project.org
>> Subject: [R-sig-ME] Library (lattice)
>> Message-ID: <AANLkTim-Xz+T9oE2WYHQyUw9pgwLt-XgDRp+gm2+e_fu at mail.gmail.com>
>> Content-Type: text/plain
>> 
>>> Dear All,
>>> 
>>> I am looking for code to plot an X versus multiple-Y scatter
>>> plot (table below). I used the following R code; however, I am
>>> unable to figure out how to plot lines for Y1 and show the ID
>>> number on top of each panel.
>>> 
>>> library("lattice")
>>> xyplot(log(Y1) + log(Y2) ~ X | ID, layout = c(0, 4))
>>> 
>>> ID  X    Y1   Y2
>>> 1   0.5  0.4  1
>>> 1   0.2  0.1  2
>>> 1   1    2    3
>>> 1   4    5    4
>>> 2   2.5  3    5
>>> 2   3    4    6
>>> 2   0.5  0.4  7
>>> 2   0.2  0.1  8
>>> 2   1    2    9
>>> 2   4    5    8
>>> 2   2.5  3    7
>>> 2   3    4    6
>>> 3   0.5  0.4  5
>>> 3   0.2  0.1  4
>>> 3   1    2
>>> 3   4    5
>>> 3   2.5  3
>>> 3   3    4
>>> 
>>> Thank you very much for your suggestions.
>> 
>> Regards,
>> Shankar Lanke, Ph.D.
>> University at Buffalo
>> Office # 716-645-4853  Fax # 716-645-2886  Cell # 678-232-3567
>> 
>> 
>> 
>> 
>> ------------------------------
>> 
>> Message: 6
>> Date: Mon, 21 Mar 2011 21:02:54 -0700
>> From: Joshua Wiley <jwiley.psych at gmail.com>
>> To: Shankar Lanke <shankarlanke at gmail.com>
>> Cc: r-sig-mixed-models at r-project.org
>> Subject: Re: [R-sig-ME] Library (lattice)
>> Message-ID: <AANLkTimQ+QOTKqksosyRO7cY0Ah52gyL0DS6HOxRt05A at mail.gmail.com>
>> Content-Type: text/plain; charset=UTF-8
>> 
>> Dear Shankar,
>> 
>> It seems like this is more of a question for R-help than mixed
>> models, but in any case, I think this does what you want.  See
>> comments in the code for explanation.
>> 
>> 
>> HTH,
>> 
>> Josh
>> 
>> ############################
>> ## Only using part of your data
>> dat <- read.table(textConnection("
>> ID  X    Y1   Y2
>> 1   0.5  0.4  1
>> 1   0.2  0.1  2
>> 1   1    2    3
>> 1   4    5    4
>> 2   2.5  3    5
>> 2   3    4    6
>> 2   0.5  0.4  7
>> 2   0.2  0.1  8
>> 2   1    2    9
>> 2   4    5    8
>> 2   2.5  3    7
>> 2   3    4    6
>> 3   0.5  0.4  5
>> 3   0.2  0.1  4"), header = TRUE)
>> closeAllConnections()
>> 
>> ## Sorting will make the lines prettier
>> dat <- dat[order(dat$ID, dat$X), ]
>> ## Convert ID to a factor to show its labels
>> dat$ID <- factor(dat$ID)
>> 
>> require(lattice)
>> ## type = "l" specifies lines rather than points
>> xyplot(log(Y1) + log(Y2) ~ X | ID, data = dat, type = "l",
>>        layout = c(0, 4))
>> ############################
>> 
>> 
>> 
>> 
>> --
>> Joshua Wiley
>> Ph.D. Student, Health Psychology
>> University of California, Los Angeles
>> http://www.joshuawiley.com/
>> 
>> 
>> 
>> ------------------------------
>> 
>> Message: 7
>> Date: Tue, 22 Mar 2011 08:24:51 +0000
>> From: "Franssens, Samuel" <Samuel.Franssens at econ.kuleuven.be>
>> To: "r-sig-mixed-models at r-project.org" <r-sig-mixed-models at r-project.org>
>> Subject: Re: [R-sig-ME] generalized mixed linear models, glmmPQL and GLMER give very different results that both do not fit the data well...
>> Message-ID: <B12069DAD987FC4FB5B3B08B752F8E430144873A at ECONMBX2A.econ.kuleuven.ac.be>
>> Content-Type: text/plain; charset="iso-8859-1"
>> 
>> Thanks a lot for taking the time to help. At least now I know I am
>> not doing something silly. I re-analyzed data from an earlier
>> experiment, where we did a wrong analysis (treating proportions of
>> correct responses per subject as a continuous variable and then
>> doing a repeated-measures ANOVA, without correction;
>> http://ppw.kuleuven.be/reason/wim/data/COGNIT_2010_correctedproofs.pdf
>> , exp1). In that experiment we used the same design, but there was
>> no power manipulation (so you have the same 8 reasoning problems
>> per person and two types of problem, conflict & no conflict, but no
>> between-subjects variable). Here, glmer gives very nice results.
>> 
>> I look forward to reading a chapter in your book on GLMMs, as this
>> is a very common design in (social) psychology (and also very often
>> the wrong analysis is done, cf. above   :-)   )
>> 
>> Sam.
>> 
>> -----Original Message-----
>> From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf Of Douglas Bates
>> Sent: Monday 21 March 2011 6:36 PM
>> To: Franssens, Samuel
>> Cc: r-sig-mixed-models at r-project.org
>> Subject: Re: [R-sig-ME] generalized mixed linear models, glmmPQL and GLMER give very different results that both do not fit the data well...
>> 
>> On Sat, Mar 19, 2011 at 6:17 AM, Franssens, Samuel
>> <Samuel.Franssens at econ.kuleuven.be> wrote:
>>> Thanks for the insightful reply! I've studied linear mixed models
>>> at undergraduate level, but I was unfamiliar with GLMMs. It was
>>> kind of hard at first to find out how to analyze these data, but
>>> I'm glad I found this group. I'm also trying to read the book on
>>> ecology by Zuur (2009).
>> 
>>> My data come from an experiment in which people have to solve two
>>> types of reasoning problems, 4 of each kind. The noconflict
>>> problems are filler problems which are very easy, the conflict
>>> problems are harder, some people will get all 4 wrong, some will
>>> get all 4 right. I could leave the no conflict problems out of
>>> the analysis, but I want to conclude that the power manipulation
>>> had no effect on performance on noconflict problems, but it did
>>> on the conflict problems (to rule out alternative explanations
>>> such as decreased motivation/attention). I think an analysis on
>>> the full data set is preferred over 2 separate analyses per
>>> problem type.
>> 
>> Unfortunately the results don't indicate an interaction between the
>> power manipulation and the conflict/noconflict nature of the
>> question.
>> 
>> I would agree that in general we prefer to do an "omnibus" analysis
>> incorporating all the observed data.  Unfortunately with a binary
>> response there can be circumstances where you can't really get that
>> much out of the data, precisely because it is so homogeneous. 
>> Paradoxically the difficult cases to fit with generalized linear
>> models are those where the "signal" is very strong - the so-called
>> complete separation cases.  In this case having almost all of the
>> noconflict questions answered correctly means that you get very
>> little information from those responses.
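The complete-separation problem is easy to reproduce with plain glm()
(a toy example, not the data from this thread):

```r
## y is perfectly predicted by x >= 4, so the ML estimate of the
## slope is infinite; glm() typically warns that "fitted probabilities
## numerically 0 or 1 occurred" and the coefficients and standard
## errors blow up
x <- 1:6
y <- c(0, 0, 0, 1, 1, 1)
fit <- suppressWarnings(glm(y ~ x, family = binomial))
coef(fit)   # huge slope, not a usable estimate
```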
>> 
>>> I made the histograms and this confirms your expectations: for
>>> the no conflict problems, in all three groups nobody scores less
>>> than 3/4, which is also evident from the means in these cells.
>>> For the conflict problems, the data have a similar distribution
>>> in all three groups, most mass is on 4/4, with 0/4, 1/4, 2/4, 3/4
>>> getting almost equal mass. So indeed, variability is large.
>> 
>>> I tried to fit some models with effect of type also varying over
>>> subjects, because I thought this would make sense intuitively
>>> (some people will be very good at noconflict, but very bad at
>>> conflict; others will be good at both), but then I consistently
>>> get correlations between random effects of -1, which I thought
>>> was an indication of convergence problems, so I left this term
>>> out.
>> 
>>> I have read about the different estimation approaches used in
>>> GLMMs and I've noticed experts recommend not using PQL. So I
>>> will now focus on Laplace and GHQ (I knew this was possible in
>>> glmer, but did not know how large nAGQ should be, therefore I
>>> also used glmmML). nAGQ = 7 and 9 give almost completely the same
>>> results, but they differ a bit from Laplace:
>> 
>>> Generalized linear mixed model fit by the adaptive Gaussian Hermite approximation
>>> Formula: accuracy ~ f_power * f_type + (1 | f_subject)
>>>    Data: syllogisms
>>>    AIC   BIC logLik deviance
>>>  441.5 473.2 -213.7    427.5
>>> Random effects:
>>>  Groups    Name        Variance Std.Dev.
>>>  f_subject (Intercept) 4.7286   2.1745
>>> Number of obs: 688, groups: f_subject, 86
>>> 
>>> Fixed effects:
>>>                              Estimate Std. Error z value Pr(>|z|)
>>> (Intercept)                  1.470526   0.493530   2.980  0.00289 **
>>> f_powerhp                    0.124439   0.690869   0.180  0.85706
>>> f_powerlow                   2.020807   0.833203   2.425  0.01529 *
>>> f_typenoconflict             3.267835   0.643001   5.082 3.73e-07 ***
>>> f_powerhp:f_typenoconflict   0.219783   0.927490   0.237  0.81268
>>> f_powerlow:f_typenoconflict -0.005484   1.455748  -0.004  0.99699
>> 
>>> and
>> 
>>> Generalized linear mixed model fit by the Laplace approximation
>>> Formula: accuracy ~ f_power * f_type + (1 | f_subject)
>>>    Data: syllogisms
>>>  AIC   BIC logLik deviance
>>>  406 437.7   -196      392
>>> Random effects:
>>>  Groups    Name        Variance Std.Dev.
>>>  f_subject (Intercept) 4.9968   2.2353
>>> Number of obs: 688, groups: f_subject, 86
>>> 
>>> Fixed effects:
>>>                             Estimate Std. Error z value Pr(>|z|)
>>> (Intercept)                  1.50745    0.50507   2.985  0.00284 **
>>> f_powerhp                    0.13083    0.70719   0.185  0.85323
>>> f_powerlow                   2.04121    0.85308   2.393  0.01672 *
>>> f_typenoconflict             3.28715    0.64673   5.083 3.72e-07 ***
>>> f_powerhp:f_typenoconflict   0.21680    0.93165   0.233  0.81599
>>> f_powerlow:f_typenoconflict -0.01199    1.45807  -0.008  0.99344
>> 
>> In these cases the fits using adaptive Gauss-Hermite quadrature are
>> considered more reliable.  It's actually a good thing that the
>> results from aGHQ and the Laplace approximation are similar.  They
>> should be.
>> 
>>> I don't really know what criterion I should use to choose between
>>> models. Are any of the two actually useful, or are my data not
>>> fit for this kind of analysis? Is there another kind of analysis
>>> I could try?
>> 
>> I would be concerned about the very high estimated standard
>> deviation of the random effects for subject.  You may be able to
>> draw conclusions about the power treatment and the type of question
>> but I wouldn't be terribly confident of the results.  Others
>> reading the list may have better suggestions than I can provide.
>> 
>>> Thanks a lot for your time. I also think I understand the comment
>>> on the random effects :)
>>> 
>>> -----Original Message-----
>>> From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf Of Douglas Bates
>>> Sent: Saturday 19 March 2011 10:02 AM
>>> To: David Duffy
>>> Cc: Franssens, Samuel; r-sig-mixed-models at r-project.org
>>> Subject: Re: [R-sig-ME] generalized mixed linear models, glmmPQL and GLMER give very different results that both do not fit the data well...
>>> 
>>> On Fri, Mar 18, 2011 at 9:54 PM, David Duffy <davidD at qimr.edu.au>
>>> wrote:
>>>> On Fri, 18 Mar 2011, Franssens, Samuel wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> I have the following type of data: 86 subjects in three
>>>>> independent groups (high power vs low power vs control). Each
>>>>> subject solves 8 reasoning problems of two kinds: conflict
>>>>> problems and noconflict problems. I measure accuracy in
>>>>> solving the reasoning problems. To summarize: binary
>>>>> response, 1 within subject var (TYPE), 1 between subject var
>>>>> (POWER).
>>>>> 
>>>>> I wanted to fit the following model: for problem i, person j:
>>>>> 
>>>>>   logodds( Y_ij ) = b_0j + b_1j TYPE_ij
>>>>>   with b_0j = b_00 + b_01 POWER_j + u_0j
>>>>>   and  b_1j = b_10 + b_11 POWER_j
>>>>> 
>>>>> I think it makes sense, but I'm not sure. Here are the
>>>>> observed cell means:
>>>>>             conflict    noconflict
>>>>> control    0.6896552   0.9568966
>>>>> high       0.6935484   0.9677419
>>>>> low        0.8846154   0.9903846
>>>>> 
>>>>> GLMER gives me:
>>>>> Formula: accuracy ~ f_power * f_type + (1 | subject)
>>>>>    Data: syllogisms
>>>>> Random effects:
>>>>>  Groups  Name        Variance Std.Dev.
>>>>>  subject (Intercept) 4.9968   2.2353
>>>>> Number of obs: 688, groups: subject, 86
>>>>> 
>>>>> Fixed effects:
>>>>>                             Estimate Std. Error z value Pr(>|z|)
>>>>> (Intercept)                  1.50745    0.50507   2.985  0.00284 **
>>>>> f_powerhp                    0.13083    0.70719   0.185  0.85323
>>>>> f_powerlow                   2.04121    0.85308   2.393  0.01672 *
>>>>> f_typenoconflict             3.28715    0.64673   5.083 3.72e-07 ***
>>>>> f_powerhp:f_typenoconflict   0.21680    0.93165   0.233  0.81599
>>>>> f_powerlow:f_typenoconflict -0.01199    1.45807  -0.008  0.99344
>>>>> ---
>>>>> 
>>>>> The strange thing is that when you convert the estimates to
>>>>> probabilities, they are quite far off. For the control, conflict
>>>>> cell (the intercept), the estimate from glmer is 1.5 -> 81% and
>>>>> from glmmPQL is 1.14 -> 75%, whereas the observed mean is 68%.
>>>>> 
>>>>> Am I doing something wrong?
>>>> 
>>>> You are forgetting that your model includes a random intercept
>>>> for each subject.
>>> 
>>> To follow up on David's comment a bit, you have a very large
>>> estimated standard deviation for the random effects for subject.
>>> An estimated standard deviation of 2.23 in the glmer fit is
>>> huge.  On the scale of the linear predictor the random effects
>>> would swamp the contribution of the fixed effects - different
>>> subjects' random effects could easily differ by 8 or more (+/- 2
>>> s.d.).  Transformed to the probability scale, that takes you from
>>> virtually impossible to virtually certain.  To get that large an
>>> estimated standard deviation you would need different subjects
>>> exposed to the same experimental conditions who get
>>> completely different results (all wrong vs all right).  The
>>> estimate of the fixed effect for type is also very large,
>>> considering that this is a binary covariate.
>>> 
>>> I would suggest plotting the data, say by first grouping the
>>> subjects into groups according to power, then by subject, then by
>>> type.  You should see a lot of variability in responses
>>> according to subject within power.
>>> 
>>> Regarding the difference between the glmer and the glmmPQL
>>> results, bear in mind that glmmPQL is based on successive
>>> approximation of the GLMM by an LMM.  This case is rather extreme
>>> and the approximation is likely to break down.  In fact, for the
>>> glmer fit, I would recommend using nAGQ=7 or nAGQ=9 to see if
>>> that changes the estimates substantially.  The difference between
>>> the default evaluation of the deviance by the Laplace
>>> approximation and the more accurate adaptive Gauss-Hermite
>>> quadrature may be important in this case.
>>> 
>>> I just saw your other message about the properties of the random
>>> effects.  The values returned are the conditional modes of the
>>> random effects given the observed data.  The mean 0 and standard
>>> deviation of 2.3 apply to the marginal or unconditional
>>> distribution of the random effects, not the conditional
>>> distribution.
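The distinction can be seen directly in R (a sketch reusing the model
and names from this thread; `syllogisms` is the poster's data frame,
not available here):

```r
library(lme4)
fit <- glmer(accuracy ~ f_power * f_type + (1 | f_subject),
             data = syllogisms, family = binomial)
## conditional modes of the random effects, given the observed data
head(ranef(fit)$f_subject)
## the Std.Dev. reported by summary(fit) (about 2.2 here) describes the
## marginal distribution, not the spread of these conditional modes
```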
>>> 
>> 
>> 
>> 
>> ------------------------------
>> 
>> _______________________________________________
>> R-sig-mixed-models mailing list
>> R-sig-mixed-models at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>> 
>> End of R-sig-mixed-models Digest, Vol 51, Issue 54
>> **************************************************
>> 
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models



