[R-sig-ME] glmer random effects structure: a case

Tue Nov 17 10:43:57 CET 2015

Dear Simone,

See chapter 5 of Zuur et al (2009) for more details.

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2015-11-17 10:41 GMT+01:00 Simone <miseno77 op hotmail.com>:

> Dear Thierry,
>
> I have been struggling a bit to understand this paragraph of your answer:
> "Also note that adding a random intercept is equivalent to a compound
> symmetry correlation structure on that variable. Having some correlation
> structure is often sufficient."
>
> But after some googling I think I have grasped the concept and this has
> helped me to understand better the issue. Thanks you and the others for
> your suggestions and helpful comments.
> Simone
>
> ------------------------------
> Date: Mon, 16 Nov 2015 18:05:53 +0100
> Subject: Re: [R-sig-ME] glmer random effects structure: a case
> From: thierry.onkelinx op inbo.be
> To: M.Fairbrother op bristol.ac.uk
> CC: miseno77 op hotmail.com; r-sig-mixed-models op r-project.org
>
>
> Dear Simone,
>
> nlme models correlation structures on the residuals. Those are the
> epsilons in the model. Y = X * beta + epsilon. Don't confuse them with the
> output of resid(model). In the case of a Gaussian model, they happen to be
> the same. In case of a binomial, and many other distributions, they are
> not. Because there is not such thing as epsilon in a binomial model. Y =
> Binom(pi), logit(pi) = X * beta. Since there are no epsilons in the
> binomial model, the correlation structures for nlme don't work.
>
> The correlation structures in nlme work within the finest level of the
> random effects. So in your case within dates rather than among dates. Which
> is not what you are looking for.
>
> Also note that adding a random intercept is equivalent to a compound
> symmetry correlation structure on that variable. Having some correlation
> structure is often sufficient.
>
> Have a look at the INLA package (www.rinla.org) if you want to model the
> temporal autocorrelation and willing to go Bayesian. INLA can model
> correlated random effects. Be warned: it not for the faint of heart.
>
> inla(Var1 ~ SEX + AGE + Var2meanID + Var2varIND + f(DATE, model = "AR1") +
> f(IND, model = "iid"), data = mydata, family = "binomial")
>
> Best regards,
>
>
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
> Forest
> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
> Kliniekstraat 25
> 1070 Anderlecht
> Belgium
>
> To call in the statistician after the experiment is done may be no more
> than asking him to perform a post-mortem examination: he may be able to say
> what the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does not
> ensure that a reasonable answer can be extracted from a given body of data.
> ~ John Tukey
>
> 2015-11-16 16:45 GMT+01:00 Malcolm Fairbrother <
> M.Fairbrother op bristol.ac.uk>:
>
> Dear Simone,
> Glad that was useful, and yes everything you say sounds right to me. For p
> values, my understanding is that LRTs are fine, however. Or you could also
> use a bootstrap, or MCMC.
> As for autocorrelation, sorry, I hadn't been thinking through the
> implications of your having a binary outcome variable. And I had been under
> the impression that nlme could fit logit models, but a quick investigation
> around the web suggests I was wrong about that. (If anyone knows otherwise,
> please correct me/us.)
> I am not an expert on dealing with autocorrelation in the case of binary
> outcomes. Someone else may have advice about that, however.
> Best wishes,
> Malcolm
>
>
> On 16 November 2015 at 16:07, Simone <miseno77 op hotmail.com> wrote:
>
> > Dear Malcolm,
> >
> > Thank you so much for your detailed (very interesting references!) and
> > helpful answer. I have centered Var2 by IND and I have used both the
> > individual-specific mean Var2 (Var2meanIND) as well as the
> > individual-specific centered Var2 (Var2varIND). I understand that this
> way
> > I can test if the variation among individuals (first case) or within them
> > (second case) relate to the response variable.
> >
> > As you suspected, DATEs are often close each other and it is quite
> > probable I have autocorrelated data. You mentioned that the nlme package
> > handles correlated residuals and I have found the code to do that but the
> > problem is that I cannot do it for my case study since the distribution I
> > am using is a binomial and nlme is only for linear mixed model, isn’it?
> >
> > For now, I have being using the below syntax and using LRT (anova)
> between
> > reduced nested models to compute the p-value for each predictor. I know
> > that LRT is very criticized but I have been asked to calculate p-values
> for
> > each predictor.
> >
> >
> > Mod1<-glmer(Var1 ~ SEX + AGE + Var2meanIND + Var2varIND + (1|DATE) +
> > (1|IND) , data = mydata, family = binomial, control =
> > glmerControl(optimizer="bobyqa"))
> >
> >
> > This way I am not accounting at all for the autocorrelation, do you have
> > any suggestions?
> > Thanks again,
> >
> > Simone
> >
> > ------------------------------
> > Date: Sun, 15 Nov 2015 16:41:28 +0100
> > Subject: Re: [R-sig-ME] glmer random effects structure: a case
> > From: M.Fairbrother op bristol.ac.uk
> > To: miseno77 op hotmail.com
> > CC: r-sig-mixed-models op r-project.org
> >
> >
> > Dear Simone,
> > How many INDs and DATEs are in your dataset? It sounds like you have
> > plenty of INDs, but it's less clear how many DATEs you have. If you have
> a
> > lot, you may have a situation of cross-classification: observations are
> > nested both in INDs and DATEs, but neither of those is nested in the
> either.
> > If you don't have many DATEs, it will make more sense to use fixed
> effects
> > for those. And even if you have a lot, if the DATEs are located close to
> > each other in time, you may have a lot of autocorrelation, and that
> > requires other techniques. (In R, you may need to use the older package
> > nlme, which allows for correlated residuals.)
> > In any event, if INDs are in many cases captured on multiple DATEs, it
> > definitely doesn't make sense to nest INDs in DATEs. Clearly they aren't
> > nested. (Assuming I've understood your data structure correctly.)
> > It also sounds like you should be centering Var2 by IND. This is pretty
> > much de rigueur in multilevel models with x variables that vary within
> > clusters. Enders and Tofighi 2007 is a useful, clear paper on this issue,
> > and you might also want to look at these recent papers by me and
> colleagues
> > in the Centre for Multilevel Modelling at Bristol:
> > doi:10.1017/psrm.2014.7
> > doi:10.1017/psrm.2013.24
> > Basically, take the mean of Var2 for each individual, and enter that as a
> > covariate. Then take the difference between the original Var2 and its
> mean
> > for that individual, and enter that as a covariate as well. You'll get
> two
> > pieces of information in your fitted model: the distinct "between" and
> > "within" effects of Var2. It sounds like that is what you want.
> > Hope that's useful.
> > - Malcolm
> >
> >
> >
> > Date: Sat, 14 Nov 2015 17:48:46 +0100
> > From: Simone <miseno77 op hotmail.com>
> > Cc: "r-sig-mixed-models op r-project.org"
> >         <r-sig-mixed-models op r-project.org>
> > Subject: [R-sig-ME] glmer random effects structure: a case
> >
> >
> > Hi all,
> > I have a simple (but not that simple to me) question on how to specify
> the
> > random structure in R.A binary response variable (Var1) has been measured
> > from a number of individuals (IND) that have been susceptible of being
> > captured over a number of dates (DATE). I suspect that Var1 might depend
> > either on its sex (SEX), or age (AGE) or Var2 which is a continuous
> > variable measured from each individual every time it is captured. Since
> > Var2 is a measure of the quality of each individual, it is likely that
> some
> > individuals will tend to have greater values of Var2 than others during
> the
> > entire study period.Note that some individuals have been captured only
> one
> > time, other two, other three and so on (very unbalanced). For each date
> an
> > individual can be captured only one time.So, I have two groups: IND and
> > DATE. I would think this is a two-level model with IND nested to DATE so
> > that:
> > model1 <- glmer(Var1 ~ SEX + AGE + Var2 + (1|DATE/IND), family =
> binomial,
> > data = mydata)
> > Does it make sense? I think i am not taking into account the fact that
> the
> > mean of Var2 may be different among individuals but I don't know how to
> do
> > that.I would really appreciate an answer to this question that I am sure
> > would help me a lot to understand better how mixed models work.
> >
> >
> >
>
>         [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models op r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
>
>

	[[alternative HTML version deleted]]