[R-sig-ME] glmer random effects structure: a case
Thierry Onkelinx
thierry.onkelinx at inbo.be
Mon Nov 16 18:05:53 CET 2015
Dear Simone,
nlme models correlation structures on the residuals. Those are the epsilons
in the model. Y = X * beta + epsilon. Don't confuse them with the output of
resid(model). In the case of a Gaussian model, they happen to be the same.
In case of a binomial, and many other distributions, they are not. Because
there is not such thing as epsilon in a binomial model. Y = Binom(pi),
logit(pi) = X * beta. Since there are no epsilons in the binomial model,
the correlation structures for nlme don't work.
The correlation structures in nlme work within the finest level of the
random effects. So in your case within dates rather than among dates. Which
is not what you are looking for.
Also note that adding a random intercept is equivalent to a compound
symmetry correlation structure on that variable. Having some correlation
structure is often sufficient.
Have a look at the INLA package (www.rinla.org) if you want to model the
temporal autocorrelation and willing to go Bayesian. INLA can model
correlated random effects. Be warned: it not for the faint of heart.
inla(Var1 ~ SEX + AGE + Var2meanID + Var2varIND + f(DATE, model = "AR1") +
f(IND, model = "iid"), data = mydata, family = "binomial")
Best regards,
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey
2015-11-16 16:45 GMT+01:00 Malcolm Fairbrother <M.Fairbrother op bristol.ac.uk>
:
> Dear Simone,
> Glad that was useful, and yes everything you say sounds right to me. For p
> values, my understanding is that LRTs are fine, however. Or you could also
> use a bootstrap, or MCMC.
> As for autocorrelation, sorry, I hadn't been thinking through the
> implications of your having a binary outcome variable. And I had been under
> the impression that nlme could fit logit models, but a quick investigation
> around the web suggests I was wrong about that. (If anyone knows otherwise,
> please correct me/us.)
> I am not an expert on dealing with autocorrelation in the case of binary
> outcomes. Someone else may have advice about that, however.
> Best wishes,
> Malcolm
>
>
> On 16 November 2015 at 16:07, Simone <miseno77 op hotmail.com> wrote:
>
> > Dear Malcolm,
> >
> > Thank you so much for your detailed (very interesting references!) and
> > helpful answer. I have centered Var2 by IND and I have used both the
> > individual-specific mean Var2 (Var2meanIND) as well as the
> > individual-specific centered Var2 (Var2varIND). I understand that this
> way
> > I can test if the variation among individuals (first case) or within them
> > (second case) relate to the response variable.
> >
> > As you suspected, DATEs are often close each other and it is quite
> > probable I have autocorrelated data. You mentioned that the nlme package
> > handles correlated residuals and I have found the code to do that but the
> > problem is that I cannot do it for my case study since the distribution I
> > am using is a binomial and nlme is only for linear mixed model, isn’it?
> >
> > For now, I have being using the below syntax and using LRT (anova)
> between
> > reduced nested models to compute the p-value for each predictor. I know
> > that LRT is very criticized but I have been asked to calculate p-values
> for
> > each predictor.
> >
> >
> > Mod1<-glmer(Var1 ~ SEX + AGE + Var2meanIND + Var2varIND + (1|DATE) +
> > (1|IND) , data = mydata, family = binomial, control =
> > glmerControl(optimizer="bobyqa"))
> >
> >
> > This way I am not accounting at all for the autocorrelation, do you have
> > any suggestions?
> > Thanks again,
> >
> > Simone
> >
> > ------------------------------
> > Date: Sun, 15 Nov 2015 16:41:28 +0100
> > Subject: Re: [R-sig-ME] glmer random effects structure: a case
> > From: M.Fairbrother op bristol.ac.uk
> > To: miseno77 op hotmail.com
> > CC: r-sig-mixed-models op r-project.org
> >
> >
> > Dear Simone,
> > How many INDs and DATEs are in your dataset? It sounds like you have
> > plenty of INDs, but it's less clear how many DATEs you have. If you have
> a
> > lot, you may have a situation of cross-classification: observations are
> > nested both in INDs and DATEs, but neither of those is nested in the
> either.
> > If you don't have many DATEs, it will make more sense to use fixed
> effects
> > for those. And even if you have a lot, if the DATEs are located close to
> > each other in time, you may have a lot of autocorrelation, and that
> > requires other techniques. (In R, you may need to use the older package
> > nlme, which allows for correlated residuals.)
> > In any event, if INDs are in many cases captured on multiple DATEs, it
> > definitely doesn't make sense to nest INDs in DATEs. Clearly they aren't
> > nested. (Assuming I've understood your data structure correctly.)
> > It also sounds like you should be centering Var2 by IND. This is pretty
> > much de rigueur in multilevel models with x variables that vary within
> > clusters. Enders and Tofighi 2007 is a useful, clear paper on this issue,
> > and you might also want to look at these recent papers by me and
> colleagues
> > in the Centre for Multilevel Modelling at Bristol:
> > doi:10.1017/psrm.2014.7
> > doi:10.1017/psrm.2013.24
> > Basically, take the mean of Var2 for each individual, and enter that as a
> > covariate. Then take the difference between the original Var2 and its
> mean
> > for that individual, and enter that as a covariate as well. You'll get
> two
> > pieces of information in your fitted model: the distinct "between" and
> > "within" effects of Var2. It sounds like that is what you want.
> > Hope that's useful.
> > - Malcolm
> >
> >
> >
> > Date: Sat, 14 Nov 2015 17:48:46 +0100
> > From: Simone <miseno77 op hotmail.com>
> > Cc: "r-sig-mixed-models op r-project.org"
> > <r-sig-mixed-models op r-project.org>
> > Subject: [R-sig-ME] glmer random effects structure: a case
> >
> >
> > Hi all,
> > I have a simple (but not that simple to me) question on how to specify
> the
> > random structure in R.A binary response variable (Var1) has been measured
> > from a number of individuals (IND) that have been susceptible of being
> > captured over a number of dates (DATE). I suspect that Var1 might depend
> > either on its sex (SEX), or age (AGE) or Var2 which is a continuous
> > variable measured from each individual every time it is captured. Since
> > Var2 is a measure of the quality of each individual, it is likely that
> some
> > individuals will tend to have greater values of Var2 than others during
> the
> > entire study period.Note that some individuals have been captured only
> one
> > time, other two, other three and so on (very unbalanced). For each date
> an
> > individual can be captured only one time.So, I have two groups: IND and
> > DATE. I would think this is a two-level model with IND nested to DATE so
> > that:
> > model1 <- glmer(Var1 ~ SEX + AGE + Var2 + (1|DATE/IND), family =
> binomial,
> > data = mydata)
> > Does it make sense? I think i am not taking into account the fact that
> the
> > mean of Var2 may be different among individuals but I don't know how to
> do
> > that.I would really appreciate an answer to this question that I am sure
> > would help me a lot to understand better how mixed models work.
> >
> >
> >
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models op r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
[[alternative HTML version deleted]]
More information about the R-sig-mixed-models
mailing list