[R-sig-ME] glmer random effects structure: a case

Tue Nov 17 10:41:14 CET 2015

Dear Thierry,
I have been struggling a bit to understand this paragraph of your answer:"Also note that adding a random intercept is equivalent to a compound symmetry correlation structure on that variable. Having some correlation structure is often sufficient."
But after some googling I think I have grasped the concept and this has helped me to understand better the issue. Thanks you and the others for your suggestions and helpful comments.Simone
Date: Mon, 16 Nov 2015 18:05:53 +0100
Subject: Re: [R-sig-ME] glmer random effects structure: a case
From: thierry.onkelinx at inbo.be
To: M.Fairbrother at bristol.ac.uk
CC: miseno77 at hotmail.com; r-sig-mixed-models at r-project.org

Dear Simone,
nlme models correlation structures on the residuals. Those are the epsilons in the model. Y = X * beta + epsilon. Don't confuse them with the output of resid(model). In the case of a Gaussian model, they happen to be the same. In case of a binomial, and many other distributions, they are not. Because there is not such thing as epsilon in a binomial model. Y = Binom(pi), logit(pi) = X * beta. Since there are no epsilons in the binomial model, the correlation structures for nlme don't work.
The correlation structures in nlme work within the finest level of the random effects. So in your case within dates rather than among dates. Which is not what you are looking for.
Also note that adding a random intercept is equivalent to a compound symmetry correlation structure on that variable. Having some correlation structure is often sufficient.
Have a look at the INLA package (www.rinla.org) if you want to model the temporal autocorrelation and willing to go Bayesian. INLA can model correlated random effects. Be warned: it not for the faint of heart.
inla(Var1 ~ SEX + AGE + Var2meanID + Var2varIND + f(DATE, model = "AR1") + f(IND, model = "iid"), data = mydata, family = "binomial")
Best regards, ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest 
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance 
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner 
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

2015-11-16 16:45 GMT+01:00 Malcolm Fairbrother <M.Fairbrother at bristol.ac.uk>:
Dear Simone,

Glad that was useful, and yes everything you say sounds right to me. For p

values, my understanding is that LRTs are fine, however. Or you could also

use a bootstrap, or MCMC.

As for autocorrelation, sorry, I hadn't been thinking through the

implications of your having a binary outcome variable. And I had been under

the impression that nlme could fit logit models, but a quick investigation

around the web suggests I was wrong about that. (If anyone knows otherwise,

please correct me/us.)

I am not an expert on dealing with autocorrelation in the case of binary

outcomes. Someone else may have advice about that, however.

Best wishes,

Malcolm

On 16 November 2015 at 16:07, Simone <miseno77 at hotmail.com> wrote:

> Dear Malcolm,

>

> Thank you so much for your detailed (very interesting references!) and

> helpful answer. I have centered Var2 by IND and I have used both the

> individual-specific mean Var2 (Var2meanIND) as well as the

> individual-specific centered Var2 (Var2varIND). I understand that this way

> I can test if the variation among individuals (first case) or within them

> (second case) relate to the response variable.

>

> As you suspected, DATEs are often close each other and it is quite

> probable I have autocorrelated data. You mentioned that the nlme package

> handles correlated residuals and I have found the code to do that but the

> problem is that I cannot do it for my case study since the distribution I

> am using is a binomial and nlme is only for linear mixed model, isn’it?

>

> For now, I have being using the below syntax and using LRT (anova) between

> reduced nested models to compute the p-value for each predictor. I know

> that LRT is very criticized but I have been asked to calculate p-values for

> each predictor.

>

>

> Mod1<-glmer(Var1 ~ SEX + AGE + Var2meanIND + Var2varIND + (1|DATE) +

> (1|IND) , data = mydata, family = binomial, control =

> glmerControl(optimizer="bobyqa"))

>

>

> This way I am not accounting at all for the autocorrelation, do you have

> any suggestions?

> Thanks again,

>

> Simone

>

> ------------------------------

> Date: Sun, 15 Nov 2015 16:41:28 +0100

> Subject: Re: [R-sig-ME] glmer random effects structure: a case

> From: M.Fairbrother at bristol.ac.uk

> To: miseno77 at hotmail.com

> CC: r-sig-mixed-models at r-project.org

>

>

> Dear Simone,

> How many INDs and DATEs are in your dataset? It sounds like you have

> plenty of INDs, but it's less clear how many DATEs you have. If you have a

> lot, you may have a situation of cross-classification: observations are

> nested both in INDs and DATEs, but neither of those is nested in the either.

> If you don't have many DATEs, it will make more sense to use fixed effects

> for those. And even if you have a lot, if the DATEs are located close to

> each other in time, you may have a lot of autocorrelation, and that

> requires other techniques. (In R, you may need to use the older package

> nlme, which allows for correlated residuals.)

> In any event, if INDs are in many cases captured on multiple DATEs, it

> definitely doesn't make sense to nest INDs in DATEs. Clearly they aren't

> nested. (Assuming I've understood your data structure correctly.)

> It also sounds like you should be centering Var2 by IND. This is pretty

> much de rigueur in multilevel models with x variables that vary within

> clusters. Enders and Tofighi 2007 is a useful, clear paper on this issue,

> and you might also want to look at these recent papers by me and colleagues

> in the Centre for Multilevel Modelling at Bristol:

> doi:10.1017/psrm.2014.7

> doi:10.1017/psrm.2013.24

> Basically, take the mean of Var2 for each individual, and enter that as a

> covariate. Then take the difference between the original Var2 and its mean

> for that individual, and enter that as a covariate as well. You'll get two

> pieces of information in your fitted model: the distinct "between" and

> "within" effects of Var2. It sounds like that is what you want.

> Hope that's useful.

> - Malcolm

>

>

>

> Date: Sat, 14 Nov 2015 17:48:46 +0100

> From: Simone <miseno77 at hotmail.com>

> Cc: "r-sig-mixed-models at r-project.org"

>         <r-sig-mixed-models at r-project.org>

> Subject: [R-sig-ME] glmer random effects structure: a case

>

>

> Hi all,

> I have a simple (but not that simple to me) question on how to specify the

> random structure in R.A binary response variable (Var1) has been measured

> from a number of individuals (IND) that have been susceptible of being

> captured over a number of dates (DATE). I suspect that Var1 might depend

> either on its sex (SEX), or age (AGE) or Var2 which is a continuous

> variable measured from each individual every time it is captured. Since

> Var2 is a measure of the quality of each individual, it is likely that some

> individuals will tend to have greater values of Var2 than others during the

> entire study period.Note that some individuals have been captured only one

> time, other two, other three and so on (very unbalanced). For each date an

> individual can be captured only one time.So, I have two groups: IND and

> DATE. I would think this is a two-level model with IND nested to DATE so

> that:

> model1 <- glmer(Var1 ~ SEX + AGE + Var2 + (1|DATE/IND), family = binomial,

> data = mydata)

> Does it make sense? I think i am not taking into account the fact that the

> mean of Var2 may be different among individuals but I don't know how to do

> that.I would really appreciate an answer to this question that I am sure

> would help me a lot to understand better how mixed models work.

>

>

>

        [[alternative HTML version deleted]]

_______________________________________________

R-sig-mixed-models at r-project.org mailing list

https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

	[[alternative HTML version deleted]]