[R-sig-ME] LMM diagnostics: conditional residuals correlated highly with fitted values

Wed Oct 7 17:15:21 CEST 2015

Can you elaborate on what Y is? Does it have a lower bound? And if so, do
you have observations near that bound? E.g. Y must be non-negative and
the dataset contains observations close to 0. A density plot of Y would be
useful.
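
A minimal sketch of that check in base R. The data frame below is a
simulated stand-in (the real `data` is not shown in this thread), and the
bound of 0 and the tolerance of 0.1 are illustrative assumptions, not values
taken from the actual dataset.

```r
# Hypothetical stand-in for the poster's data frame; replace with the real `data`.
set.seed(1)
data <- data.frame(Y = abs(rnorm(379, mean = 1, sd = 1)))

# Range of the response: a minimum sitting at (or very near) a natural limit
# such as 0 hints at a bounded variable.
print(range(data$Y))

# Share of observations close to the suspected lower bound.
# Both the bound (0) and the tolerance (0.1) are illustrative choices.
lower_bound <- 0
near_bound <- mean(data$Y - lower_bound < 0.1)
print(near_bound)

# Kernel density estimate of Y with a rug of the raw observations.
# A density that is cut off abruptly at the bound, rather than tapering out,
# suggests observations piling up against it.
d <- density(data$Y)
plot(d, main = "Density of Y")
rug(data$Y)
```

With the lattice package loaded, densityplot(~ Y, data = data) gives a
comparable picture.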

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2015-10-07 17:09 GMT+02:00 Yizhou Ma <maxxx848 at umn.edu>:

> Hi Thierry,
>
> Thank you for your reply and sorry for the HTML thing. Below is my
> summary(model) output.
>
> Y, Drink, and Age are continuous variables
> Gender is F & M.
> Family_ID is a factor.
>
> Linear mixed model fit by maximum likelihood  ['lmerMod']
> Formula: Y ~ Drink * Gender + Age + (1 | Family_ID)
>    Data: data
>
>      AIC      BIC   logLik deviance df.resid
>   1046.4   1074.0   -516.2   1032.4      372
>
> Scaled residuals:
>      Min       1Q   Median       3Q      Max
> -2.67228 -0.56085 -0.02968  0.66166  2.91452
>
> Random effects:
>  Groups    Name        Variance Std.Dev.
>  Family_ID (Intercept) 0.3550   0.5958
>  Residual              0.6162   0.7850
> Number of obs: 379, groups:  Family_ID, 189
>
> Fixed effects:
>                Estimate Std. Error t value
> (Intercept)     1.10309    0.43921   2.511
> Drink           0.16425    0.08031   2.045
> Gender.M       -0.19364    0.10874  -1.781
> Age            -0.03377    0.01489  -2.268
> Drink:Gender.M -0.13647    0.10681  -1.278
>
> Correlation of Fixed Effects:
>          (Intr) Drnk   Gndr.M Age
> Drink    -0.098
> Gender.M -0.040 -0.249
> Age      -0.985  0.158 -0.054
> Drnk:G.M  0.042 -0.737 -0.021 -0.085
>
> Thank you very much,
> Cherry
>
> On Wed, Oct 7, 2015 at 5:14 AM, Thierry Onkelinx
> <thierry.onkelinx at inbo.be> wrote:
> > Dear Cherry,
> >
> > Please don't post in HTML. Have a look at the posting guide.
> >
> > You'll need to provide more information. What is the class of each
> > variable (continuous, count, presence/absence, factor, ...)? What is the
> > output of summary(model)?
> >
> > Best regards,
> >
> > 2015-10-06 17:15 GMT+02:00 Yizhou Ma <maxxx848 at umn.edu>:
> >>
> >> Dear LMM experts:
> >>
> >> I am pretty new to using LMM and I have found the following situation
> >> bewildering as I was trying to do diagnostics with my fitted model: my
> >> conditional residuals correlate highly with the fitted values.
> >>
> >> I have a dataset with multiple families, each with 1-4 siblings. I am
> >> trying to regress Y on explanatory variables (EVs) including Drink,
> >> Gender, and Age, with a random intercept for family. This is the model
> >> I used:
> >> library(lme4)
> >> model <- lmer(Y ~ Drink * Gender + Age + (1 | Family_ID),
> >>               data = data, REML = FALSE)
> >>
> >> After fitting the model, I used
> >> plot(model)
> >> to see the relationship between conditional residuals and fitted
> >> values. I expect them to be uncorrelated and I expect to see
> >> homoscedasticity.
> >>
> >> Yet to my surprise there is a high correlation (~0.5) between the
> >> residuals and the fitted values (see here <http://imgur.com/pPsG4aR>).
> >> I know from GLM that this usually suggests nonlinear relationships
> >> between the EVs and the DV.
> >>
> >> I read some online posts (post1
> >> <http://stats.stackexchange.com/questions/43566/strange-pattern-in-residual-plot-from-mixed-effect-model>,
> >> post2
> >> <http://stats.stackexchange.com/questions/168179/correlation-between-standardized-residuals-and-fitted-values-in-a-linear-mixed-e/168210#168210>)
> >> that suggest this can result from a poor model fit. So I tried a few
> >> different models, including: 1) log-transforming Drink, which is
> >> originally positively skewed; 2) adding random slopes for Drink, Age,
> >> etc. None of these changes has led to a substantial difference in the
> >> correlation between residuals and fitted values.
> >>
> >> Some other info:
> >> 1) my overall model fit is not poor, as indicated by the correlation
> >> between fitted values & Y, which is around 0.8;
> >> 2) most variables in my model have a normal, or at least symmetrical,
> >> distribution;
> >> 3) conditional residuals are normally distributed, as shown in Q-Q plots;
> >> 4) conditional residuals are not correlated with any of the fixed-effect
> >> predictors, such as Drink or Age.
> >>
> >> I have two guesses as to what is going on:
> >> 1) maybe the fact that families differ in size actually violates
> >> assumptions of the model?
> >> 2) or maybe there is something wrong with the estimation of the random
> >> effect (family intercept)?
> >>
> >> I'd really appreciate your insights as to what is going on here and
> >> whether there are any problems with my model.
> >>
> >> Thank you very much,
> >> Cherry
> >>
> >> _______________________________________________
> >> R-sig-mixed-models at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >
> >
>
