[R-sig-ME] LMM diagnostics: conditional residuals correlated highly with fitted values
Yizhou Ma
maxxx848 at umn.edu
Wed Oct 7 17:09:07 CEST 2015
Hi Thierry,
Thank you for your reply and sorry for the HTML thing. Below is my
summary(model) output.
Y, Drink, and Age are continuous variables.
Gender has two levels, F and M.
Family_ID is a factor.
Linear mixed model fit by maximum likelihood ['lmerMod']
Formula: Y ~ Drink * Gender + Age + (1 | Family_ID)
Data: data

     AIC      BIC   logLik deviance df.resid
  1046.4   1074.0   -516.2   1032.4      372

Scaled residuals:
     Min       1Q   Median       3Q      Max
-2.67228 -0.56085 -0.02968  0.66166  2.91452

Random effects:
 Groups    Name        Variance Std.Dev.
 Family_ID (Intercept) 0.3550   0.5958
 Residual              0.6162   0.7850
Number of obs: 379, groups: Family_ID, 189

Fixed effects:
               Estimate Std. Error t value
(Intercept)     1.10309    0.43921   2.511
Drink           0.16425    0.08031   2.045
Gender.M       -0.19364    0.10874  -1.781
Age            -0.03377    0.01489  -2.268
Drink:Gender.M -0.13647    0.10681  -1.278

Correlation of Fixed Effects:
         (Intr) Drnk   Gndr.M Age
Drink    -0.098
Gender.M -0.040 -0.249
Age      -0.985  0.158 -0.054
Drnk:G.M  0.042 -0.737 -0.021 -0.085

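
As a reading aid for the random-effects block above, here is a minimal sketch of the arithmetic that links the two variance components. It only re-uses the numbers printed in the output; the variable names are made up for illustration. The intraclass correlation (ICC) is the share of the total variance that the family intercept accounts for.

```r
# Variance components copied from the summary output above.
var_family   <- 0.3550   # Family_ID (Intercept) variance
var_residual <- 0.6162   # Residual variance

# Intraclass correlation: family variance over total variance.
icc <- var_family / (var_family + var_residual)

# The Std.Dev. column is the square root of the Variance column.
sd_family   <- sqrt(var_family)
sd_residual <- sqrt(var_residual)

# Average number of siblings per family: 379 observations in 189 families.
obs_per_family <- 379 / 189

print(round(c(icc = icc,
              sd_family = sd_family,
              sd_residual = sd_residual,
              obs_per_family = obs_per_family), 4))
```

With roughly two siblings per family on average, each family intercept is estimated from very few observations, which is worth keeping in mind when looking at the conditional residuals.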
Thank you very much,
Cherry
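
For anyone who wants to reproduce the checks discussed in this thread without access to the data, the following is a minimal sketch on simulated stand-in data. Everything about the simulation is an assumption: the predictor distributions, the age range, and the seed are invented, and only the model formula, the family structure (189 families of 1-4 siblings), and the rounded estimates come from the output above. It may help to compare what the diagnostics look like on data where the model is known to be correct with what they look like on the real data.

```r
library(lme4)

set.seed(1)  # arbitrary seed, for reproducibility only

# Simulated stand-in data (NOT the data from this thread):
# 189 families with 1-4 siblings each.
n_fam   <- 189
sizes   <- sample(1:4, n_fam, replace = TRUE)
fam_idx <- rep(seq_len(n_fam), times = sizes)
n       <- length(fam_idx)

sim <- data.frame(
  Family_ID = factor(fam_idx),
  Drink     = rnorm(n),                                    # assumed distribution
  Gender    = factor(sample(c("F", "M"), n, replace = TRUE)),
  Age       = rnorm(n, mean = 29, sd = 3)                  # hypothetical ages
)

# Response generated from the fitted model form, using rounded estimates
# and the reported standard deviations (0.5958 for family, 0.7850 residual).
is_male <- as.numeric(sim$Gender == "M")
u       <- rnorm(n_fam, sd = 0.5958)
sim$Y   <- 1.10 + 0.16 * sim$Drink - 0.19 * is_male - 0.034 * sim$Age -
  0.14 * sim$Drink * is_male + u[fam_idx] + rnorm(n, sd = 0.7850)

fit <- lmer(Y ~ Drink * Gender + Age + (1 | Family_ID),
            data = sim, REML = FALSE)

cond_res <- resid(fit)    # conditional residuals
cond_fit <- fitted(fit)   # conditional fitted values (include family intercepts)

# The three correlations mentioned in the original post.
r_res_fit   <- cor(cond_res, cond_fit)
r_fit_y     <- cor(cond_fit, sim$Y)
r_res_drink <- cor(cond_res, sim$Drink)
print(c(res_vs_fitted = r_res_fit,
        fitted_vs_y   = r_fit_y,
        res_vs_drink  = r_res_drink))

# Graphical checks: residuals vs fitted, and a normal QQ plot.
print(plot(fit))
qqnorm(cond_res)
qqline(cond_res)
```

Swapping `sim` for the real data frame runs the same checks on the actual fit.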
On Wed, Oct 7, 2015 at 5:14 AM, Thierry Onkelinx
<thierry.onkelinx at inbo.be> wrote:
> Dear Cherry,
>
> Please don't post in HTML. Have a look at the posting guide.
>
> You'll need to provide more information. What is the class of each variable
> (continuous, count, presence/absence, factor, ...)? What is the output of
> summary(model)?
>
> Best regards,
>
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
> Forest
> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
> Kliniekstraat 25
> 1070 Anderlecht
> Belgium
>
> To call in the statistician after the experiment is done may be no more than
> asking him to perform a post-mortem examination: he may be able to say what
> the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does not
> ensure that a reasonable answer can be extracted from a given body of data.
> ~ John Tukey
>
> 2015-10-06 17:15 GMT+02:00 Yizhou Ma <maxxx848 at umn.edu>:
>>
>> Dear LMM experts:
>>
>> I am pretty new to using LMM and I have found the following situation
>> bewildering as I was trying to do diagnostics with my fitted model: my
>> conditional residuals correlated highly with the fitted values.
>>
>> I have a dataset with multiple families, each with 1-4 siblings. I am trying
>> to regress Y onto EVs including Drink, Gender, & Age, while using a random
>> intercept for family. This is the model I used:
>> model <- lmer(Y ~ Drink * Gender + Age + (1 | Family_ID), data, REML = FALSE)
>>
>> After fitting the model, I used
>> plot(model)
>> to see the relationship between conditional residuals and fitted values. I
>> expect them to be uncorrelated and I expect to see homoscedasticity.
>>
>> Yet to my surprise there is a high correlation (~0.5) between the residuals
>> and the fitted values (see here <http://imgur.com/pPsG4aR>). I know from
>> GLM that this usually suggests nonlinear relationships between the EVs and
>> the DV.
>>
>> I read some online posts (post1
>> <http://stats.stackexchange.com/questions/43566/strange-pattern-in-residual-plot-from-mixed-effect-model>,
>> post2
>> <http://stats.stackexchange.com/questions/168179/correlation-between-standardized-residuals-and-fitted-values-in-a-linear-mixed-e/168210#168210>)
>> that suggest this can result from a poor model fit. So I tried a few
>> different models, including: 1) log-transforming Drink, which is originally
>> positively skewed; 2) adding random slopes for Drink, Age, etc. None of these
>> changes has led to a substantial difference in the correlation between
>> residuals and fitted values.
>>
>> Some other info:
>> 1) my overall model fit is not poor, as indicated by the correlation between
>> fitted values & Y, which is around 0.8;
>> 2) most variables in my model have a normal, or at least symmetrical,
>> distribution;
>> 3) conditional residuals are normally distributed, as shown in qqplots;
>> 4) conditional residuals are not correlated with any fixed effects, such as
>> Drink or Age.
>>
>> I have two guesses as to what is going on:
>> 1) maybe the fact that each family is a different size actually violates
>> assumptions of the model?
>> 2) or maybe there is something wrong with estimation of the random effect
>> (family intercept)?
>>
>> I'd really appreciate your insights as to what is going on here and whether
>> there are any problems with my model.
>>
>> Thank you very much,
>> Cherry
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
>