[R-sig-ME] Model diagnostics show slope in residuals plot and slope on the observed vs fitted plot is different than y = x

Thierry Onkelinx thierry.onkelinx at inbo.be
Mon Oct 3 10:57:53 CEST 2016


Dear Carlos,

Is X another variable, or did you mean W? The graphs strongly suggest a
missing covariate.
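
A minimal sketch of the diagnostic requested earlier in this thread
(residuals versus W for each level of C), applied to the model
Y ~ W * C + (1 | ID). The names Y, W, C, ID and df come from the thread;
the simulated data, the level names c1 to c3, and the helper column `res`
are assumptions for illustration only and should be replaced with the real
data.

```r
# Hypothetical sketch: `df` is simulated here purely for illustration;
# substitute the real data frame with columns Y, W, C and ID.
library(lme4)
library(lattice)

set.seed(1)
n_id <- 200
ids <- data.frame(ID = factor(seq_len(n_id)),
                  W  = rnorm(n_id),   # one W value per subject
                  u  = rnorm(n_id))   # subject-level random intercept
# merge() with no shared columns returns the Cartesian product,
# i.e. every subject crossed with every condition.
df <- merge(ids, data.frame(C = factor(c("c1", "c2", "c3"))))
df$Y <- 1 + 0.5 * df$W + as.integer(df$C) + df$u +
  rnorm(nrow(df), sd = 0.5)

fit <- lmer(Y ~ W * C + (1 | ID), data = df)
df$res <- resid(fit)

# Residuals against W, one panel per level of C, with a loess smoother.
# A visible trend or curve in a panel hints at non-linearity in W or at
# a covariate missing from the model.
p <- xyplot(res ~ W | C, data = df, type = c("p", "smooth"),
            xlab = "W", ylab = "Residual")
print(p)
```

With real data, only the lmer(), resid() and xyplot() calls are needed;
rows dropped by lmer() because of missing values would have to be handled
before assigning the residuals back to df.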

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2016-10-03 10:51 GMT+02:00 Carlos Familia <carlosfamilia at gmail.com>:

> Dear Thierry,
>
> The image can be found here:
> https://s4.postimg.org/lj5xf0rpp/Screen_Shot_2016_10_03_at_09_44_28.png
>
> One more point for the discussion: while trying different models, I also
> fitted the following:
>
> lmer(Y ~ X + (1 | C), data = df)
>
> For this model the residuals are distributed as I expected. However, it
> ignores the fact that the same individual is measured under different
> conditions. The plots can be found here:
> https://s25.postimg.org/oupckrapr/Screen_Shot_2016_10_03_at_09_49_20.png
>
> Thank you,
> Carlos Família
>
>
>
> On 3 Oct 2016, at 09:40, Thierry Onkelinx <thierry.onkelinx at inbo.be>
> wrote:
>
> Dear Carlos,
>
> Can you show us a plot of the residuals versus W for each level of C? It
> looks like either the relation of Y and W is not linear, or you are missing
> an important covariate.
>
> Best regards,
>
> ir. Thierry Onkelinx
>
> 2016-10-03 10:34 GMT+02:00 Carlos Familia <carlosfamilia at gmail.com>:
>
>> Hello,
>>
>> The image can be found here:
>> https://s18.postimg.org/rbx2vh2ex/Pasted_Graphic_4.png
>>
>> Best regards,
>> Carlos Família
>>
>> On 3 Oct 2016, at 08:50, Thierry Onkelinx <thierry.onkelinx at inbo.be>
>> wrote:
>>
>> Dear Carlos,
>>
>> Your plot got stripped from your mail. Try sending it as a PDF, or put it
>> somewhere online and send us the URL.
>>
>> Best regards,
>>
>> ir. Thierry Onkelinx
>>
>> 2016-10-02 17:57 GMT+02:00 Carlos Familia <carlosfamilia at gmail.com>:
>>
>>> Hello,
>>>
>>> I have a fairly large, unbalanced dataset in which a continuous
>>> dependent variable Y was measured under 3 different conditions (C) for
>>> about 3000 subjects (ID), although not all subjects have Y values for all
>>> 3 conditions. Additionally, a continuous measure W was recorded for all
>>> subjects.
>>>
>>> I am interested in testing the following:
>>>
>>> - Is the effect of W significant overall?
>>> - Is the effect of W significant at each level of C?
>>> - Is the effect of C significant?
>>>
>>> In order to try to answer this, I have specified the following model
>>> with lmer:
>>>
>>> lmer(Y ~ W * C + (1 | ID), data = df)
>>>
>>> This seems to properly reflect the structure of the data (I might be
>>> wrong here; any suggestions would be welcome).
>>> However, when checking the diagnostic plots I noticed a slope in the
>>> residuals plot, and the slope of the observed vs fitted plot differs
>>> from y = x (as shown below). This made me question the validity of the
>>> model for inference.
>>>
>>> Could I still use this model for inference? Should I specify a different
>>> formula? Should I turn to lme and allow a different variance for each
>>> level of condition (C)? Any ideas?
>>>
>>> I would really appreciate it if anyone could help me with this.
>>>
>>> Thanks in advance,
>>> Carlos Família
>>>
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models