[R-sig-ME] Model diagnostics show slope in residuals plot and slope on the observed vs fitted plot is different than y = x

Thierry Onkelinx thierry.onkelinx at inbo.be
Mon Oct 3 11:14:36 CEST 2016


Dear Carlos,

Can you send us the dataset? I have some more questions about the data, and
having the data would make it easier to look into this problem.

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2016-10-03 11:08 GMT+02:00 Carlos Familia <carlosfamilia at gmail.com>:

> Dear Thierry,
>
> If that is the case, would the initial model be of any use for inference,
> given that I have no other data or covariates and most likely won’t be
> able to get any?
>
> Many thanks,
> Carlos Família
>
> On 3 Oct 2016, at 09:57, Thierry Onkelinx <thierry.onkelinx at inbo.be>
> wrote:
>
> Dear Carlos,
>
> Is X another variable, or did you mean W? The graphs strongly suggest a
> missing covariate.
>
> Best regards,
>
> 2016-10-03 10:51 GMT+02:00 Carlos Familia <carlosfamilia at gmail.com>:
>
>> Dear Thierry,
>>
>> The image can be found here:
>> https://s4.postimg.org/lj5xf0rpp/Screen_Shot_2016_10_03_at_09_44_28.png
>>
>> Let me add one more thing to the discussion. I was trying different
>> models, including the following:
>>
>> lmer(Y ~ X + (1 | C), data = df)
>>
>> For this model the residuals are distributed as I expected. However, it
>> does not account for the same individual being measured under different
>> conditions. The plots can be found here:
>> https://s25.postimg.org/oupckrapr/Screen_Shot_2016_10_03_at_09_49_20.png
>>
>> Thank you,
>> Carlos Família
>>
>>
>>
>> On 3 Oct 2016, at 09:40, Thierry Onkelinx <thierry.onkelinx at inbo.be>
>> wrote:
>>
>> Dear Carlos,
>>
>> Can you show us a plot of the residuals versus W for each level of C? It
>> looks like either the relation between Y and W is not linear, or you are
>> missing an important covariate.
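A minimal sketch of the plot requested above, residuals versus W with one panel per level of C. It assumes the model form from the original post and uses simulated stand-in data (`sim`, with hypothetical effect sizes), since the real dataset is not part of the thread. The lattice package is one possible choice for the panelled plot.

```r
library(lme4)
library(lattice)

set.seed(1)
# Simulated stand-in data (hypothetical values, not the poster's dataset)
n_id <- 200
sim <- expand.grid(ID = factor(seq_len(n_id)),
                   C  = factor(c("c1", "c2", "c3")))
w_by_id <- rnorm(n_id)            # W is measured once per subject
id_eff  <- rnorm(n_id, sd = 1)    # subject-level random intercepts
sim$W <- w_by_id[as.integer(sim$ID)]
sim$Y <- 1 + 0.5 * sim$W + 0.3 * as.integer(sim$C) +
  id_eff[as.integer(sim$ID)] + rnorm(nrow(sim), sd = 0.5)

# Drop some rows at random so the data are unbalanced, as in the post
sim <- droplevels(sim[runif(nrow(sim)) > 0.2, ])

fit <- lmer(Y ~ W * C + (1 | ID), data = sim)
sim$res <- resid(fit)

# One panel per level of C; the loess smoother helps reveal curvature
p <- xyplot(res ~ W | C, data = sim, type = c("p", "smooth"),
            xlab = "W", ylab = "Residual")
print(p)
```

A smoother that bends away from zero in one or more panels would point towards a non-linear relation between Y and W.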
>>
>> Best regards,
>>
>> 2016-10-03 10:34 GMT+02:00 Carlos Familia <carlosfamilia at gmail.com>:
>>
>>> Hello,
>>>
>>> The image can be found here:
>>> https://s18.postimg.org/rbx2vh2ex/Pasted_Graphic_4.png
>>>
>>> Best regards,
>>> Carlos Família
>>>
>>> On 3 Oct 2016, at 08:50, Thierry Onkelinx <thierry.onkelinx at inbo.be>
>>> wrote:
>>>
>>> Dear Carlos,
>>>
>>> Your plot got stripped from your mail. Try sending it as a PDF, or put it
>>> somewhere online and send us the URL.
>>>
>>> Best regards,
>>>
>>> 2016-10-02 17:57 GMT+02:00 Carlos Familia <carlosfamilia at gmail.com>:
>>>
>>>> Hello,
>>>>
>>>> I have a fairly large, unbalanced dataset in which a continuous dependent
>>>> variable Y was measured under 3 different conditions (C) for about 3000
>>>> subjects (ID), although not all subjects have Y values for all 3
>>>> conditions. Additionally, there is a continuous measure W, which was
>>>> measured for all subjects.
>>>>
>>>> I am interested in testing the following:
>>>>
>>>> - Is the effect of W significant overall?
>>>> - Is the effect of W significant at each level of C?
>>>> - Is the effect of C significant?
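One possible way to approach these three questions, sketched on simulated stand-in data (`sim`, hypothetical values) with the model form given further down in this message. Likelihood-ratio tests between nested models fitted by maximum likelihood address the overall questions, and a reparameterised formula (an assumption of this sketch, not something proposed in the thread) reports one W slope per level of C.

```r
library(lme4)

set.seed(4)
# Simulated stand-in data (hypothetical values, not the poster's dataset)
n_id <- 200
sim <- expand.grid(ID = factor(seq_len(n_id)),
                   C  = factor(c("c1", "c2", "c3")))
w_by_id <- rnorm(n_id)
id_eff  <- rnorm(n_id, sd = 1)
sim$W <- w_by_id[as.integer(sim$ID)]
sim$Y <- 1 + 0.5 * sim$W + 0.3 * as.integer(sim$C) +
  id_eff[as.integer(sim$ID)] + rnorm(nrow(sim), sd = 0.5)

# Fit with ML (REML = FALSE) so that models with different fixed effects
# can be compared by likelihood-ratio tests
fit_full <- lmer(Y ~ W * C + (1 | ID), data = sim, REML = FALSE)
fit_noW  <- lmer(Y ~ C + (1 | ID), data = sim, REML = FALSE)
fit_noC  <- lmer(Y ~ W + (1 | ID), data = sim, REML = FALSE)

lrt_W <- anova(fit_noW, fit_full)  # W overall (main effect plus interaction)
lrt_C <- anova(fit_noC, fit_full)  # C overall (main effect plus interaction)
print(lrt_W)
print(lrt_C)

# Same model, reparameterised: one intercept and one W slope per level of C
fit_slopes <- lmer(Y ~ 0 + C + C:W + (1 | ID), data = sim, REML = FALSE)
print(summary(fit_slopes)$coefficients)
```

The reparameterised fit describes the same mean structure as `W * C`; only the meaning of the coefficients changes, so each `C:W` row is the estimated W slope within that condition.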
>>>>
>>>> In order to try to answer this, I have specified the following model
>>>> with lmer:
>>>>
>>>> lmer( Y ~ W * C + (1 | ID), data = df)
>>>>
>>>> This seems to properly reflect the structure of the data (I might be
>>>> wrong here; any suggestions would be welcome).
>>>> However, when producing the diagnostic plots I noticed a slope in the
>>>> residuals plot, and a slope different from y = x in the observed vs fitted
>>>> plot (as shown below). This made me question the validity of the model
>>>> for inference.
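The two diagnostic plots described here could be produced roughly as follows. This is a sketch on simulated stand-in data (`sim`, hypothetical values), not a reproduction of the poster's plots. Note that `fitted()` on an lmer fit returns conditional fitted values, which include the predicted random intercepts.

```r
library(lme4)

set.seed(2)
# Simulated stand-in data (hypothetical values, not the poster's dataset)
n_id <- 200
sim <- expand.grid(ID = factor(seq_len(n_id)),
                   C  = factor(c("c1", "c2", "c3")))
w_by_id <- rnorm(n_id)
id_eff  <- rnorm(n_id, sd = 1)
sim$W <- w_by_id[as.integer(sim$ID)]
sim$Y <- 1 + 0.5 * sim$W + 0.3 * as.integer(sim$C) +
  id_eff[as.integer(sim$ID)] + rnorm(nrow(sim), sd = 0.5)

fit <- lmer(Y ~ W * C + (1 | ID), data = sim)
sim$fit_val <- fitted(fit)   # conditional fitted values (with random intercepts)
sim$res     <- resid(fit)

op <- par(mfrow = c(1, 2))
# Residuals vs fitted: look for a trend around the zero line
plot(sim$fit_val, sim$res, xlab = "Fitted", ylab = "Residual")
abline(h = 0, lty = 2)
# Observed vs fitted: compare the point cloud with the line y = x
plot(sim$fit_val, sim$Y, xlab = "Fitted", ylab = "Observed")
abline(a = 0, b = 1, lty = 2)
par(op)

# Slope of observed on fitted, to compare against 1 numerically
slope_obs_fit <- unname(coef(lm(Y ~ fit_val, data = sim))[2])
print(slope_obs_fit)
```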
>>>>
>>>> Could I still use this model for inference? Should I specify a
>>>> different formula? Should I turn to lme and allow a different
>>>> variance for each level of condition (C)? Any ideas?
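For reference, the lme option mentioned here (a separate residual variance per level of C) can be sketched with `nlme::lme` and a `varIdent` variance function. The data are simulated stand-ins (`sim`, hypothetical values, with residual standard deviations that deliberately differ by condition), so this only illustrates the syntax, not whether it resolves the pattern in the poster's plots.

```r
library(nlme)

set.seed(3)
# Simulated stand-in data (hypothetical values, not the poster's dataset)
n_id <- 200
sim <- expand.grid(ID = factor(seq_len(n_id)),
                   C  = factor(c("c1", "c2", "c3")))
w_by_id <- rnorm(n_id)
id_eff  <- rnorm(n_id, sd = 1)
res_sd  <- c(0.5, 1.0, 1.5)      # residual SD differs by condition
sim$W <- w_by_id[as.integer(sim$ID)]
sim$Y <- 1 + 0.5 * sim$W + 0.3 * as.integer(sim$C) +
  id_eff[as.integer(sim$ID)] +
  rnorm(nrow(sim), sd = res_sd[as.integer(sim$C)])

# Homogeneous residual variance (comparable to the lmer model in the post)
fit_hom <- lme(Y ~ W * C, random = ~ 1 | ID, data = sim)

# Heterogeneous residual variance: one variance per level of C
fit_het <- lme(Y ~ W * C, random = ~ 1 | ID, data = sim,
               weights = varIdent(form = ~ 1 | C))

# Both fits use REML with identical fixed effects, so they can be compared
cmp <- anova(fit_hom, fit_het)
print(cmp)
```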
>>>>
>>>> I would really appreciate it if anyone could help me with this.
>>>>
>>>> Thanks in advance,
>>>> Carlos Família
>>>>
