[R-sig-ME] Meaning of perfect correlation between by-intercept and by-slope adjustments

Petar Milin pmilin at ff.uns.ac.rs
Tue May 17 11:56:52 CEST 2011


> Message: 5
> Date: Sat, 14 May 2011 08:48:51 -0400
> From: Ben Bolker<bbolker at gmail.com>
> To:r-sig-mixed-models at r-project.org
> Subject: Re: [R-sig-ME] Meaning of perfect correlation between
> 	by-intercept and by-slope adjustments
> Message-ID:<4DCE7A33.8090601 at gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> On 11-05-16 02:09 PM, Petar Milin wrote:
>> >  On Fri, May 13, 2011 at 10:51 PM, Douglas Bates<bates at stat.wisc.edu>  wrote:
>>> >>  On Fri, May 13, 2011 at 3:32 PM, Petar Milin<pmilin at ff.uns.ac.rs>  wrote:
>>>> >>>
>>>> >>>  On 13/05/11 22:00, Douglas Bates wrote:
>>>>> >>>>
>>>>> >>>>  On Fri, May 13, 2011 at 12:35 PM, Petar Milin<pmilin at ff.uns.ac.rs>   wrote:
>>>>>> >>>>>
>>>>>> >>>>>  Hello! Simplified model that I have is:
>>>>>> >>>>>  lmer(Y ~ F1 + F2 + C1 + (1+F1|participants) + (1|items))
>>>>>> >>>>>  F1 and F2 are categorical predictors (factors) and C1 is a covariable
>>>>>> >>>>>  (continuous predictor). F1 has five levels.
>>>>>> >>>>>  By-participant adjustments for F1 are justified (likelihood ratio test is
>>>>>> >>>>>  highly significant). However, what puzzles me is perfect correlation
>>>>>> >>>>>  between
>>>>>> >>>>>  two levels of F1. Others are quite high, but not perfect. I wonder what
>>>>>> >>>>>  this
>>>>>> >>>>>  means, exactly? Is there some "lack of information" which leads to
>>>>>> >>>>>  problems
>>>>>> >>>>>  in estimating variances?
>>>>> >>>>
>>>>> >>>>  I think of the estimation criterion for mixed models (the REML
>>>>> >>>>  criterion or the deviance) as being like a smoothing criterion that
>>>>> >>>>  seeks to balance complexity of the model versus fidelity to the data.
>>>>> >>>>  It happens that models in which the variance covariance matrix of the
>>>>> >>>>  random effects is singular or nearly singular are considered to have
>>>>> >>>>  low complexity so the criterion will push the optimization to that
>>>>> >>>>  extreme when doing so does not introduce substantially worse fits.
>>>>> >>>>
>>>>> >>>>  One way around this is to avoid fitting models with vector-valued
>>>>> >>>>  random effects and, instead, use two terms with simple scalar random
>>>>> >>>>  effects, as in
>>>>> >>>>
>>>>> >>>>  lmer(Y ~ F1 + F2 + C1 + (1|participants) + (1|F1:participants) +
>>>>> >>>>  (1|items))
>>>> >>>
>>>> >>>  I am always hesitant to go for scalar version. As far as I understand, this
>>>> >>>  implies homoscedasticity across levels of F1, but correct me if I am wrong.
>>>> >>>  In my model, I am not sure if that would be correct.
>>> >>
>>> >>  You are correct.  However, the model with vector-valued random effects
>>> >>  is not supported by the data in the sense that it converges to a
>>> >>  singular variance-covariance matrix.  When you have 5 random effects
>>> >>  associated with each level of participant and you allow the 5 by 5
>>> >>  positive semi-definite variance-covariance matrix you are attempting
>>> >>  to estimate 15 variance parameters for that one term.  You need a lot
>>> >>  of data to be able to do that.
>>> >>
>> >  
>> >  I am reading various stuff, trying to understand and cope with this
>> >  properly. Bottom line, using vector-valued random effects, in the
>> >  above case -- with a perfect correlation between random adjustments,
>> >  would be a case of overfitting?
>> >  
>    I think so.
>    If you wanted a justification for dropping back to the homoscedastic
> model, you could compare the likelihoods of the heteroscedastic and
> homoscedastic model fits, which you can probably establish are a pair of
> nested models (and whose likelihoods may actually be identical).
>

I forgot to mention it, but I ran a likelihood ratio test immediately
after Doug's suggestion.
However, conceptually, I do not like comparing a model that is
suspected of overfitting with any other model. I wonder whether that is
correct at all: AIC, BIC and logLik are measures of goodness-of-fit, and
that particular fit is "wrong", so to say.
Furthermore, I am getting a better fit for the model that uses the
vector-valued random effect, which, as we now know, overfits.
Honestly, I wonder whether I should rely on a likelihood ratio test at
all if the variance-covariance matrix of the random effects is singular
or nearly singular.
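
For reference, here is a minimal sketch of the comparison discussed
above. The real data set is not shown in this thread, so the data frame
`d` is simulated, and its sizes and the object names `m_vec` and `m_sca`
are assumptions made only for illustration. It also assumes a recent
lme4, which provides isSingular().

```r
# Sketch only: simulated data stand in for the real data set.
library(lme4)

set.seed(1)
n_part <- 30   # hypothetical number of participants
n_item <- 20   # hypothetical number of items

d <- expand.grid(participants = factor(seq_len(n_part)),
                 items        = factor(seq_len(n_item)))
d$F1 <- factor(sample(letters[1:5], nrow(d), replace = TRUE))  # five levels
d$F2 <- factor(sample(c("x", "y"), nrow(d), replace = TRUE))
d$C1 <- rnorm(nrow(d))

# Response generated with scalar by-participant and by-item effects only,
# so the vector-valued model below is richer than the data require.
part_eff <- rnorm(n_part)
item_eff <- rnorm(n_item, sd = 0.5)
d$Y <- part_eff[as.integer(d$participants)] +
       item_eff[as.integer(d$items)] +
       0.3 * d$C1 + rnorm(nrow(d))

# Vector-valued by-participant effects for F1 (5 x 5 covariance matrix).
m_vec <- lmer(Y ~ F1 + F2 + C1 + (1 + F1 | participants) + (1 | items),
              data = d, REML = FALSE)

# Scalar alternative from earlier in the thread (homoscedastic across F1).
m_sca <- lmer(Y ~ F1 + F2 + C1 + (1 | participants) +
                (1 | F1:participants) + (1 | items),
              data = d, REML = FALSE)

# Number of variance-covariance parameters in a k x k symmetric matrix.
k <- nlevels(d$F1)
k * (k + 1) / 2      # 15 for k = 5

isSingular(m_vec)    # TRUE signals a fit on the boundary
VarCorr(m_vec)       # look for correlations of +1 or -1, or zero variances
anova(m_sca, m_vec)  # likelihood ratio comparison of the two fits
```

If isSingular() reports TRUE for the vector-valued fit, the estimate
sits on the boundary of the parameter space, so the p-value printed by
anova() rests on a chi-square approximation that may not hold there. It
is probably safer to read it as a rough guide than as an exact test.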


Many thanks for the great discussion!
Best,
Petar



