[R-sig-ME] Meaning of perfect correlation between by-intercept and by-slope adjustments

Mon May 16 20:09:07 CEST 2011

On Fri, May 13, 2011 at 10:51 PM, Douglas Bates <bates at stat.wisc.edu> wrote:
> On Fri, May 13, 2011 at 3:32 PM, Petar Milin <pmilin at ff.uns.ac.rs> wrote:
>>
>> On 13/05/11 22:00, Douglas Bates wrote:
>>>
>>> On Fri, May 13, 2011 at 12:35 PM, Petar Milin<pmilin at ff.uns.ac.rs>  wrote:
>>>>
>>>> Hello! The simplified model that I have is:
>>>> lmer(Y ~ F1 + F2 + C1 + (1+F1|participants) + (1|items))
>>>> F1 and F2 are categorical predictors (factors) and C1 is a covariate
>>>> (continuous predictor). F1 has five levels.
>>>> By-participant adjustments for F1 are justified (the likelihood ratio
>>>> test is highly significant). However, what puzzles me is a perfect
>>>> correlation between two levels of F1. The others are quite high, but not
>>>> perfect. I wonder what this means, exactly? Is there some "lack of
>>>> information" which leads to problems in estimating variances?
>>>
>>> I think of the estimation criterion for mixed models (the REML
>>> criterion or the deviance) as being like a smoothing criterion that
>>> seeks to balance complexity of the model versus fidelity to the data.
>>> It happens that models in which the variance covariance matrix of the
>>> random effects is singular or nearly singular are considered to have
>>> low complexity so the criterion will push the optimization to that
>>> extreme when doing so does not introduce substantially worse fits.
>>>
>>> One way around this is to avoid fitting models with vector-valued
>>> random effects and, instead, use two terms with simple scalar random
>>> effects, as in
>>>
>>> lmer(Y ~ F1 + F2 + C1 + (1|participants) + (1|F1:participants) + (1|items))
>>
>> I am always hesitant to go for the scalar version. As far as I understand,
>> this implies homoscedasticity across levels of F1, but correct me if I am
>> wrong. In my model, I am not sure if that would be correct.
>
> You are correct.  However, the model with vector-valued random effects
> is not supported by the data in the sense that it converges to a
> singular variance-covariance matrix.  When you have 5 random effects
> associated with each level of participant and you allow a general 5 by 5
> positive semi-definite variance-covariance matrix, you are attempting
> to estimate 15 variance-covariance parameters (5 variances and 10
> covariances) for that one term.  You need a lot of data to be able to
> do that.
>
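
For reference, the count of 15 is q(q+1)/2 with q = 5. Below is a minimal
base-R sketch of that arithmetic, set next to the two variance parameters
used by the scalar alternative (1|participants) + (1|F1:participants). The
helper name n_cov_params is made up for illustration and is not an lme4
function, and q = 5 assumes the default treatment contrasts for F1:

```r
# Free parameters in an unstructured q x q covariance matrix:
# q variances plus q * (q - 1) / 2 covariances.
# n_cov_params is a hypothetical helper, not part of lme4.
n_cov_params <- function(q) q * (q + 1) / 2

# (1 + F1 | participants) with a five-level F1: intercept + 4 contrasts,
# assuming default treatment contrasts.
q <- 5
vector_valued <- n_cov_params(q)

# (1 | participants) + (1 | F1:participants): one variance per term.
scalar_terms <- 2 * n_cov_params(1)

cat("vector-valued term:", vector_valued, "parameters\n")
cat("two scalar terms:  ", scalar_terms, "parameters\n")
```

So the vector-valued term asks the data to support 15 parameters where the
scalar version asks for 2.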

I have been reading up on this, trying to understand it and deal with it
properly. Bottom line: would using vector-valued random effects in the
above case, with a perfect correlation between the random adjustments,
be a case of overfitting?
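
As far as I understand it, a perfect correlation means the estimated
variance-covariance matrix has lost a dimension, i.e. it is singular. Here is
a small base-R sketch with made-up numbers (these are not my fitted values)
that shows this for the 2 by 2 case:

```r
# Made-up standard deviations and correlation, for illustration only.
sds <- c(2, 3)
rho <- 1                                   # perfect correlation

# Build the covariance matrix from standard deviations and correlation.
R     <- matrix(c(1, rho, rho, 1), nrow = 2)
Sigma <- diag(sds) %*% R %*% diag(sds)

# A singular matrix has at least one zero eigenvalue and zero determinant.
ev <- eigen(Sigma, symmetric = TRUE)$values

print(Sigma)
print(ev)           # smallest eigenvalue is numerically zero
print(det(Sigma))   # determinant is numerically zero
```

With rho = 1 the two random adjustments are exact multiples of each other, so
one of them carries no information of its own.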

Thanks!
Petar