[R-sig-ME] Meaning of perfect correlation between by-intercept and by-slope adjustments
Petar Milin
pmilin at ff.uns.ac.rs
Tue May 17 11:56:52 CEST 2011
> Message: 5
> Date: Sat, 14 May 2011 08:48:51 -0400
> From: Ben Bolker <bbolker at gmail.com>
> To: r-sig-mixed-models at r-project.org
> Subject: Re: [R-sig-ME] Meaning of perfect correlation between
> 	by-intercept and by-slope adjustments
> Message-ID: <4DCE7A33.8090601 at gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> On 11-05-16 02:09 PM, Petar Milin wrote:
>> On Fri, May 13, 2011 at 10:51 PM, Douglas Bates <bates at stat.wisc.edu> wrote:
>>> On Fri, May 13, 2011 at 3:32 PM, Petar Milin <pmilin at ff.uns.ac.rs> wrote:
>>>>
>>>> On 13/05/11 22:00, Douglas Bates wrote:
>>>>>
>>>>> On Fri, May 13, 2011 at 12:35 PM, Petar Milin <pmilin at ff.uns.ac.rs> wrote:
>>>>>>
>>>>>> Hello! The simplified model that I have is:
>>>>>> lmer(Y ~ F1 + F2 + C1 + (1 + F1 | participants) + (1 | items))
>>>>>> F1 and F2 are categorical predictors (factors) and C1 is a covariable
>>>>>> (continuous predictor). F1 has five levels.
>>>>>> By-participant adjustments for F1 are justified (the likelihood ratio
>>>>>> test is highly significant). However, what puzzles me is the perfect
>>>>>> correlation between two levels of F1. The others are quite high, but
>>>>>> not perfect. I wonder what this means, exactly? Is there some "lack
>>>>>> of information" that leads to problems in estimating the variances?
>>>>>
>>>>> I think of the estimation criterion for mixed models (the REML
>>>>> criterion or the deviance) as being like a smoothing criterion that
>>>>> seeks to balance complexity of the model against fidelity to the data.
>>>>> It happens that models in which the variance-covariance matrix of the
>>>>> random effects is singular or nearly singular are considered to have
>>>>> low complexity, so the criterion will push the optimization to that
>>>>> extreme when doing so does not introduce a substantially worse fit.
>>>>>
>>>>> One way around this is to avoid fitting models with vector-valued
>>>>> random effects and, instead, use two terms with simple scalar random
>>>>> effects, as in
>>>>>
>>>>> lmer(Y ~ F1 + F2 + C1 + (1 | participants) + (1 | F1:participants) +
>>>>>      (1 | items))
>>>>
>>>> I am always hesitant to go for the scalar version. As far as I
>>>> understand, it implies homoscedasticity across the levels of F1, but
>>>> correct me if I am wrong. In my model, I am not sure that would be
>>>> correct.
>>>
>>> You are correct. However, the model with vector-valued random effects
>>> is not supported by the data, in the sense that it converges to a
>>> singular variance-covariance matrix. When you have 5 random effects
>>> associated with each level of participant and you allow the full 5 by 5
>>> positive semi-definite variance-covariance matrix, you are attempting
>>> to estimate 15 variance parameters (5 variances and 10 covariances)
>>> for that one term. You need a lot of data to be able to do that.
>>>
>>
>> I am reading various things, trying to understand and deal with this
>> properly. Bottom line: would using vector-valued random effects in the
>> case above -- with a perfect correlation between the random adjustments
>> -- be a case of overfitting?
>>
> I think so.
> If you wanted a justification for dropping back to the homoscedastic
> model, you could compare the likelihoods of the heteroscedastic and
> homoscedastic model fits, which you can probably establish are a pair of
> nested models (and whose likelihoods may actually be identical).
>
I forgot to mention that I did run the likelihood ratio test immediately
after Doug's suggestion.
Conceptually, however, I do not like comparing a model that is suspected
of overfitting with any other model. I wonder whether that is correct at
all: AIC, BIC and logLik are measures of goodness-of-fit, and here one of
the fits is "wrong", so to say.
Furthermore, I am getting a better fit for the model that uses the
vector-valued random effect, which, as we now know, overfits.
Honestly, I wonder whether I should use the likelihood ratio test at all
if the variance-covariance matrix of the random effects is singular or
nearly singular.
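For reference, the two specifications from the thread could be set up and
compared along these lines. This is only a sketch: the data frame `dat` and
its columns (Y, F1, F2, C1, participants, items) are placeholders matching
the names used above, and the lme4 package is assumed to be installed.

```r
## Sketch only: 'dat' is a hypothetical data frame with columns
## Y, F1, F2, C1, participants, items.
library(lme4)

## Vector-valued random effects: a 5 by 5 variance-covariance matrix
## by participant (5 variances + 10 covariances = 15 parameters).
m_vec <- lmer(Y ~ F1 + F2 + C1 + (1 + F1 | participants) + (1 | items),
              data = dat)

## Scalar alternative suggested by Doug: two simple variance components,
## homoscedastic across the levels of F1.
m_sca <- lmer(Y ~ F1 + F2 + C1 + (1 | participants) +
                (1 | F1:participants) + (1 | items),
              data = dat)

## Inspect the estimated correlations; values at +/-1 signal a singular
## (degenerate) fit of the random-effects variance-covariance matrix.
VarCorr(m_vec)

## Likelihood ratio comparison of the (presumably nested) fits;
## anova() refits both models with ML before comparing them.
anova(m_sca, m_vec)
```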
Many thanks for the great discussion!
Best,
Petar