[R-sig-ME] fixed effects correlated with the intercept

John Maindonald john.maindonald at anu.edu.au
Fri Mar 23 23:54:58 CET 2007


Isn't this what might be expected?  Center the covariate about
its mean and, depending on the detailed variance-covariance
structure, the correlation may well reduce to zero.

Check this out with a model created using lm(), where it is
easier to follow the detail.  If you write the model
  y = a + b(x - mean(x)),
the estimates of a and b are uncorrelated.  If x is not centered,
then you have
  y = [a - b mean(x)] + bx = adash + bx.

Then
adash = a - b mean(x)
involves b, and is clearly correlated with b. By making mean(x)
large and positive, or large and negative, relative to the spread
of x, the correlation can be made arbitrarily close to -1 or 1,
respectively.
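
A minimal sketch of that check with lm(), using simulated data.  The
sample size, coefficients and variable names below are assumptions
made up for illustration; they are not taken from the model under
discussion.

```r
# Simulated data: every value here is an illustrative assumption.
set.seed(1)
n <- 200
x <- rnorm(n, mean = 50, sd = 2)   # covariate whose mean is far from zero
y <- 3 + 0.5 * x + rnorm(n)

# Uncentered fit: y = adash + b x
fit_raw <- lm(y ~ x)

# Centered fit: y = a + b (x - mean(x))
xc <- x - mean(x)
fit_centered <- lm(y ~ xc)

# Correlation between the intercept and slope estimates in each fit
cor_raw      <- cov2cor(vcov(fit_raw))[1, 2]
cor_centered <- cov2cor(vcov(fit_centered))[1, 2]

print(c(raw = cor_raw, centered = cor_centered))
```

With mean(x) large and positive, the uncentered correlation should be
close to -1, while the centered one should be zero up to rounding
error.  The slope estimate is the same in both fits; only the
intercept and its correlation with the slope change.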

What do you mean when you say "I have two covariates that I
consider to be controls in my model"?  Do you mean that these
code for observations that you are treating as controls?  Or
what?

John Maindonald             email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.


On 24 Mar 2007, at 6:07 AM, Austin Frank wrote:

> And hello again!
>
> I'm getting a result that is very confusing to me and I'm hoping for
> some advice or clarification.  I have two covariates that I consider
> to be controls in my model.  When I include either in the model, the
> fixed effect shows a strong correlation ( > .85 ) with the intercept.
> The result of including these factors is that the estimated intercept
> is much lower than I would expect.  Is there any conclusion to be
> drawn from these correlations?  Normally when I see correlations among
> fixed effects I worry about collinearity.  I'm absolutely confused
> about what it would mean for a covariate to be collinear with the
> estimated population mean.  Any help is appreciated in clearing this
> up.
>
> It's possible that the appropriate conclusion is that I'm overfitting.
> I'm not sure this is the case.  The degrees of freedom in the model is
> still relatively low compared to the number of data points (12 df on
> ~2500 observations).  Is overfitting still the most likely culprit?
>
> One attempt at dealing with the above problem was to remove the
> intercept from the model.  This causes lmer to estimate a coefficient
> for each of the levels in the first factor in the model.  I think that
> this treatment did not resolve whatever problem there is with these
> two covariates-- now instead of being correlated with the intercept,
> they are correlated with both levels of the split factor.
>
> While this approach didn't resolve my original issue, it did bring up
> a few others.  First of all, the coef() method fails on a model with
> no intercept for the fixed effects, giving the error "unable to align
> random and fixed effects".  Is this a known issue?  Is there a
> workaround?
>
> Second, while the estimates for both levels of the split factor are
> shown to be significantly different from zero using mcmcsamp, I'm
> still interested in whether there is a difference between the two
> levels.  What's the appropriate test to check the null hypothesis that
> the difference between the two parameter estimates is zero?
>
> Thanks again,
> /au
>
> -- 
> Austin Frank
> http://aufrank.net
> GPG Public Key (D7398C2F): http://aufrank.net/personal.asc
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models



