[R-sig-ME] Mixed-models and condition number
Christina Bogner
christina.bogner at uni-bayreuth.de
Thu Feb 5 09:12:49 CET 2009
Dear Stephan,
thank you very much for your response and the detailed list of
literature. I knew about Belsley (1991b) and used it on the design
matrix of the fixed effects. My (purely empirical) results were the
following:
on a small data set of 58 values and the mixed-effects model:
mymodel=lme(log(calcium) ~ soil.horizon+flow.region+content.of.silt,
data=mydata, random=~1|plot)
with soil.horizon and flow.region: factors
content.of.silt: continuous covariate
1. Mean-centering the continuous covariate decreased the collinearity
between the intercept term and the continuous covariate (judged by
corFixed from summary.lme and by Belsley 1991b applied to the design
matrix of the fixed effects) and decreased kappa (of the design matrix
of the fixed effects) by a factor of 12.
2. Scaling the covariate to obtain fixed-effects estimates of comparable
size decreased kappa by a factor of 4, but had no effect on the
correlation of the fixed effects.
3. I compared kappa of the design matrix of the mixed-effects model with
kappa of the "triangular matrix derived from the fixed-effects model
matrix after removing the random effects" in lme4, as proposed by
Douglas Bates. The influence of mean-centering and scaling on kappa was
comparable for both. The kappas of the triangular matrix and of the
fixed-effects design matrix differed little for the mean-centered and
scaled model, but considerably for the non-scaled, non-centered one.
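
The comparison in points 1 and 2 can be sketched in base R roughly as
follows. This is only an illustration under stated assumptions: the
data are simulated (they are not the soil data), the variable names
silt.c, silt.z and log.calcium and the helper fit_with() are made up
for the example, and lm() stands in for lme() because only the
fixed-effects design matrix is examined here. The correlations from a
mixed model (corFixed) will generally differ from the lm() ones.

```r
# Simulated stand-in for the real data set (all values are assumptions).
set.seed(1)
n <- 58
mydata <- data.frame(
  soil.horizon    = factor(rep(c("A", "B", "C"), length.out = n)),
  flow.region     = factor(rep(c("matrix", "preferential"), length.out = n)),
  content.of.silt = rnorm(n, mean = 40, sd = 8)
)
mydata$log.calcium <- 1 + 0.02 * mydata$content.of.silt + rnorm(n, sd = 0.3)

# Mean-centered and standardized versions of the continuous covariate.
mydata$silt.c <- mydata$content.of.silt - mean(mydata$content.of.silt)
mydata$silt.z <- mydata$silt.c / sd(mydata$content.of.silt)

# Hypothetical helper: fit the same fixed-effects structure with a
# chosen version of the covariate (always entered as the last term).
fit_with <- function(covariate) {
  f <- reformulate(c("soil.horizon", "flow.region", covariate),
                   response = "log.calcium")
  lm(f, data = mydata)
}
fits <- list(raw      = fit_with("content.of.silt"),
             centered = fit_with("silt.c"),
             scaled   = fit_with("silt.z"))

# 2-norm condition number of each fixed-effects design matrix.
kappas <- sapply(fits, function(fit) kappa(model.matrix(fit), exact = TRUE))

# Correlation matrices of the coefficient estimates (names dropped so
# that the fits can be compared directly).
cors <- lapply(fits, function(fit) unname(cov2cor(vcov(fit))))

# Correlation between the intercept (first coefficient) and the
# covariate (last coefficient) in each fit.
cor_int_cov <- sapply(cors, function(m) m[1, ncol(m)])

print(round(kappas, 1))
print(round(cor_int_cov, 3))
```

With a covariate whose mean is large relative to its spread, one would
expect kappa to drop from the raw to the centered to the scaled fit,
and the intercept-covariate correlation to shrink after centering.
Rescaling a column only rescales its coefficient, so the correlation
matrix of the estimates should be the same for the centered and the
scaled fit.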
I will try to find a mathematician at my university who would like to
play around with mixed-models ;-).
Thanks again
Christina
Stephan Kolassa wrote:
> Hi Christina,
> let me start by saying that I don't know of anyone looking at
> conditioning of design matrices in a mixed model environment. Might be a
> nice topic to have an M. Sc. student play around with empirically. The
> problem with ill-conditioning in fixed-effects models basically comes
> down to high variances in the parameter estimates, so one could
> actually build a mixed model with an ill-conditioned design matrix and
> play around with small changes to simulated observations, checking
> whether inferences or estimates exhibit "large" variance.
>
> If you find out anything about this, would you let me know?
>
> That said, my recent interest has been in collinearity between
> predictors, which is not exactly conditioning, but reasonably close to
> it. I'd recommend you look at Hill & Adkins (2001) and the collinearity
> diagnostics they recommend. Belsley (1991a) wrote an entire monograph
> about them, but there are also shorter introductions, e.g., Belsley
> (1991b).
>
> Scaling the columns of X to equal Euclidean length (usually to length 1)
> before diagnosing collinearity appears to be the accepted procedure, so I
> think scaling would be a good starting point in the mixed model, too.
> However, there is a discussion as to whether to first remove the
> constant column from X and subtract the column mean from each of the
> remaining columns.
>
> Marquardt (1980) claims that centering removes "nonessential ill
> conditioning." Weisberg (1980) and Montgomery and Peck (1982) also
> advocate centering.
>
> Other practitioners maintain that centering removes meaningful
> information from X, such as collinearity with the constant column, and
> should not be used (Belsley et al., 1980; Belsley, 1984a, 1984b, 1986,
> 1991a, 1991b; Echambadi & Hess, 2007; Hill & Adkins, 2001). For
> example, Simon and Lesage (1988) found that collinearity with the
> constant column introduces numerical instability, which is mitigated
> but not prevented by employing collinearity diagnostics after
> centering X. In addition, these problems are not confined to the
> constant coefficient, but extend to all estimates.
>
> For a very lively debate on this topic see Belsley (1984a); Cook (1984);
> Gunst (1984); Snee and Marquardt (1984); Wood (1984); Belsley (1984b).
> The consensus seems to be that centering cannot be advised or rejected
> once and for all; rather, whether or not to center data depends on
> the problem one is facing.
>
> HTH,
> Stephan
>
>
> * Belsley, D. A., Kuh, E., & Welsch, R. E. (1980). Regression
> Diagnostics: Identifying Influential Data and Sources of Collinearity.
> New York, NY: John Wiley & Sons.
>
> * Belsley, D. A. (1984a, May). Demeaning Conditioning Diagnostics
> through Centering. The American Statistician, 38(2), 73-77.
>
> * Belsley, D. A. (1984b, May). Demeaning Conditioning Diagnostics
> through Centering: Reply. The American Statistician, 38(2), 90-93.
>
> * Belsley, D. A. (1986). Centering, the constant, first-differencing,
> and assessing conditioning. In E. Kuh & D. A. Belsley (Eds.), Model
> Reliability (pp. 117-153). Cambridge: MIT Press.
>
> * Belsley, D. A. (1987). Collinearity and Least Squares Regression:
> Comment -- Well-Conditioned Collinearity Indices. Statistical Science,
> 2(1), 86-91. Available from http://projecteuclid.org/euclid.ss/1177013441
>
> * Belsley, D. A. (1991a). Conditioning Diagnostics: Collinearity and
> Weak Data in Regression. New York, NY: Wiley.
>
> * Belsley, D. A. (1991b, February). A Guide to using the collinearity
> diagnostics. Computational Economics, 4(1), 33-50. Available from
> http://www.springerlink.com/content/v135h6631x412kk8/
>
> * Cook, R. D. (1984, May). Demeaning Conditioning Diagnostics through
> Centering: Comment. The American Statistician, 38(2), 78-79.
>
> * Echambadi, R., & Hess, J. D. (2007, May-June). Mean-Centering Does
> Not Alleviate Collinearity Problems in Moderated Multiple Regression
> Models. Marketing Science, 26(3), 438-445.
>
> * Golub, G. H., & Van Loan, C. F. (1996). Matrix Computations (3rd
> ed.). Baltimore: Johns Hopkins University Press.
>
> * Gunst, R. F. (1984, May). Comment: Toward a Balanced Assessment of
> Collinearity Diagnostics. The American Statistician, 38(2), 79-82.
>
> * Hill, R. C., & Adkins, L. C. (2001). Collinearity. In B. H. Baltagi
> (Ed.), A Companion to Theoretical Econometrics (pp. 256-278). Oxford:
> Blackwell.
>
> * Marquardt, D. W. (1987). Collinearity and Least Squares Regression:
> Comment. Statistical Science, 2(1), 84-85. Available from
> http://projecteuclid.org/euclid.ss/1177013440
>
> * Montgomery, D. C., & Peck, E. A. (1982). Introduction to Linear
> Regression Analysis. New York, NY: John Wiley.
>
> * Simon, S. D., & Lesage, J. P. (1988, January). The impact of
> collinearity involving the intercept term on the numerical accuracy of
> regression. Computational Economics (formerly Computer Science in
> Economics and Management), 1(2), 137-152.
>
> * Snee, R. D., & Marquardt, D. W. (1984, May). Comment: Collinearity
> Diagnostics Depend on the Domain of Prediction, the Model, and the
> Data. The American Statistician, 38(2), 83-87.
>
> * Weisberg, S. (1980). Applied Linear Regression. New York, NY: John
> Wiley.
>
> * Wood, F. S. (1984, May). Comment: Effect of Centering on
> Collinearity and Interpretation of the Constant. The American
> Statistician, 38(2), 88-90.
>
>
>
> Christina Bogner wrote:
>> Dear list members,
>>
>> I'm working with both nlme and lme4 packages trying to fit linear
>> mixed-models to soil chemical and physical data. I know that for
>> linear models one can calculate the condition number kappa of the
>> model matrix to know whether the problem is well- or ill-conditioned.
>> Does it make any sense to compute kappa on the design matrix of the
>> fixed-effects in nlme or lme4? For comparison, I fitted a simple
>> linear model to my data, and scaling some numerical predictors
>> decreased kappa considerably. So I wonder whether scaling them in the
>> mixed model has any advantages.
>>
>> Thanks a lot for your help.
>>
>> Christina Bogner
>>
>