[R-sig-ME] Mixed-models and condition number

Stephan Kolassa Stephan.Kolassa at gmx.de
Mon Feb 2 21:20:00 CET 2009


Hi Christina,

let me start by saying that I don't know of anyone looking at
conditioning of design matrices in a mixed model environment. Might be a
nice topic to have an M. Sc. student play around with empirically. The 
problem with ill-conditioning in fixed-effects models basically comes 
down to high variances in the parameter estimates, so one could actually 
build a mixed model with an ill-conditioned design matrix and play 
around with small changes to simulated observations, checking whether 
inferences or estimates exhibit "large" variance.
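
Something along these lines, perhaps (a rough sketch only: the data and
effect sizes below are made up, and lmer() from lme4 is just one
convenient way to fit such a model):

library(lme4)

set.seed(1)
n.grp <- 20                            # groups for the random intercept
n.obs <- 10                            # observations per group
grp   <- factor(rep(seq_len(n.grp), each = n.obs))

x1 <- rnorm(n.grp * n.obs)
x2 <- x1 + rnorm(n.grp * n.obs, sd = 0.01)   # nearly collinear with x1
b  <- rnorm(n.grp, sd = 1)                   # random intercepts
y0 <- 1 + 2 * x1 - x2 + b[grp]               # "true" linear predictor

## refit after small perturbations of the observations and see how
## much the fixed-effect estimates move around
fits <- replicate(50, {
  y <- y0 + rnorm(length(y0), sd = 0.1)
  fixef(lmer(y ~ x1 + x2 + (1 | grp)))
})
apply(fits, 1, sd)   # large spread for x1/x2 signals ill-conditioning

If the spread of those estimates is large relative to the perturbations
you put in, the fixed-effects design is probably causing trouble,
mixed model or not.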

If you find out anything about this, would you let me know?

That said, my recent interest has been in collinearity between
predictors, which is not exactly conditioning, but reasonably close to
it. I'd suggest you look at Hill & Adkins (2001) and the collinearity
diagnostics they recommend. Belsley (1991a) wrote an entire monograph
about them, but there are also shorter introductions, e.g., Belsley (1991b).

Scaling the columns of X to equal Euclidean length (usually to length 1)
before diagnosing collinearity appears to be the accepted procedure, so I
think scaling would be a good starting point in the mixed model, too.
However, there is some debate about whether one should first remove the
constant column from X and subtract the column mean from each of the
remaining columns.
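
Just to make the scaling step concrete, here is a minimal sketch in base
R (the toy design is my own invention; kappa() computes the condition
number of the matrix):

set.seed(1)
x1 <- rnorm(100)
x2 <- 1000 * x1 + rnorm(100)       # collinear and on a very different scale
X  <- model.matrix(~ x1 + x2)      # fixed-effects design incl. the constant

## scale each column to unit Euclidean length before diagnosing
Xs <- sweep(X, 2, sqrt(colSums(X^2)), "/")

kappa(X,  exact = TRUE)            # raw condition number
kappa(Xs, exact = TRUE)            # smaller, but still large: the
                                   # collinearity itself remains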

Marquardt (1980) claims that centering removes "nonessential ill
conditioning." Weisberg (1980) and Montgomery and Peck (1982) also 
advocate centering.
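
To illustrate what centering does, here is another toy example of my own,
with predictors that have large means, so that most of the
ill-conditioning comes from near-collinearity with the constant column:

set.seed(1)
x1 <- rnorm(100, mean = 500, sd = 1)   # far from zero, hence nearly a
x2 <- rnorm(100, mean = 300, sd = 1)   # multiple of the constant column
X  <- model.matrix(~ x1 + x2)

## center: drop the constant column, subtract the column means
Xc <- scale(X[, -1], center = TRUE, scale = FALSE)

kappa(X,  exact = TRUE)                # large: driven by the big means
kappa(Xc, exact = TRUE)                # small once the means are removed

Whether that drop in kappa is a genuine improvement or merely hides
information about the constant column is exactly the point of contention.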

Other practitioners maintain that centering removes meaningful
information from X, such as collinearity with the constant column, and 
should not be used (Belsley et al., 1980; Belsley, 1984a, 1984b, 1986, 
1991a, 1991b; Echambadi & Hess, 2007; Hill & Adkins, 2001). For example, 
Simon and Lesage (1988) found that collinearity with the constant
column introduces numerical instability, which is mitigated but not 
prevented by employing collinearity diagnostics after centering X. In 
addition, these problems are not confined to the constant coefficient, 
but extend to all estimates.

For a very lively debate on this topic see Belsley (1984a); Cook (1984);
Gunst (1984); Snee and Marquardt (1984); Wood (1984); Belsley (1984b). 
The consensus seems to be that centering cannot be advised or rejected 
once and for all; rather, whether or not to center the data depends on 
the problem one is facing.

HTH,
Stephan


* Belsley, D. A., Kuh, E., & Welsch, R. E. (1980). Regression 
Diagnostics: Identifying Influential Data and Sources of Collinearity. 
New York, NY: John Wiley & Sons.

* Belsley, D. A. (1984a, May). Demeaning Conditioning Diagnostics 
through Centering. The American Statistician, 38(2), 73-77.

* Belsley, D. A. (1984b, May). Demeaning Conditioning Diagnostics 
through Centering: Reply. The American Statistician, 38(2), 90-93.

* Belsley, D. A. (1986). Centering, the constant, first-differencing, 
and assessing conditioning. In E. Kuh & D. A. Belsley (Eds.), Model 
Reliability (p. 117-153). Cambridge: MIT Press.

* Belsley, D. A. (1987). Collinearity and Least Squares Regression: 
Comment -- Well-Conditioned Collinearity Indices. Statistical Science, 
2(1), 86-91. Available from http://projecteuclid.org/euclid.ss/1177013441

* Belsley, D. A. (1991a). Conditioning Diagnostics: Collinearity and 
Weak Data in Regression. New York, NY: Wiley.

* Belsley, D. A. (1991b, February). A Guide to using the collinearity 
diagnostics. Computational Economics, 4(1), 33-50. Available from 
http://www.springerlink.com/content/v135h6631x412kk8/

* Cook, R. D. (1984, May). Demeaning Conditioning Diagnostics through 
Centering: Comment. The American Statistician, 38(2), 78-79.

* Echambadi, R., & Hess, J. D. (2007, May-June). Mean-Centering Does Not 
Alleviate Collinearity Problems in Moderated Multiple Regression Models. 
Marketing Science, 26(3), 438-445.

* Golub, G. H., & Van Loan, C. F. (1996). Matrix Computations (3rd ed.). 
Baltimore: Johns Hopkins University Press.

* Gunst, R. F. (1984, May). Comment: Toward a Balanced Assessment of 
Collinearity Diagnostics. The American Statistician, 38(2), 79-82.

* Hill, R. C., & Adkins, L. C. (2001). Collinearity. In B. H. Baltagi 
(Ed.), A Companion to Theoretical Econometrics (p. 256-278). Oxford: 
Blackwell.

* Marquardt, D. W. (1987). Collinearity and Least Squares Regression: 
Comment. Statistical Science, 2(1), 84-85. Available from 
http://projecteuclid.org/euclid.ss/1177013440

* Montgomery, D. C., & Peck, E. A. (1982). Introduction to Linear 
Regression Analysis. New York, NY: John Wiley.

* Simon, S. D., & Lesage, J. P. (1988, January). The impact of 
collinearity involving the intercept term on the numerical accuracy of 
regression. Computational Economics (formerly Computer Science in 
Economics and Management), 1(2), 137-152.

* Snee, R. D., & Marquardt, D. W. (1984, May). Comment: Collinearity 
Diagnostics Depend on the Domain of Prediction, the Model, and the Data. 
The American Statistician, 38(2), 83-87.

* Weisberg, S. (1980). Applied Linear Regression. New York, NY: John Wiley.

* Wood, F. S. (1984, May). Comment: Effect of Centering on Collinearity 
and Interpretation of the Constant. The American Statistician, 38(2), 88-90.



Christina Bogner schrieb:
> Dear list members,
> 
> I'm working with both nlme and lme4 packages trying to fit linear 
> mixed-models to soil chemical and physical data. I know that for
> linear models one can calculate the condition number kappa of the
> model matrix to know whether the problem is well- or ill-conditioned.
> Does it make any sense to compute kappa on the design matrix of the
> fixed-effects in nlme or lme4? For comparison I fitted a simple
> linear model to my data and scaling some numerical predictors
> decreased kappa considerably. So I wonder if scaling them in the
> mixed-model has any advantages?
> 
> Thanks a lot for your help.
> 
> Christina Bogner
>



