[R-sig-ME] Mixed-models and condition number
Stephan Kolassa
Stephan.Kolassa at gmx.de
Mon Feb 2 21:20:00 CET 2009
Hi Christina,
let me start by saying that I don't know of anyone looking at
conditioning of design matrices in a mixed model environment. Might be a
nice topic to have an M. Sc. student play around with empirically. The
problem with ill-conditioning in fixed-effects models basically comes
down to high variances in the parameter estimates, so one could actually
build a mixed model with an ill-conditioned design matrix and play
around with small changes to simulated observations, checking whether
inferences or estimates exhibit "large" variance.
If you find out anything about this, would you let me know?
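
As a rough illustration of the experiment described above, here is a minimal R sketch. It is an assumption-laden toy: it uses lm() on a fixed-effects-only model rather than a mixed model, and every name in it (x1, x2_good, x2_bad, slope_sd) is made up for the example. It refits the same model on fresh simulated noise and compares how much the estimate for x1 varies under a well-conditioned and a nearly collinear design.

```r
# Sketch: sampling variability of a slope estimate under a well-conditioned
# and an ill-conditioned design (fixed effects only, simulated data).
set.seed(1)
n  <- 100
x1 <- rnorm(n)
x2_good <- rnorm(n)                   # unrelated to x1
x2_bad  <- x1 + rnorm(n, sd = 0.01)   # nearly collinear with x1

# Refit the model on fresh noise `reps` times; return the standard
# deviation of the estimated x1 coefficient across refits.
slope_sd <- function(x2, reps = 200) {
  est <- replicate(reps, {
    y <- 1 + x1 + x2 + rnorm(n)
    coef(lm(y ~ x1 + x2))[["x1"]]
  })
  sd(est)
}

# 2-norm condition numbers of the two design matrices (constant included)
kappa_good <- kappa(cbind(1, x1, x2_good), exact = TRUE)
kappa_bad  <- kappa(cbind(1, x1, x2_bad),  exact = TRUE)

sd_good <- slope_sd(x2_good)
sd_bad  <- slope_sd(x2_bad)
```

For the mixed-model version one would presumably add a grouping factor and random effects to the simulation and swap lm() for the mixed-model fitting function, then run the same comparison on the fixed-effects estimates.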
That said, my recent interest has been in collinearity between
predictors, which is not exactly conditioning, but reasonably close to
it. I'd recommend you look at Hill & Adkins (2001) and the collinearity
diagnostics they recommend. Belsley (1991a) wrote an entire monograph
about them, but there are also shorter introductions, e.g., Belsley (1991b).
Scaling the columns of X to equal Euclidean length (usually to length 1)
before diagnosing collinearity appears to be accepted procedure, so I
think scaling would be a good starting point in the mixed model, too.
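
The column scaling just described could look roughly like this in R. This is only a sketch on simulated data: the predictors ph and depth_mm and the helper functions unit_length and cond_number are hypothetical names, not something from the references. The condition number is taken as the ratio of the largest to the smallest singular value, and the condition indexes as the largest singular value divided by each singular value in turn.

```r
# Sketch: scale each column of a design matrix to unit Euclidean length,
# then compute the condition number from the singular values.
set.seed(2)
n <- 50
# Simulated predictors on very different scales (names are made up)
d <- data.frame(ph       = rnorm(n, mean = 5,   sd = 0.5),
                depth_mm = rnorm(n, mean = 500, sd = 150))
X <- model.matrix(~ ph + depth_mm, data = d)   # includes the constant column

# Divide every column by its Euclidean length
unit_length <- function(X) sweep(X, 2, sqrt(colSums(X^2)), "/")
# Ratio of largest to smallest singular value
cond_number <- function(X) { s <- svd(X)$d; max(s) / min(s) }

Xs <- unit_length(X)
kappa_raw    <- cond_number(X)
kappa_scaled <- cond_number(Xs)

# One condition index per singular value; the largest equals kappa_scaled
sv <- svd(Xs)$d
cond_index <- max(sv) / sv
```

For a fitted mixed model, the same two helpers would be applied to the fixed-effects design matrix extracted from the fit instead of to X built by hand.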
However, there is a discussion as to whether to first remove the
constant column from X and subtract the column mean from each of the
remaining columns.
Marquardt (1980) claims that centering removes "nonessential ill
conditioning." Weisberg (1980) and Montgomery and Peck (1982) also
advocate centering.
Other practitioners maintain that centering removes meaningful
information from X, such as collinearity with the constant column, and
should not be used (Belsley et al., 1980; Belsley, 1984a, 1984b, 1986,
1991a, 1991b; Echambadi & Hess, 2007; Hill & Adkins, 2001). For example,
Simon and Lesage (1988) found that collinearity with the constant
column introduces numerical instability, which is mitigated but not
prevented by employing collinearity diagnostics after centering X. In
addition, these problems are not confined to the constant coefficient,
but extend to all estimates.
For a very lively debate on this topic see Belsley (1984a); Cook (1984);
Gunst (1984); Snee and Marquardt (1984); Wood (1984); Belsley (1984b).
The consensus seems to be that centering cannot be advised or rejected
once and for all; rather, whether or not to center data depends on
the problem one is facing.
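
To make the two positions concrete, here is a small R sketch on simulated data. The predictors x and z and the helpers unit_length and cond_number are hypothetical names chosen for the example. The predictor x has little spread around a large mean, so it is nearly a multiple of the constant column. Diagnosing the uncentered, scaled matrix should show that dependency as a large condition number, while dropping the constant and centering first should give a condition number near 1.

```r
# Sketch: the same predictors diagnosed with and without centering.
set.seed(3)
n <- 50
x <- rnorm(n, mean = 100, sd = 1)   # little spread around a large mean
z <- rnorm(n)                       # a second, unrelated predictor

# Divide every column by its Euclidean length
unit_length <- function(X) sweep(X, 2, sqrt(colSums(X^2)), "/")
# Ratio of largest to smallest singular value
cond_number <- function(X) { s <- svd(X)$d; max(s) / min(s) }

# Uncentered: keep the constant column, scale all columns to unit length.
X_unc <- unit_length(cbind(const = 1, x = x, z = z))
# Centered: drop the constant, subtract column means, then scale.
X_cen <- unit_length(scale(cbind(x = x, z = z), center = TRUE, scale = FALSE))

kappa_uncentered <- cond_number(X_unc)   # reflects collinearity with the constant
kappa_centered   <- cond_number(X_cen)   # that dependency is no longer visible
```

Which of the two numbers is the relevant one is exactly what the debate is about: it depends on whether the constant and the origin of x are meaningful in the problem at hand.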
HTH,
Stephan
* Belsley, D. A., Kuh, E., & Welsch, R. E. (1980). Regression
Diagnostics: Identifying Influential Data and Sources of Collinearity.
New York, NY: John Wiley & Sons.
* Belsley, D. A. (1984a, May). Demeaning Conditioning Diagnostics
through Centering. The American Statistician, 38(2), 73-77.
* Belsley, D. A. (1984b, May). Demeaning Conditioning Diagnostics
through Centering: Reply. The American Statistician, 38(2), 90-93.
* Belsley, D. A. (1986). Centering, the constant, first-differencing,
and assessing conditioning. In E. Kuh & D. A. Belsley (Eds.), Model
Reliability (pp. 117-153). Cambridge: MIT Press.
* Belsley, D. A. (1987). Collinearity and Least Squares Regression:
Comment -- Well-Conditioned Collinearity Indices. Statistical Science,
2(1), 86-91. Available from http://projecteuclid.org/euclid.ss/1177013441
* Belsley, D. A. (1991a). Conditioning Diagnostics: Collinearity and
Weak Data in Regression. New York, NY: Wiley.
* Belsley, D. A. (1991b, February). A Guide to using the collinearity
diagnostics. Computational Economics, 4(1), 33-50. Available from
http://www.springerlink.com/content/v135h6631x412kk8/
* Cook, R. D. (1984, May). Demeaning Conditioning Diagnostics through
Centering: Comment. The American Statistician, 38(2), 78-79.
* Echambadi, R., & Hess, J. D. (2007, May-June). Mean-Centering Does Not
Alleviate Collinearity Problems in Moderated Multiple Regression Models.
Marketing Science, 26(3), 438-445.
* Golub, G. H., & Van Loan, C. F. (1996). Matrix Computations (3rd ed.).
Baltimore: Johns Hopkins University Press.
* Gunst, R. F. (1984, May). Comment: Toward a Balanced Assessment of
Collinearity Diagnostics. The American Statistician, 38(2), 79-82.
* Hill, R. C., & Adkins, L. C. (2001). Collinearity. In B. H. Baltagi
(Ed.), A Companion to Theoretical Econometrics (pp. 256-278). Oxford:
Blackwell.
* Marquardt, D. W. (1987). Collinearity and Least Squares Regression:
Comment. Statistical Science, 2(1), 84-85. Available from
http://projecteuclid.org/euclid.ss/1177013440
* Montgomery, D. C., & Peck, E. A. (1982). Introduction to Linear
Regression Analysis. New York, NY: John Wiley.
* Simon, S. D., & Lesage, J. P. (1988, January). The impact of
collinearity involving the intercept term on the numerical accuracy of
regression. Computational Economics (formerly Computer Science in
Economics and Management), 1(2), 137-152.
* Snee, R. D., & Marquardt, D. W. (1984, May). Comment: Collinearity
Diagnostics Depend on the Domain of Prediction, the Model, and the Data.
The American Statistician, 38(2), 83-87.
* Weisberg, S. (1980). Applied Linear Regression. New York, NY: John Wiley.
* Wood, F. S. (1984, May). Comment: Effect of Centering on Collinearity
and Interpretation of the Constant. The American Statistician, 38(2), 88-90.
Christina Bogner wrote:
> Dear list members,
>
> I'm working with both nlme and lme4 packages trying to fit linear
> mixed-models to soil chemical and physical data. I know that for
> linear models one can calculate the condition number kappa of the
> model matrix to know whether the problem is well- or ill-conditioned.
> Does it make any sense to compute kappa on the design matrix of the
> fixed-effects in nlme or lme4? For comparison I fitted a simple
> linear model to my data and scaling some numerical predictors
> decreased kappa considerably. So I wonder if scaling them in the
> mixed-model has any advantages?
>
> Thanks a lot for your help.
>
> Christina Bogner
>