[R] lm handling of ill-conditioned systems

john.maindonald@anu.edu.au john.maindonald at anu.edu.au
Tue Mar 21 00:08:16 CET 2000

At 09:15 AM 20/03/00 -0800, Thomas Lumley wrote:
>On 20 Mar 2000, Tomislav Goles wrote:

>> The lm() function in R seems to handle the inversion of singular X'X
>> matrices (where there is collinearity between regression inputs) by
>> dropping one of the inputs, and this also seems to be the default
>> behavior in SAS (please let me know if I'm wrong about this).
>> Some other packages (e.g. the octave ols() function) instead compute
>> the pseudo-inverse, where singular values less than some small
>> threshold are excluded from the computation of the inverse but the
>> data columns themselves are not dropped.
>> I am wondering if a few experts could comment on what is a better
>> (best?) way to handle singular matrices in this context, or perhaps
>> point me to some literature on this.
>When the design matrix is not of full rank there is no information in the
>data about certain linear combinations of the variables.   This means that
>there is no unique least squares or maximum likelihood estimate for the
>parameters. This isn't a numerical analysis issue, it's a statistical one.
>While it is possible to pick out a single estimate using some arbitrary
>criterion, it is not possible to construct a useful covariance matrix for
>these parameters. 
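
A small sketch of the point above, using made-up toy data (x1, x2, x3, y
and the helper rss() are illustrative names, not anything from the
original question).  With x3 = x1 + x2 the design matrix has rank 3, and
moving along the null direction changes the coefficients without
changing the residual sum of squares:

```r
# Hypothetical toy data: x3 is an exact linear combination of x1 and x2.
set.seed(1)
n  <- 20
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2
y  <- 1 + 2 * x1 - x2 + rnorm(n, sd = 0.1)
X  <- cbind(1, x1, x2, x3)

# Numerical rank of the design matrix: one less than the number of columns.
design_rank <- qr(X)$rank
print(design_rank)

# Residual sum of squares for a given coefficient vector.
rss <- function(b) sum((y - X %*% b)^2)

b0        <- c(1, 2, -1, 0)   # one candidate coefficient vector
null_dir  <- c(0, 1, 1, -1)   # X %*% null_dir is zero because x3 = x1 + x2

# Very different coefficient vectors, same fit to the data.
rss_values <- sapply(c(-5, 0, 5), function(t) rss(b0 + t * null_dir))
print(rss_values)
```

The data cannot distinguish between b0 and b0 + t * null_dir for any t,
which is the sense in which there is no unique estimate.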

It is worth re-iterating that the generalised inverse form of solution
is an arbitrary resolution of the lack of uniqueness.
If a Moore-Penrose generalised inverse is used, then the constraint
that is applied is that the sum of squares of the coefficients should
be as small as possible.  It would be much preferable to call this the
minimum length solution.  There are other ways to get this solution
than by forming a Moore-Penrose inverse.
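
As a rough illustration, on made-up toy data (all variable names and the
singular-value tolerance below are my own assumptions), the following
forms a Moore-Penrose inverse from the SVD and compares the result with
the solution lm() gives when it drops a column.  It also shows one of
the other routes: start from any least squares solution and subtract its
component along the null direction.

```r
# Hypothetical toy data: x3 is an exact linear combination of x1 and x2.
set.seed(1)
n  <- 20
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2
y  <- 1 + 2 * x1 - x2 + rnorm(n, sd = 0.1)
X  <- cbind(Intercept = 1, x1 = x1, x2 = x2, x3 = x3)

# Moore-Penrose inverse via the SVD, discarding negligible singular values.
# The tolerance is an assumption chosen for this sketch.
s      <- svd(X)
keep   <- s$d > sqrt(.Machine$double.eps) * s$d[1]
X_pinv <- s$v[, keep, drop = FALSE] %*%
          (t(s$u[, keep, drop = FALSE]) / s$d[keep])
b_min  <- drop(X_pinv %*% y)          # minimum length solution

# lm() drops the aliased column; treat its NA coefficient as zero.
fit  <- lm(y ~ x1 + x2 + x3)
b_lm <- coef(fit)
b_lm[is.na(b_lm)] <- 0

# Another route to the minimum length solution: remove from b_lm its
# component along the null direction (X %*% null_dir is zero here).
null_dir <- c(0, 1, 1, -1)
b_proj   <- b_lm - null_dir * sum(null_dir * b_lm) / sum(null_dir^2)

print(rbind(lm = b_lm, min_length = b_min, projected = b_proj))
print(c(lm = sum(b_lm^2), min_length = sum(b_min^2)))  # squared lengths
```

Both coefficient vectors give the same fitted values; they differ only
in which arbitrary constraint picks one solution out of the family.
MASS::ginv() is a packaged way to obtain a generalised inverse if you
prefer not to write out the SVD step.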

Note also that there are important components of the calculation that
are uniquely defined.  There is a unique set of predicted values,
residuals, and residual sum of squares.
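
A quick check of this on made-up toy data (the variable names are
illustrative only): fitting the same collinear model with the terms in a
different order makes lm() drop a different column and report different
coefficients, yet the fitted values, residuals and residual sum of
squares agree.

```r
# Hypothetical toy data: x3 is an exact linear combination of x1 and x2.
set.seed(1)
n  <- 20
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2
y  <- 1 + 2 * x1 - x2 + rnorm(n, sd = 0.1)

fit_a <- lm(y ~ x1 + x2 + x3)   # x3 comes last and is reported as NA
fit_b <- lm(y ~ x3 + x1 + x2)   # x2 comes last and is reported as NA

# The coefficients differ between the two parameterisations ...
print(coef(fit_a))
print(coef(fit_b))

# ... but the uniquely defined quantities do not.
same_fitted <- isTRUE(all.equal(fitted(fit_a), fitted(fit_b)))
same_resid  <- isTRUE(all.equal(resid(fit_a), resid(fit_b)))
same_rss    <- isTRUE(all.equal(deviance(fit_a), deviance(fit_b)))
print(c(fitted = same_fitted, residuals = same_resid, rss = same_rss))
```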

John Maindonald               email : john.maindonald at anu.edu.au        
Statistical Consulting Unit,  phone : (6249)3998        
c/o CMA, SMS,                 fax   : (6249)5549  
John Dedman Mathematical Sciences Building
Australian National University
Canberra ACT 0200

r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
