# [R] qr and Moore-Penrose

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Jun 30 14:15:08 CEST 1999

```> Date: Wed, 30 Jun 1999 11:12:24 +0200 (MET DST)
> From: Torsten Hothorn <hothorn at amadeus.statistik.uni-dortmund.de>
>
> yesterday I had a little shock using qr (or lm). having a matrix
>
> X <- cbind(1,diag(3))
> y <- 1:3
>
> the qr.coef returns one NA (because X is singular). So I computed the
> Moore-Penrose inverse of X (just from the definition) and I get a correct
> result. Whats wrong about qr in this situation?

What is a correct result, by the way?  There are infinitely many solutions
for the regression of y on X, and the Moore-Penrose one is just one choice
(that assumes that the coefficients are somehow comparable).

> X <- cbind(1, diag(3));	# singular matrix
> y <- 1:3
> Xp <- qr(X);
> b1 <- qr.coef(Xp, y);	# contains NA
> b2 <- mpinv(X)%*%y	# least square fit using Moore-Penrose
> X%*%b2			# == y

Those `;' are unnecessary: either CR or ; separates expressions in
S-like languages.

I find
> drop(b2)
[1]  1.5 -0.5  0.5  1.5
> b1
[1]  3 -2 -1 NA
> lm(y ~ X + 0)

Call:
lm(formula = y ~ X + 0)

Coefficients:
X1  X2  X3  X4
3  -2  -1  NA
> lm(y ~ X)

Call:
lm(formula = y ~ X)

Coefficients:
(Intercept)           X1           X2           X3           X4
2.5           NA         -1.5           NA           NA
[sic]

whereas S gives

> qr.coef(qr(X), y)
[1]  3 -2 -1  0
> lm(y ~ X, singular.ok=T)
Call:
lm(formula = y ~ X, singular.ok = T)

Coefficients: (2 not defined because of singularities)
(Intercept) X2 X3
3 -2 -1

Now, I can see the problem with lm under R, but what is wrong with qr.coef?

#--------

In R:

> qr(cbind(1,1,diag(3)))
\$qr
[,1]       [,2]      [,3]       [,4]       [,5]
[1,] -1.7320508 -0.5773503 -1.732051 -0.5773503 -0.5773503
[2,]  0.5773503  0.8164966  0.000000 -0.4082483 -0.4082483
[3,]  0.5773503  0.7071068  0.000000 -0.7071068  0.7071068

\$rank
[1] 2

\$qraux
[1] 1.5773503 1.7071068 0.0000000 0.7071068 0.7071068

\$pivot
[1] 1 3 2 4 5

It seems that there is a problem here (and S gets this right). I think
the changes to the Linpack pivoting strategy in dqrdc2 fail in this
example.

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

```