[Rd] (PR#9623) qr.coef: permutes dimnames; inserts NA; promises

ripley at stats.ox.ac.uk ripley at stats.ox.ac.uk
Tue May 1 16:02:08 CEST 2007



On Thu, 19 Apr 2007, brech at delphioutpost.com wrote:

> Full_Name: Christian Brechbuehler
> Version: 2.4.1 Patched (2007-03-25 r40917)
> OS: Linux 2.6.15-27-adm64-xeon; Ubuntu 6.06.1 LTS
> Submission from: (NULL) (24.61.47.236)
>
>
> Splus and R have different ideas about what qr.coef(qr()) should return,
> which is fine... but I believe that R has a bug in that it is not
> internally consistent, and another separate bug in the documentation.

I agree with the bug in the dimnames, and and have corrected that in 2.5.0 
patched.  But I think the rest stems from the following misunderstanding:

> in math, zero times anything is zero, but in R, NA times
> anything (even zero) is NA.  This seems somewhat inconvenient.

That just is not true. In 'math' it would be true for the real line, but R 
is working with the computer version of the extended real line.
In 'math' 0 * Inf is not zero, and an NA value could be infinite.

Stemming from this, what R is reporting is that certain columns are not 
used in the calculation.  There is a difference between computing a 
coefficient as zero, and excluding that column from the calculation (for 
which R uses NA, because what happens if it were included is unknown).

You are assuming that you can find a linear predictor by multiplying a 
matrix by the coefficients.  That is not true in R (especially for linear 
models), and it is not said to be so in ?qr.coef.


> In particular, on this problem, Splus generates a permuted solution
> vector:
>
> Splus 6.2.1:
>    > A <- matrix(c(0,0,0, 1,1,1), nrow=3,dimnames=list(NULL, c("zero",
> "one")))
>    > A
>         zero one
>    [1,]    0   1
>    [2,]    0   1
>    [3,]    0   1
>    > y <- matrix(c(6,7,8), nrow=3, dimnames=list(NULL, "y"))
>    > y
>         y
>    [1,] 6
>    [2,] 7
>    [3,] 8
>    > x <- qr.coef(qr(A), y)
>    > x
>         y
>     one 7
>    zero 0
>    >
>
> To my taste, the answer from Splus is unexpected; I was looking for
> this:
>
>    > x[qr(A)$pivot,,drop=F]
>         y
>    zero 0
>     one 7
>
> But at least the answer is internally consistent: the values and the
> row names were scrambled in corresponding ways.
>
> However, R seems to helpfully un-permute the data values, but forgets
> to un-permute the dimnames, which seems broken.
>
> R version 2.4.1 Patched (2007-03-25 r40917)
>    > A <- matrix(c(0,0,0, 1,1,1), nrow=3,dimnames=list(NULL, c("zero",
> "one")))
>    > y <- matrix(c(6,7,8), nrow=3, dimnames=list(NULL, "y"))
>    > x <- qr.coef(qr(A), y)
>    > x
>          y
>    one  NA
>    zero  7
>
> I think the answer from qr.coef() should look more like this:
>
>    > x.wanted
>          y
>    zero NA
>    one   7
>    >
>
> In addition, R uses NA to fill in rather than zero.  NA for this
> problem above might mean "undefined": any value will do.  But given
> the application of this function, where the solution might be
> multiplied by the matrix, any NA will cause that to turn into an NA
> result... in math, zero times anything is zero, but in R, NA times
> anything (even zero) is NA.  This seems somewhat inconvenient.  So I'd really
> like
> R to return this:
>
>    > x
>          y
>    zero  0
>    one   7
>
> More on the scrambling; I see in dqrsl.f that:
>
> c               if pivoting was requested in dqrdc, the j-th
> c               component of b will be associated with column jpvt(j)
> c               of the original matrix x that was input into dqrdc.)
>
> which I think is referring to the helpful un-permuting, but I think
> qr.coef() in R needs to correspondingly un-permute the dimnames as
> well?
>
>  Code from qr.coef:
>    if (!is.null(nam <- colnames(qr$qr)))
>        rownames(coef) <- nam
>  Proposed fix: That second line could be changed to
>        rownames(coef)[qr$pivot] <- nam
>
>
> Another issue (either with the documentation or the code) is as follows.
> In R, help(qr) promises:
>
>   'solve.qr' is the method for 'solve' for 'qr' objects. 'qr.solve'
>   solves systems of equations via the QR decomposition: if 'a' is a
>   QR decomposition it is the same as 'solve.qr', but if 'a' is a
>   rectangular matrix the QR decomposition is computed first.  Either
>   will handle over- and under-determined systems, providing a
>   minimal-length solution or a least-squares fit if appropriate.
>
> So unlike Splus, R promises a minimal-length solution, but I don't
> think it delivers that.  For example, let's get the minimal-length
> solution for the following under-determined system:
>
>     x1 + 2*x2  ==  5
>
> qr.solve would fail due to the singular matrix, so use qr.coef:
>
>     > qr.coef(qr(t(1:2)),c(5))
>
>   R:
>     [1]  5 NA
>
>   Splus:
>     [1] 5 0
>
> 1*5 + 2*0 does equal 5, so modulo NA, this is a solution, but it is
> NOT minimal-length.  OTOH, svd gives the right answer in both Splus and
> R:
>
>     > d <- svd(t(1:2));  c(5) %*% d$u %*% diag(1/d$d, length(d$d)) %*% t(d$v)
>          [,1] [,2]
>     [1,]    1    2
>
> This *is* minimum length, and 1*1 + 2*2 == 5.  So either qr.coef(qr())
> should deliver that, or the documentation should clarify when a
> minimal-length solution is delivered?
>
> And revisiting the NA issue... in this problem, for the qr.coef()
> result to be a solution, that 2nd value MUST be zero, no other values
> will work.  So the NA seems wrong here, more clearly than in the other
> example.
>
> Finally, a simpler test case that shows the same issues with a 2x1
> example instead of 2x3; any fix to qr.coef() should handle this as
> well:
>
>> qr.coef(qr(matrix(0:1,1,dimnames=list(NULL, c("zero","one")))),5)
> one zero
>  NA    5
>
> I think that should return:
>
> zero  one
>    0    5
>
> or at least:
>
> zero  one
>   NA    5
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



More information about the R-devel mailing list