[Rd] Inconsistent rank in qr()
Keith O'Hara
keith.ohara at nyu.edu
Wed Jan 24 05:23:01 CET 2018
A simpler fix: return rank=NA when Lapack is used.
Keith
> On Jan 23, 2018, at 5:36 AM, Serguei Sokol <sokol at insa-toulouse.fr> wrote:
>
> Le 23/01/2018 à 08:47, Martin Maechler a écrit :
>>>>>>> Serguei Sokol <sokol at insa-toulouse.fr>
>>>>>>> on Mon, 22 Jan 2018 17:57:47 +0100 writes:
>> > Le 22/01/2018 à 17:40, Keith O'Hara a écrit :
>> >> This behavior is noted in the qr documentation, no?
>> >>
>> >> rank - the rank of x as computed by the decomposition(*): always full rank in the LAPACK case.
>> > For a me a "full rank matrix" is a matrix the rank of which is indeed min(nrow(A), ncol(A))
>> > but here the meaning of "always is full rank" is somewhat confusing. Does it mean
>> > that only full rank matrices must be submitted to qr() when LAPACK=TRUE?
>> > May be there is a jargon where "full rank" is a synonym of min(nrow(A), ncol(A)) for any matrix
>> > but the fix to stick with commonly admitted rank definition (i.e. the number of linearly independent
>> > columns in A) is so easy. Why to discard lapack case from it (even properly documented)?
>>
>> Because 99.5% of caller to qr() never look at '$rank',
>> so why should we compute it every time qr() is called?
> 1. Because R already does it for linpack so it would be consistent to do so for lapack too.
> 2. Because R pretends that it is a part of a returned qr class.
> 3. Because its calculation is a negligible fraction of QR itself.
>
>>
>> ==> Matrix :: rankMatrix() does use "qr" as one of its several methods.
>>
>> --------------
>>
>> As wiser people than me have said (I'm paraphrasing, don't find a nice citation):
>>
>> While the rank of a matrix is a very well defined concept in
>> mathematics (theory), its practical computation on a finite
>> precision computer is much more challenging.
> True. It is indeed depending of round-off errors during QR calculations and tolerance
> setting but putting it just as min(nrow(A), ncol(A)) and still calling it rank of "full rank"
> is by far the most misleading choice to my mind.
>
> Once again, if we are already calculating it for linpack let do it in most consistent
> way for lapack too. I can propose a patch if you will.
>
> Serguei.
>
>>
>> The ?rankMatrix help page (package Matrix, part of your R)
>> https://stat.ethz.ch/R-manual/R-devel/library/Matrix/html/rankMatrix.html
>> starts with the following 'Description'
>>
>> __ Compute ‘the’ matrix rank, a well-defined functional in theory(*), somewhat ambigous in practice. We provide several methods, the default corresponding to Matlab's definition.
>>
>> __ (*) The rank of a n x m matrix A, rk(A) is the maximal number of linearly independent columns (or rows); hence rk(A) <= min(n,m).
>>
>>
>> >>> On Jan 22, 2018, at 11:21 AM, Serguei Sokol <sokol at insa-toulouse.fr> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> I have noticed different rank values calculated by qr() depending on
>> >>> LAPACK parameter. When it is FALSE (default) a true rank is estimated and returned.
>> >>> Unfortunately, when LAPACK is set to TRUE, the min(nrow(A), ncol(A)) is returned
>> >>> which is only occasionally a true rank.
>> >>>
>> >>> Would not it be more consistent to replace the rank in the latter case by something
>> >>> based on the following pseudo code ?
>> >>>
>> >>> d=abs(diag(qr))
>> >>> rank=sum(d >= d[1]*tol)
>> >>>
>> >>> Here, we rely on the fact column pivoting is activated in the called lapack routine (dgeqp3)
>> >>> and diagonal term in qr matrix are put in decreasing order (according to their absolute values).
>> >>>
>> >>> Serguei.
>> >>>
>> >>> How to reproduce:
>> >>>
>> >>> a=diag(2)
>> >>> a[2,2]=0
>> >>> qaf=qr(a, LAPACK=FALSE)
>> >>> qaf$rank # shows 1. OK it's the true rank value
>> >>> qat=qr(a, LAPACK=TRUE)
>> >>> qat$rank #shows 2. Bad, it's not the expected value.
>> >>>
>>
>> > --
>> > Serguei Sokol
>> > Ingenieur de recherche INRA
>>
>> > Cellule mathématique
>> > LISBP, INSA/INRA UMR 792, INSA/CNRS UMR 5504
>> > 135 Avenue de Rangueil
>> > 31077 Toulouse Cedex 04
>>
>> > tel: +33 5 6155 9849
>> > email: sokol at insa-toulouse.fr
>> > http://www.lisbp.fr
[[alternative HTML version deleted]]
More information about the R-devel
mailing list