[Rd] NaN and linear algebra

Bill Northcott w.northcott at unsw.edu.au
Thu Mar 24 12:59:16 CET 2005


On 24/03/2005, at 10:04 PM, Andy wrote:
> I just tested this problem on a Windows 2000 Pro P4-Xeon system.  I 
> tried
> this in patched 2.0.1 (2005-2-28) and 2.1.0-alpha (2005-3-23) with both
> generic BLAS and precompiled ATLAS BLAS dll (downloaded from CRAN 
> windows
> binaries).  In all 4 combinations I get
>
>> d<-matrix(NaN,3,3)
>> f<-solve(d)
>> f
>      [,1] [,2] [,3]
> [1,]  NaN  NaN  NaN
> [2,]  NaN  NaN  NaN
> [3,]  NaN  NaN  NaN
>> det(d)
> [1] NaN
>>
>>
I just want to clarify some of the discussion here for my own benefit.

As I understand it, there are two software libraries involved (three if 
you count R).  BLAS is the lowest layer providing basic matrix 
operations.  It comes in vanilla and platform optimised (ATLAS or Goto) 
flavours.  Presumably R's built in version is vanilla.

Relying on BLAS is LAPACK which provides the functions which R actually 
calls like dgesv.  LAPACK relies on BLAS to provide the platform 
optimisations.  As far as I can see the issue(s) here is(are) with 
LAPACK not BLAS.  It is LAPACK that issues the error and halts the 
program.  (The MacOS X Accelerate Framework contains BLAS and LAPACK as 
distinct libraries libBlas.dylib and libLapack.dylib)

If, as reported, the behaviour is fundamentally similar on MacOS X, 
Linux and IRIX then it would appear that this is a feature(bug?) of the 
LAPACK reference implementation and that whoever wrote the Windows 
version saw fit to improve on it.  I don't think LAPACK ever pretended 
to properly handle NaNs.

It seems to me that the Windows version is an improvement, because it 
complies with the robust arithmetic principles that programs should not 
halt but propagate NaN or other indicators that can tested as 
appropriate in the application.  So it might be good idea to press 
other implementers of LAPACK to do the same.

My view for what it is worth is that R should also be robust in this 
sense and not halt on NaNs.  For preference this should be handled in 
the LAPACK implementations, but if not R should should trap them.  As I 
see it, it is incorrect to say a matrix containing NaNs is singular, 
not positive definite or whatever, because since it contains NaNs the 
computations to make those determinations cannot be performed.

Is this the wrong way to look at it?

Bill Northcott

PS  This discussion has given me another thought.  On MacOS X, it would 
be much faster to use floats instead of doubles because they can be 
directly processed by the Altivec section of the CPU.  I can see no way 
to have R use floats.  Would that be possible as a future enhancement 
or is it just too hard?



More information about the R-devel mailing list