[Rd] NaN and linear algebra
w.northcott at unsw.edu.au
Thu Mar 24 12:59:16 CET 2005
On 24/03/2005, at 10:04 PM, Andy wrote:
> I just tested this problem on a Windows 2000 Pro P4-Xeon system. I
> this in patched 2.0.1 (2005-2-28) and 2.1.0-alpha (2005-3-23) with both
> generic BLAS and precompiled ATLAS BLAS dll (downloaded from CRAN
> binaries). In all 4 combinations I get
> [,1] [,2] [,3]
> [1,] NaN NaN NaN
> [2,] NaN NaN NaN
> [3,] NaN NaN NaN
>  NaN
I just want to clarify some of the discussion here for my own benefit.
As I understand it, there are two software libraries involved (three if
you count R). BLAS is the lowest layer providing basic matrix
operations. It comes in vanilla and platform optimised (ATLAS or Goto)
flavours. Presumably R's built in version is vanilla.
Relying on BLAS is LAPACK which provides the functions which R actually
calls like dgesv. LAPACK relies on BLAS to provide the platform
optimisations. As far as I can see the issue(s) here is(are) with
LAPACK not BLAS. It is LAPACK that issues the error and halts the
program. (The MacOS X Accelerate Framework contains BLAS and LAPACK as
distinct libraries libBlas.dylib and libLapack.dylib)
If, as reported, the behaviour is fundamentally similar on MacOS X,
Linux and IRIX then it would appear that this is a feature(bug?) of the
LAPACK reference implementation and that whoever wrote the Windows
version saw fit to improve on it. I don't think LAPACK ever pretended
to properly handle NaNs.
It seems to me that the Windows version is an improvement, because it
complies with the robust arithmetic principles that programs should not
halt but propagate NaN or other indicators that can tested as
appropriate in the application. So it might be good idea to press
other implementers of LAPACK to do the same.
My view for what it is worth is that R should also be robust in this
sense and not halt on NaNs. For preference this should be handled in
the LAPACK implementations, but if not R should should trap them. As I
see it, it is incorrect to say a matrix containing NaNs is singular,
not positive definite or whatever, because since it contains NaNs the
computations to make those determinations cannot be performed.
Is this the wrong way to look at it?
PS This discussion has given me another thought. On MacOS X, it would
be much faster to use floats instead of doubles because they can be
directly processed by the Altivec section of the CPU. I can see no way
to have R use floats. Would that be possible as a future enhancement
or is it just too hard?
More information about the R-devel