[R] highest eigenvalues of a matrix

Thu Jun 19 19:05:57 CEST 2008

>>>>> "KateM" == Katharine Mullen <kate at few.vu.nl>
>>>>>     on Thu, 19 Jun 2008 11:50:22 +0200 (CEST) writes:

    KateM> On Thu, 19 Jun 2008, Simon Wood wrote:
    >> >
    >> > I happily use eigen() to compute the eigenvalues and
    >> eigenvectors of > a fairly large matrix (200x200, say),
    >> but it seems over-killed as its > rank is limited to
    >> typically 2 or 3. I sort of remember being taught > that
    >> numerical techniques can find iteratively decreasing
    >> eigenvalues > and corresponding orthogonal eigenvectors,
    >> which would provide a nice > alternative (once I have the
    >> first 3, say, I stop the search).
    >> 
    >> Lanczos iteration will do this efficiently (see
    >> e.g. Golub & van Loan "Matrix Computations"), but I don't
    >> think that there are any such routines built into R or
    >> LAPACK (although I haven't checked the latest LAPACK
    >> release). When I looked it seemed that the LAPACK options
    >> that allow you to select eigen-values/vectors still
    >> depend on an initial O(n^3) decomposition of the matrix,
    >> rather than the O(n^2) that a Lanczos based method would
    >> require.
    >> 

     KateM> ARPACK (http://www.caam.rice.edu/software/ARPACK/) uses a
     KateM> Lanczos method for symmetric matrics; otherwise it uses an
     KateM> Arnoldi iteration.  Development of an R interface to ARPACK
     KateM> would be a nice project (but unfortunately one I don't have
     KateM> time for for a while).  Maybe one of the maintainers of a
     KateM> package for sparse matrices would be interested.

Yes, thank you, Kate, we have been.
The nice thing in ARPACK is its concept of "reverse
communication", such that  we should be able to use it both 
for sparse and for dense matrices.
We would only have to provide a "function" for computing
"A %*% x" and hence could use different ones, depending on the
class of A.... all in theory at least.

One problem I see with the  with that code is that its README file contains

  *** NOTE ***  Unless the LAPACK library on your system is version 2.0,
   we strongly recommend that you install the LAPACK routines provided with
   ARPACK. Note that the current LAPACK release is version 3.0; if you are 
   not sure which version of LAPACK is installed, pleaase compile and link 
   to the subset of LAPACK included with ARPACK. 

i.e. you need to use old/outdated Lapack with ARPACK instead of
the current Lapack .  Something that sheds a bad light to ARPACK
in my view.

In the mean time (I've suffered from a computer crash and other
interruptions) I see Gabor has already mentioned the following

  Then, we have seen that the 'igraph' package als uses a port of 
  (part of) ARPACK as part of the C++ 'igraphlib' library.

    >> My `mgcv' package (see cran) uses Lanczos iteration for
    >> setting up low rank bases for smoothing. The source code
    >> is in mgcv/src/matrix.c:lanczos_spd, but I'm afraid that
    >> there is no direct R interface, although it would not be
    >> too hard to write a suitable wrapper. It requires the
    >> matrix to be symmetric.
    >> 
    >> > Looking at the R source code for eigen and some posts
    >> on this list, > it seems that the function uses a LAPACK
    >> routine, but obviously all > the options are not
    >> available through the R wrapper. I'm not > experienced
    >> enough to try and make my own interface with Fortran >
    >> code, so here are two questions:
    >> >
    >> > - is this option (choosing a desired number of
    >> eigenvectors) already > implemented in some function /
    >> package that I missed?  --- In the symmetric case you can
    >> use `svd' which lets you select (although you'd need to
    >> fix up the signs of the singular values to get
    >> eigen-values if the matrix is not +ve definite). But this
    >> answer is pretty useless as it will be slower than using
    >> `eigen' and getting the full decomposition.
    >> 
    >> Of course if you know that your matrix is low rank
    >> because it's a product of non-square matrices then
    >> there's usually some way of getting at the
    >> eigen-decomposition efficiently. E.g. if A=B'B where B is
    >> 3 by 1000, then the cost can easily be kept down to
    >> O(1000^2) in R...
    >> 
    >> best, Simon
    >> 
    >> > - is the "range of indices" option in DSYEVR.f <
    >> http:// > www.netlib.org/lapack/double/dsyevr.f > what I
    >> think, the indices of > the desired eigenvalues ordered
    >> from the highest to lowest?
    >> >

    KateM> dsyevr.f will work with symmetric real matrices only.  When the range
    KateM> argument of dysevr is set to 'I', arguments il and iu
    KateM> seem to specify the range of eigenvalue indices you
    KateM> want in ascending order (lowest to highest, not
    KateM> highest to lowest).  If you look at
    KateM> https://svn.r-project.org/R/trunk/src/modules/lapack/Lapack.c
    KateM> you see that range is always set to 'A'.

yes, but do you agree that using 'I' (and a range) does not
really help much, since the factorization used is the "full" one
anyway ?

    >> > Many thanks in advance for any piece of advice,

    >> > Sincerely,
    >> >
    >> > Baptiste
    >> >
    >> > dummy example if needed:
    >> >
    >> > test <- matrix(c(1, 2, 0, 4, 5, 6, 1.00001, 2, 0),
    >> ncol=3) > eigen(test)
    >> >
    >> >
    >> >
    >> >
    >> > _____________________________
    >> >
    >> > Baptiste AuguiÃ©
    >> >
    >> > Physics Department > University of Exeter > Stocker
    >> Road, > Exeter, Devon, > EX4 4QL, UK
    >> >
    >> > Phone: +44 1392 264187
    >> >
    >> > http://newton.ex.ac.uk/research/emag >
    >> http://projects.ex.ac.uk/atto
    [............]

    >> --
    >> > Simon Wood, Mathematical Sciences, University of Bath,
    >> Bath, BA2 7AY UK > +44 1225 386603
    >> www.maths.bath.ac.uk/~sw283