[R] non-ideal behavior in princomp

Thu Sep 28 14:16:24 CEST 2000

> From: "Ritter, Christian C SRTCL-CTGAS" <Christian.C.Ritter at OPC.shell.com>
> Date: Thu, 28 Sep 2000 12:59:25 +0200
> 
> This problem is not limited to R, but R is one of the packages in which it
> arises.

I'm sorry, but I must be missing something here. I think R (and S)
already do what you want here.

> princomp is a nice function which creates an object for which inspection
> methods have been written.

But there is also prcomp.  (What are `inspection methods'?  Do you
mean methods for what Bill Venables sometimes calls accessor functions?)
princomp is written for more flexibility, for example to use robust
methods of extracting PCs.

> Unfortunately, princomp does not admit cases in which the x matrix is wider
> than high (i.e. more variables than observations). Such cases are typical

Really?

> in spectroscopy and related disciplines. It would be nice if the following
> two features were added to princomp:
> 
> 1. as a default behavior, princomp should digest wider than high matrices
> (it should just compute the nontrivial principal components).

prcomp does exactly that, although there are problems for which the
trivial PCs are the interesting ones!

What problems are you finding with princomp?  If I give it an n x p matrix
with n < p, I get (correctly) p PCs, with (effectively) 0 standard
deviations for the last p-n.  It's possible that this does not always
work for numerical reasons, but can you describe the `problem' you
feel it has?  (Perhaps the criterion for non-negative-definiteness is
too stringent?)
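
A minimal sketch of the point about trivial PCs, on simulated data. It
works on the eigenvalues of cov(x) directly rather than calling princomp,
because how princomp treats n < p may differ between versions. The
names n, p, x and the 1e-10 tolerance are assumptions made for this
illustration only.

```r
set.seed(1)
n <- 5; p <- 8                       # fewer observations than variables
x <- matrix(rnorm(n * p), nrow = n, ncol = p)

# The p x p sample covariance matrix has p eigenvalues, one per PC.
ev <- eigen(cov(x), symmetric = TRUE)$values

# Standard deviations of the PCs; tiny negative eigenvalues caused by
# rounding are clamped to zero before taking the square root.
sdev <- sqrt(pmax(ev, 0))

# Count the components whose variance is not negligible relative to
# the largest one (the tolerance is an arbitrary choice for this sketch).
n_nontrivial <- sum(ev > 1e-10 * ev[1])

print(signif(sdev, 3))
print(n_nontrivial)
```

Because the columns are centred, at most n - 1 of the p variances can be
nonzero; the remaining ones differ from zero only through rounding.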

> 2. as an optional behavior, princomp should only calculate (and return) the
> first A principal components. This is useful, if x is very wide and very
> short. 

Well, only min(n, p) PCs are defined, so I presume you want A < min(n, p)?
Then you just extract the first A PCs.  If A << min(n, p) there are faster
algorithms, but they are not currently implemented in R, and I can
think of very few problems where the speed difference would be noticeable.
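
A short sketch of extracting the first A PCs after the fact, assuming the
rotation, x and sdev components that prcomp returns. The data are
simulated, and A, loadings_A, scores_A and sdev_A are names chosen for
this illustration.

```r
set.seed(2)
n <- 6; p <- 10; A <- 2              # A < min(n, p)
x <- matrix(rnorm(n * p), nrow = n, ncol = p)

pc <- prcomp(x)                      # computes all available PCs

# Keep only the first A components; drop = FALSE keeps matrix shape
# even when A is 1.
loadings_A <- pc$rotation[, 1:A, drop = FALSE]   # p x A
scores_A   <- pc$x[, 1:A, drop = FALSE]          # n x A
sdev_A     <- pc$sdev[1:A]

# The retained scores are the centred data projected on the loadings.
scores_check <- scale(x, center = TRUE, scale = FALSE) %*% loadings_A
```

This computes every component and discards the rest, so it saves storage
but not computing time.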

> Some might remark that all this can be done easily by svd or by programming
> the NIPALS algorithm, but I would prefer to see it in the common version. 

It is, in prcomp which uses the svd.
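
For comparison, a sketch of the same calculation done by hand with svd,
assuming centred but unscaled columns and the divisor n - 1 for the
variances. The data are simulated, and xc, s, sdev_svd and scores_svd
are names chosen for this illustration, not part of prcomp.

```r
set.seed(3)
n <- 5; p <- 8
x <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Centre each column, then take the SVD of the centred matrix.
xc <- scale(x, center = TRUE, scale = FALSE)
s  <- svd(xc)                        # s$d has min(n, p) singular values

# Singular values rescaled to standard deviations (divisor n - 1).
sdev_svd <- s$d / sqrt(n - 1)

# The columns of s$v are the loadings; the scores are the projections
# of the centred data onto them.
scores_svd <- xc %*% s$v

# Compare with prcomp; individual components may differ in sign.
pc <- prcomp(x)
print(all.equal(sdev_svd, pc$sdev))
```

No p x p covariance matrix is formed here, which is what makes the svd
route convenient when p is much larger than n.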

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
