[R] Help with Projection Pursuit, ppr().
Prof Brian D Ripley
ripley at stats.ox.ac.uk
Sat Sep 2 08:25:54 CEST 2000
On Fri, 1 Sep 2000 gbrajovic at entelchile.net wrote:
> Recently, I installed the 1.1.0 version of R (for Windows), since it
> includes an implementation of Projection Pursuit (I failed to write
> my own version of PP as a standalone C++ program).
Then you might want to read the R sources to see how it was done.
> As far as I know, R offers two interfaces/syntaxes for the ppr() function.
> The first one requires a regression formula and a data frame. The other
> requires X, a matrix with the explanatory variables, and Y, which I presume
> is a matrix containing the responses (documentation for ppr doesn't explain
> what kind of object is Y).
> My problems are:
> a) I can't use the "non regression formula" ppr(), since it returns an error condition I cannot understand. This is what I did/got:
> > library(modreg)
> > X <- matrix(scan("miX_e.txt", 0), ncol=2, byrow=TRUE)
> Read 450 items
> > Y <- matrix(scan("g1_sr_e.txt", 0), ncol=1, byrow=TRUE)
> Read 225 items
> > g1.ppr <-ppr(X,Y, nterms=3, max.terms=5)
> Error in matrix(smod[q + 6 + p * ml + 1:(q * mu)], q, mu, dimnames = list(ynames, :
> length of dimnames not equal to array extent
You need to start debugging. There's an error in the line
else ynames <- paste("Y", 1:p, sep = "")
which should be
else ynames <- paste("Y", 1:q, sep = "")
but please give your matrices some dimnames (or at least column names) to
get interpretable results. (That this bug has not emerged before
suggests that users almost inevitably do.)
(It's very easy to debug(ppr.default) and find out what the variables
are and hence what is wrong.)
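
A minimal sketch of the matrix interface with column names supplied, as
suggested above. The data are simulated, since the files miX_e.txt and
g1_sr_e.txt are not available here; the names x1, x2 and g1 and the response
function are made up for illustration. In R 1.1.0 ppr() needs library(modreg);
the sketch assumes a current R, where ppr() is in package stats.

```r
# ppr() is in package 'stats' in current R (it was in 'modreg' in R 1.1.0).
set.seed(1)
n <- 225

# Two explanatory variables, with column names (hypothetical names).
X <- matrix(runif(2 * n), ncol = 2, dimnames = list(NULL, c("x1", "x2")))

# An arbitrary stand-in response with some noise added; NOT Hwang's g1.
signal <- sin(2 * pi * X[, "x1"]) + X[, "x2"]^2
Y <- matrix(signal + rnorm(n, sd = 0.1), ncol = 1,
            dimnames = list(NULL, "g1"))

# Same call as in the question, now on matrices that carry column names.
g1.ppr <- ppr(X, Y, nterms = 3, max.terms = 5)

g1.ppr$xnames   # names taken from the columns of X
g1.ppr$ynames   # name taken from the column of Y
summary(g1.ppr)
```

With column names in place, the projection directions and term coefficients
in the summary are labelled by variable rather than by position.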
> Where "miX_e.txt" is a file containing 225 rows (225 cases for training)
> with 2 columns (2 explanatory variables), and "g1_sr_e.txt" is a
> file with one column (1 response) and 225 rows
> (one for each of the 225 training cases).
> I presume the problem is related to Y, since its use is not
> explained in the ppr documentation, and I'm guessing it's a
> matrix with the responses.
It is easy to read R code if things are missing from the help pages.
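
One way to do that from within R, as a rough sketch: fetch the default method
of ppr() and look at its arguments and at the lines that mention ynames. This
assumes a current R, where the method is registered in package stats; the
variable names ppr_default, src and ynames_lines are only for illustration.

```r
# Retrieve the default S3 method of ppr(), even if it is not exported.
ppr_default <- utils::getS3method("ppr", "default")

# The argument names show what the matrix interface expects (x, y, ...).
print(names(formals(ppr_default)))

# Turn the function into text and pick out the lines that handle 'ynames'.
src <- deparse(ppr_default)
ynames_lines <- grep("ynames", src, fixed = TRUE, value = TRUE)
print(ynames_lines)
```

Printing ppr_default itself shows the whole function body.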
> Any help about this? (By the way, the test data is the noiseless g1 function published by Hwang).
Using noiseless functions as test cases is *not* a good idea. That's
not what the optimization algorithms are tuned for, nor is it a
`realistic' test problem.
> b) I can use the "regression formula" ppr() successfully, but I would like a
> non parametric approach (I guess that the need of a regression formula
> makes it "parametric"). How does ppr() use this regression formula?
> Is there a way to avoid the use of the regression formula?
> Is there a "non parametric" regression formula?
That makes no sense to me. The formula just specifies X, as formulae do in
the S/R language. There is nothing more parametric about this than ppr.default.
I suspect you are jumping in too deep, and need to learn how to do simpler
things (like linear regression) in R first. It's a `regression formula'
as distinct from a `Trellis formula' or a `coplot formula'.
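
To illustrate that the formula only specifies X, here is a small sketch that
fits the same simulated data through both interfaces and compares the fitted
values. The data frame dat, its columns and the response function are
assumptions made up for this example, and a current R (ppr() in package stats)
is assumed.

```r
set.seed(2)
n <- 200

# Simulated data: two explanatory variables and a noisy response.
dat <- data.frame(x1 = runif(n), x2 = runif(n))
dat$y <- exp(dat$x1) * dat$x2 + rnorm(n, sd = 0.1)

# Formula interface: the right-hand side just names the columns of X.
fit.formula <- ppr(y ~ x1 + x2, data = dat, nterms = 2)

# Matrix interface: the same X and y passed directly.
fit.matrix <- ppr(as.matrix(dat[, c("x1", "x2")]), dat$y, nterms = 2)

# The two fits should agree; as.vector() drops names and dims first.
same.fit <- all.equal(as.vector(fitted(fit.formula)),
                      as.vector(fitted(fit.matrix)))
print(same.fit)
```

Neither call imposes a parametric form on the ridge functions; the formula is
only a convenient way to pick the variables out of a data frame.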
> Any help will be MOST welcome, since I'm running out of time (I need to perform some evaluations with PP before the end of September).
ppr is an R port of a function in the S MASS library and documented in more
detail with examples in Venables & Ripley (1999) `Modern Applied Statistics
with S-PLUS'. I suggest you consult that.
> Thanks in advance,
> Guillermo Brajovic A.
> gbrajovic at entelchile.net
> Universidad de Santiago,
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch