[R] locfit: max number of predictors = 6? How interpolate in 5-10D?

Keith Jewell k.jewell at campden.co.uk
Fri Feb 26 15:54:32 CET 2010


Thanks for that suggestion

I've investigated a little more using...
y <- rowSums(x) + runif(n)
... just so I had some correlation to play with.

The error I get when it fails is "Invalid what in exvval", which I don't 
understand either!
With n=5e3 it worked with 6 variables but not with 7.

I wasn't sure the error was caused by number of variables rather than 
something else, so I tried with...
n <- 100

I also tried locfit rather than locfit.raw using...
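library(locfit)
# ten independent uniform predictors x1..x10
xd <- lapply(1:10, function(i) runif(n))
xd <- as.data.frame(xd)
names(xd) <- paste("x", 1:10, sep="")
# response is the sum of all ten predictors
xd$y <- rowSums(xd)
# local regression on the first 6 predictors (1:7 instead of 1:6 fails)
aF <- formula(paste("y ~ lp(", paste(names(xd)[1:6], collapse=","), ")"))
locfit(aF, xd)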

Both of these gave the same results, success with 6 variables but not with 
7.
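
The manual check above can be automated. This is a minimal sketch, assuming only that the fitting function signals an ordinary R error when it is given too many predictors. `probe_predictors` and `locfit_fitter` are hypothetical helper names for illustration, not part of locfit.

```r
# Try a fitting function with the first 1, 2, ..., length(vars) predictors
# and record which attempts complete without an error.
# `fitter` is any function(vars, data) that fits a model using `vars`.
probe_predictors <- function(fitter, data, vars) {
  ok <- vapply(seq_along(vars), function(d) {
    # tryCatch turns an error into FALSE so the remaining sizes are still tried
    tryCatch({ fitter(vars[seq_len(d)], data); TRUE },
             error = function(e) FALSE)
  }, logical(1))
  names(ok) <- seq_along(vars)
  ok
}

# Same kind of test data as above: ten uniform predictors, y is their sum
set.seed(1)
n <- 100
xd <- as.data.frame(matrix(runif(n * 10), nrow = n))
names(xd) <- paste("x", 1:10, sep = "")
xd$y <- rowSums(xd[, 1:10])

# Wrapper that builds the y ~ lp(x1,...,xd) formula (needs the locfit package)
locfit_fitter <- function(vars, data) {
  f <- as.formula(paste("y ~ lp(", paste(vars, collapse = ","), ")"))
  locfit::locfit(f, data = data)
}

# Uncomment to run; requires locfit to be installed:
# probe_predictors(locfit_fitter, xd, paste("x", 1:10, sep = ""))
```

The result is a named logical vector, so the largest index that is TRUE is the largest number of predictors that fitted on this data. It only shows what happens with default settings on this data set and does not prove a hard limit.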

IT APPEARS, the maximum number of predictors is 6, but I don't know locfit 
well, and it may be that other settings would allow more variables.
CAN anyone give a more DEFINITIVE ANSWER?

My data sets currently reach 5 predictors, and I expect this to 
increase.
In S-Plus (v6.2.1) I used loess in which "Locally quadratic models may have 
at most 4 predictor variables; locally linear models may have at most 15". 
In R stats::loess allows only "one to four numeric predictors".
I'd assumed (foolishly) that because locfit didn't mention limits, the only 
limits were practical (memory, time,...) - it seems not :-(
I guess I could write something myself, I only need rough interpolation, 
even "straight line" interpolation between nearest neighbours would be OK. 
But at first glance it seems non-trivial with a substantial non-fixed number 
of dimensions (nnclust::nnfind to identify neighbours??), and I don't want 
to re-invent wheels.
Can anyone suggest an ALTERNATIVE route for INTERPOLATION in 5-10 
DIMENSIONS?
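
One rough possibility along the "nearest neighbours" line is inverse-distance weighting over the k nearest points. The sketch below uses base R only and works in any number of dimensions. It is not locfit's or loess's method. It finds neighbours by brute force instead of nnclust::nnfind, and `knn_idw`, `k` and `power` are hypothetical names chosen for illustration. It assumes the predictors are on comparable scales (rescale first if not).

```r
# Inverse-distance-weighted interpolation from the k nearest neighbours.
# x:    n x d numeric matrix of known points
# y:    length-n vector of known values
# xnew: m x d numeric matrix (or a single length-d vector) of query points
knn_idw <- function(x, y, xnew, k = 2 * ncol(x), power = 2) {
  x <- as.matrix(x)
  xnew <- matrix(as.matrix(xnew), ncol = ncol(x))
  k <- min(k, nrow(x))
  apply(xnew, 1, function(p) {
    # squared Euclidean distance from p to every known point
    d2 <- colSums((t(x) - p)^2)
    nn <- order(d2)[seq_len(k)]
    # query coincides with a known point: return its value exactly
    if (d2[nn[1]] == 0) return(y[nn[1]])
    # weights proportional to 1 / distance^power
    w <- 1 / d2[nn]^(power / 2)
    sum(w * y[nn]) / sum(w)
  })
}

# Example in 7 dimensions, with the same kind of test data as above
set.seed(1)
n <- 5000; d <- 7
x <- matrix(runif(n * d), nrow = n)
y <- rowSums(x)
xnew <- matrix(runif(5 * d), nrow = 5)
cbind(truth = rowSums(xnew), interpolated = knn_idw(x, y, xnew))
```

Each distance calculation costs O(n d) per query point, which is fine for a few thousand points but would want a proper neighbour search for large data. The result is a weighted average of observed values, so it can never extrapolate beyond their range.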

Best...
(apologies for capitals, not shouting, just highlighting key points for 
those skimming quickly)

Keith Jewell

"Liaw, Andy" <andy_liaw at merck.com> wrote in message 
news:B10BAA7D28D88B45AF82813C4A6FFA934CE647 at usctmx1157.merck.com...
> Well, I should think there's an obvious (if not elegant) way to test it:
>
> n <- 5e3
> m <- 20
> x <- matrix(runif(n * m), nrow=n)
> y <- rnorm(n)
>
> require(locfit)
> fit <- locfit.raw(x[, 1:10], y)
>
> The code above took a while on my laptop, and ended up giving some error
> I don't understand.  Not sure if the error was caused by insufficient
> sample size, or some inherent limitation.  At least it didn't choke on
> five variables.  However, if all 20 columns of x are used, locfit.raw()
> will choke because it can't compute the dimension of some variable that
> it needs to allocate memory for.
>
> I had vague recollection of reading that "5" is the limit somewhere.
> Unfortunately my copy of Local Regression and Likelihood has been MIA
> for a few years, so I can't check there.  In any case it doesn't seem
> like the number of data points and/or computing power are the bigger issue.
>
> Andy
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org
>> [mailto:r-help-bounces at r-project.org] On Behalf Of Keith Jewell
>> Sent: Thursday, February 25, 2010 4:11 AM
>> To: r-help at stat.math.ethz.ch
>> Subject: [R] locfit: max number of predictors?
>>
>> Hi All,
>>
>> In another thread Andy Liaw, who CRAN lists as locfit
>> maintainer, said:
>> <quote>
>> From: "Liaw, Andy" <andy_liaw at merck.com>
>> To: "Guy Green" <guygreen at netvigator.com>; <r-help at r-project.org>
>> Subject: Re: Alternatives to linear regression with multiple variables
>> Date: 22 February 2010 17:50
>>
>> You can try the locfit package, which I believe can handle up to 5
>> variables.  E.g.,
>> </quote>
>>
>> Looking in the locfit documentation (e.g.
>> http://www.stats.bris.ac.uk/R/web/packages/locfit/locfit.pdf) I can't see an
>> upper limit on the number of predictors; if it is 5 I'm getting close in one
>> of my applications.
>>
>> Can anyone confirm or deny the existence of a 'crisp' upper limit on the
>> number of predictors in locfit?
>>
>> If it is 5, or thereabouts, can anyone suggest an alternative which can
>> handle a few more? (I'm using it for multidimensional interpolation).
>>
>> Best regards,
>>
>> Keith Jewell
>>