[R] loess crash
Rafael A. Irizarry
ririzarr at jhsph.edu
Mon Sep 16 16:17:23 CEST 2002
I would suggest looking at the package mgcv.
You can fit generalized additive models, which are useful for what
you describe below.
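
As a rough sketch of that suggestion (not a tested recipe for your real data), here is an additive fit on simulated data modeled on the example quoted further down. The seed, the object names fit, newdata, pred, ame and h, the all-0.5 prediction point, and the finite-difference step are assumptions added for illustration; gam(), s() and predict() are the standard mgcv calls.

```r
# Sketch only: simulated data modeled on the example quoted later in this thread.
library(mgcv)  # provides gam() and s()

set.seed(1)  # assumption: seed added so the sketch is reproducible
n <- 500
data1 <- matrix(runif(n * 5), n, 5)
colnames(data1) <- c("x1", "x2", "x3", "x4", "x5")
data2 <- as.data.frame(data1)
data2$y <- 3 + 2 * data2$x1 + 15 * data2$x2 + 13 * data2$x3 -
  8 * data2$x4 + 14 * data2$x5 + rnorm(n)

# Additive model: one smooth term per predictor instead of a single
# five-dimensional local-regression surface.
fit <- gam(y ~ s(x1) + s(x2) + s(x3) + s(x4) + s(x5), data = data2)
summary(fit)

# Predicted outcome at one hypothetical combination of the predictors.
newdata <- data.frame(x1 = 0.5, x2 = 0.5, x3 = 0.5, x4 = 0.5, x5 = 0.5)
pred <- predict(fit, newdata = newdata)

# Crude average marginal effect of x1, holding the other predictors at
# their observed values: finite difference with a small, arbitrary step h.
h <- 0.01
ame <- mean(predict(fit, newdata = transform(data2, x1 = x1 + h)) -
            predict(fit, newdata = data2)) / h
```

Note that a purely additive model assumes the predictors do not interact. If you suspect they do, mgcv also accepts multivariate smooths such as s(x1, x2) or te(x1, x2), at a higher computational cost.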
On Mon, 16 Sep 2002, John Deke wrote:
> Ah... I hadn't noticed that option! Thanks... that's a good idea. I'm quite
> happy to use local linear regression.
>
> To answer your question -- perhaps I'm off base, but my reason for wanting
> to do this is that I have a set of explanatory variables that most likely
> influence my dependent variable in ways that are difficult to model
> parametrically. That is, I suspect that there are all sorts of complementary
> relationships between these variables, and it's not at all clear that there's
> a satisfying theoretical model that would suggest a clear-cut parametric
> relationship. So, rather than using parametric regression, I'd like to try
> something non-parametric.
>
> My plan for summarizing the results is to find the average marginal effect
> of each explanatory variable of interest, holding all else constant. Also, I
> would calculate predicted outcomes for combinations of the explanatory
> variables that are most likely to occur in "the real world".
>
> John
>
> -----Original Message-----
> From: John Fox [mailto:jfox at mcmaster.ca]
> Sent: Monday, September 16, 2002 9:31 AM
> To: John Deke
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] loess crash
>
>
> Dear John,
>
> For curiosity, I tried your example under R 1.5.1 on an 800 MHz PC with 512
> MB of memory running Windows 2000. The results were just as you described:
> The four-predictor problem ran essentially instantly, and the
> five-predictor problem crashed R, again instantly.
>
> I also tried making the problem less computationally demanding by
> specifying locally linear, rather than quadratic, fits; this appears to
> work:
>
> > loess(y~x1+x2+x3+x4+x5, data2, degree=1)
> Call:
> loess(formula = y ~ x1 + x2 + x3 + x4 + x5, data = data2, degree = 1)
>
> Number of Observations: 500
> Equivalent Number of Parameters: 13.5
> Residual Standard Error: 1.012
> >
>
>
> Although something is obviously wrong here, I wonder whether it makes sense
> to fit a local regression with so many predictors (unless the object is to
> compare the general nonparametric fit with some more constrained model):
> how would you describe the five-dimensional surface that's produced?
>
> John
>
> At 07:36 AM 9/16/2002 -0400, John Deke wrote:
> >Here's a simple example that yields the crash:
> >
> >library(modreg)
> >data1 <- array(runif(500*5),c(500,5))
> >colnames(data1) <- c("x1","x2","x3","x4","x5")
> >y <- 3+2*data1[,"x1"]+15*data1[,"x2"]+13*data1[,"x3"]-8*data1[,"x4"]+14*data1[,"x5"]+rnorm(500)
> >data2 <- cbind(y,data1)
> >data2 <- as.data.frame(data2)
> >result1 <- loess(y~x1+x2+x3+x4,data2)
> >
> >To get the crash, I just add x5--
> >
> >result1 <- loess(y~x1+x2+x3+x4+x5,data2)
> >
> >And bammo -- I'm dead. It doesn't even pause -- Rgui crashes, and I mean
> >really crashes -- the program is terminated, I get the little Windows
> >dialogue saying that a log file is being generated -- the whole dramatic
> >death scene.
> >
> >I know it's a computationally intensive thing, but the one that doesn't
> >crash (with four explanatory variables) runs almost instantly. It's hard to
> >see how adding a fifth could be so catastrophic. But I am somewhat new to
> >this particular methodology....
> >
> >John
> >
> >At 03:38 AM 9/16/2002, Peter Dalgaard BSA wrote:
> >>John Deke <jdeke2 at comcast.net> writes:
> >>
> >> > Hmm... if I reduce the number of observations to just 500, I still get
> >> > the error.
> >> >
> >> > I don't think it's an issue of collinearity, because I've tried several
> >> > different combinations of variables, all of which work just fine in an
> >> > OLS or logistic regression.
> >> >
> >> > I'm probably doing something stupid, but I'm not seeing it...
> >> >
> >> > At 02:00 PM 9/15/2002, John Deke wrote:
> >> > >Hi,
> >> > >
> >> > > I have a data frame with 6563 observations. I can run a regression
> >> > > with loess using four explanatory variables. If I add a fifth, R
> >> > > crashes. There are no missings in the data, and if I run a
> >> > > regression with any four of the five explanatory variables, it
> >> > > works. It's only when I go from four to five that it crashes.
> >>
> >>Hmm... I wouldn't try loess with more than one or two descriptors. I
> >>mean, it's a smoothing method and representing a smooth function of
> >>many variables can be computationally demanding.
> >>
> >>The Fortran source code for loess is one of the more obfuscated pieces
> >>of R, but I can see that some structures inside of it are of fixed
> >>size, which might explain it (BTW: Does R really crash, or just say
> >>memory exhausted?).
> >>
> >>Do you have a simple example that reproduces the crash (using random
> >>numbers, e.g.)?
>
> -----------------------------------------------------
> John Fox
> Department of Sociology
> McMaster University
> Hamilton, Ontario, Canada L8S 4M4
> email: jfox at mcmaster.ca
> phone: 905-525-9140x23604
> web: www.socsci.mcmaster.ca/jfox
> -----------------------------------------------------
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
>