[R] loess crash

Rafael A. Irizarry ririzarr at jhsph.edu
Mon Sep 16 16:17:23 CEST 2002


I would suggest looking at the package mgcv. You can fit generalized
additive models, which are useful for what you describe below.
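
A minimal, untested sketch of what that might look like, using the simulated
data2 from John Deke's example further down in this thread (the set.seed call
is only added here so the sketch is reproducible):

library(mgcv)

# rebuild the example data from the message below
set.seed(1)
data1 <- array(runif(500*5), c(500,5))
colnames(data1) <- c("x1","x2","x3","x4","x5")
y <- 3 + 2*data1[,"x1"] + 15*data1[,"x2"] + 13*data1[,"x3"] -
     8*data1[,"x4"] + 14*data1[,"x5"] + rnorm(500)
data2 <- as.data.frame(cbind(y, data1))

# additive model: one smooth term per predictor instead of one
# five-dimensional loess surface
fit <- gam(y ~ s(x1) + s(x2) + s(x3) + s(x4) + s(x5), data = data2)
summary(fit)          # approximate significance of each smooth
plot(fit, pages = 1)  # one panel per term, easy to describe

# rough average "marginal effect" of x1 (the quantity described below),
# via a small finite difference on the fitted surface
eps <- 0.01
shifted <- data2
shifted$x1 <- shifted$x1 + eps
mean((predict(fit, newdata = shifted) - predict(fit, newdata = data2)) / eps)

The additive structure sidesteps the dimensionality problem that a full
five-predictor loess surface runs into, and each smooth term can still be
examined and summarized one at a time.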

On Mon, 16 Sep 2002, John Deke wrote:

> Ah... I hadn't noticed that option! Thanks... that's a good idea. I'm quite
> happy to use local linear regression.
> 
> To answer your question -- perhaps I'm off base, but my reason for wanting
> to do this is that I have a set of explanatory variables that most likely
> influence my dependent variable in ways that are difficult to model
> parametrically. That is, I suspect that there are all sorts of complementary
> relationships between these variables, and it's not at all clear that there's
> a satisfying theoretical model that would suggest a clear-cut parametric
> relationship. So, rather than using parametric regression, I'd like to try
> something non-parametric. 
> 
> My plan for summarizing the results is to find the average marginal effect
> of each explanatory variable of interest, holding all else constant. Also, I
> would calculate predicted outcomes for combinations of the explanatory
> variables that are most likely to occur in "the real world". 
> 
> John
> 
> -----Original Message-----
> From: John Fox [mailto:jfox at mcmaster.ca]
> Sent: Monday, September 16, 2002 9:31 AM
> To: John Deke
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] loess crash
> 
> 
> Dear John,
> 
> Out of curiosity, I tried your example under R 1.5.1 on an 800 MHz PC with 512 
> MB of memory running Windows 2000. The results were just as you described: 
> The four-predictor problem ran essentially instantly, and the 
> five-predictor problem crashed R, again instantly.
> 
> I also tried making the problem less computationally demanding by 
> specifying locally linear, rather than quadratic, fits; this appears to
> work:
> 
>  > loess(y~x1+x2+x3+x4+x5, data2, degree=1)
> Call:
> loess(formula = y ~ x1 + x2 + x3 + x4 + x5, data = data2, degree = 1)
> 
> Number of Observations: 500
> Equivalent Number of Parameters: 13.5
> Residual Standard Error: 1.012
>  >
> 
> 
> Although something is obviously wrong here, I wonder whether it makes sense 
> to fit a local regression with so many predictors (unless the object is to 
> compare the general nonparametric fit with some more constrained model): 
> how would you describe the five-dimensional surface that's produced?
> 
> John
> 
> At 07:36 AM 9/16/2002 -0400, John Deke wrote:
> >Here's a simple example that yields the crash:
> >
> >library(modreg)
> >data1 <- array(runif(500*5),c(500,5))
> >colnames(data1) <- c("x1","x2","x3","x4","x5")
> >y <- 3 + 2*data1[,"x1"] + 15*data1[,"x2"] + 13*data1[,"x3"] -
> >  8*data1[,"x4"] + 14*data1[,"x5"] + rnorm(500)
> >data2 <- cbind(y,data1)
> >data2 <- as.data.frame(data2)
> >result1 <- loess(y~x1+x2+x3+x4,data2)
> >
> >To get the crash, I just add x5--
> >
> >result1 <- loess(y~x1+x2+x3+x4+x5,data2)
> >
> >And bammo -- I'm dead. It doesn't even pause -- Rgui crashes, and I mean 
> >really crashes -- the program is terminated, I get the little Windows 
> >dialogue saying that a log file is being generated -- the whole dramatic 
> >death scene.
> >
> >I know it's a computationally intensive thing, but the one that doesn't 
> >crash (with four explanatory variables) runs almost instantly. It's hard to 
> >see how adding a fifth could be so catastrophic. But I am somewhat new to 
> >this particular methodology....
> >
> >John
> >
> >At 03:38 AM 9/16/2002, Peter Dalgaard BSA wrote:
> >>John Deke <jdeke2 at comcast.net> writes:
> >>
> >> > Hmm... if I reduce the number of observations to just 500, I still get
> >> > the error.
> >> >
> >> > I don't think it's an issue of collinearity, because I've tried several
> >> > different combinations of variables, all of which work just fine in an
> >> > OLS or logistic regression.
> >> >
> >> > I'm probably doing something stupid, but I'm not seeing it...
> >> >
> >> > At 02:00 PM 9/15/2002, John Deke wrote:
> >> > >Hi,
> >> > >
> >> > > I have a data frame with 6563 observations. I can run a regression
> >> > > with loess using four explanatory variables. If I add a fifth, R
> >> > > crashes. There are no missing values in the data, and if I run a
> >> > > regression with any four of the five explanatory variables, it
> >> > > works. It's only when I go from four to five that it crashes.
> >>
> >>Hmm... I wouldn't try loess with more than one or two descriptors. I
> >>mean, it's a smoothing method and representing a smooth function of
> >>many variables can be computationally demanding.
> >>
> >>The Fortran source code for loess is one of the more obfuscated pieces
> >>of R, but I can see that some structures inside of it are of fixed
> >>size, which might explain it (BTW: Does R really crash, or just say
> >>memory exhausted?).
> >>
> >>Do you have a simple example that reproduces the crash (using random
> >>numbers, e.g.)?
> 
> -----------------------------------------------------
> John Fox
> Department of Sociology
> McMaster University
> Hamilton, Ontario, Canada L8S 4M4
> email: jfox at mcmaster.ca
> phone: 905-525-9140x23604
> web: www.socsci.mcmaster.ca/jfox
> -----------------------------------------------------
> 

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._


