[R] Still help needed on embeded regression
Gabor Grothendieck
ggrothendieck at gmail.com
Tue May 2 15:37:55 CEST 2006
Try
runmean2 <- function(x, k) # k must be even
(coredata(runmean(x, k-1)) * (k-1) +
coredata(lag(x, -k/2, na.pad = TRUE)))/k
Also, in your code use matrices or vectors instead of data frames
to avoid any overhead in using data frames.
On 5/2/06, Guojun Zhu <shmilylemon at yahoo.com> wrote:
> Sorry to bother you guys again. This is great. But
> this is for 61 number and the second case will change
> 60 to 61. "run*" only accept odd number window. How
> to get around it with 60? Any suggestion? Thanks.
>
> --- Gabor Grothendieck <ggrothendieck at gmail.com>
> wrote:
>
> > Using runmean from caTools the first one below does
> > it in under 1 second but will not handle NAs. The
> > second one takes under 15 seconds and handles
> > them by replacing them with linear approximations.
> > Note that k must be odd.
> >
> > # 1
> >
> > library(caTools)
> > set.seed(1)
> > system.time({
> > y <- rnorm(140001)
> > x <- as.numeric(seq(y))
> > k <- 61
> > Mxy <- runmean(x * y, k)
> > Mxx <- runmean(x * x, k)
> > Mx <- runmean(x, k)
> > My <- runmean(y, k)
> > b <- (Mxy - Mx * My) / (Mxx - Mx * Mx)
> > a <- My - b * Mx
> > })
> >
> > # 2
> >
> > library(caTools)
> > library(zoo)
> > set.seed(1)
> > system.time({
> > y <- rnorm(140000)
> > x <- as.numeric(seq(y))
> > x[100:200] <- NA
> > x <- na.approx(zoo(x))
> > y <- zoo(y)
> > k <- 60
> > Mxy <- runmean(x * y, k)
> > Mxx <- runmean(x * x, k)
> > Mx <- runmean(x, k)
> > My <- runmean(y, k)
> > b <- (Mxy - Mx * My) / (Mxx - Mx * Mx)
> > a <- My - b * Mx
> > })
> >
> >
> > On 5/1/06, Guojun Zhu <shmilylemon at yahoo.com> wrote:
> > > I basically has a long data.frame a. but I only
> > need
> > > three columns x,y. Let us say the index of row is
> > t.
> > > I need to produce new column s_t as the linear
> > > regression coefficient of (x_(t-60),...x_(t-1)) on
> > > (y_(t-60),...,y_(t-1)). The data is about 140,000
> > > rows. I wrote a simple code on this which is
> > super
> > > slow, it takes more than 2 hours on a 2.8Ghz Intel
> > Duo
> > > Core. My friend use SAS and his code needs only
> > > couple of minutes. I know there must be some more
> > > efficient way to write it. Can anyone help me on
> > > this? Here is the code.
> > >
> > > Also one line produce a complete NA temp$y and lm
> > > function failed on that. How to make it just
> > produce
> > > a NA instead and keep runing?
> > >
> > > attach(return)
> > > betat=rep(NA,length(RET))
> > > for (i in 61:length(RET)){cat(i," ");
> > > if (year[[i]]>=1995){
> > >
> > >
> >
> temp<-data.frame(y=RET[(i-60):(i-1)]-riskfree[(i-60):(i-1)],x=sprtrn[(i-60):(i-1)]-riskfree[(i-60):(i-1)])
> > >
> > >
> >
> betat[[i]]<-lm(y~x+1,na.action=na.exclude,temp)[[1]][[2]]
> > > #if (i%%100==0)
> > > cat(i," ");
> > >
> > >
> > >
> >
> return$vol.cap[[i]]=mean(VOL[(i-12):(i-1)],na.rm=TRUE)/return$cap[[i]]
> > > }
> > > }
> > >
> > > ______________________________________________
> > > R-help at stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide!
> > http://www.R-project.org/posting-guide.html
> > >
> >
>
>
> __________________________________________________
> Do You Yahoo!?
> Tired of spam? Yahoo! Mail has the best spam protection around
> http://mail.yahoo.com
>
More information about the R-help
mailing list