[R] Still help needed on embeded regression

Gabor Grothendieck ggrothendieck at gmail.com
Tue May 2 09:13:26 CEST 2006


Using runmean from caTools the first one below does
it in under 1 second but will not handle NAs.  The
second one takes under 15 seconds and handles
them by replacing them with linear approximations.
Note that k must be odd.

# 1

library(caTools)
set.seed(1)
system.time({
	y <- rnorm(140001)
	x <- as.numeric(seq(y))
	k <- 61
	Mxy <- runmean(x * y, k)
	Mxx <- runmean(x * x, k)
	Mx <- runmean(x, k)
	My <- runmean(y, k)
	b <- (Mxy - Mx * My) / (Mxx - Mx * Mx)
	a <- My - b * Mx
})

# 2

library(caTools)
library(zoo)
set.seed(1)
system.time({
	y <- rnorm(140000)
	x <- as.numeric(seq(y))
	x[100:200] <- NA
	x <- na.approx(zoo(x))
	y <- zoo(y)
	k <- 60
	Mxy <- runmean(x * y, k)
	Mxx <- runmean(x * x, k)
	Mx <- runmean(x, k)
	My <- runmean(y, k)
	b <- (Mxy - Mx * My) / (Mxx - Mx * Mx)
	a <- My - b * Mx
})


On 5/1/06, Guojun Zhu <shmilylemon at yahoo.com> wrote:
> I basically has a long data.frame a.  but I only need
> three columns x,y. Let us say the index of row is t.
> I need to produce new column s_t as the linear
> regression coefficient of (x_(t-60),...x_(t-1)) on
> (y_(t-60),...,y_(t-1)). The data is about 140,000
> rows.  I wrote a simple code on this which is super
> slow, it takes more than 2 hours on a 2.8Ghz Intel Duo
> Core.  My friend use SAS and his code needs only
> couple of minutes.  I know there must be some more
> efficient way to write it.  Can anyone help me on
> this?  Here is the code.
>
> Also one line produce a complete NA temp$y and lm
> function failed on that.  How to make it just produce
> a NA instead and keep runing?
>
> attach(return)
> betat=rep(NA,length(RET))
> for (i in 61:length(RET)){cat(i," ");
> if (year[[i]]>=1995){
>
> temp<-data.frame(y=RET[(i-60):(i-1)]-riskfree[(i-60):(i-1)],x=sprtrn[(i-60):(i-1)]-riskfree[(i-60):(i-1)])
>
> betat[[i]]<-lm(y~x+1,na.action=na.exclude,temp)[[1]][[2]]
>  #if (i%%100==0)
> cat(i," ");
>
>
> return$vol.cap[[i]]=mean(VOL[(i-12):(i-1)],na.rm=TRUE)/return$cap[[i]]
> }
> }
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>




More information about the R-help mailing list