[R] Efficient way of creating a shifted (lagged) variable?

Dimitri Liakhovitski dimitri.liakhovitski at gmail.com
Thu Aug 4 20:46:58 CEST 2011


Thanks a lot, guys!
It's really helpful. But, to be objective, it's still quite a few
lines longer than in SPSS.
Dimitri
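
For comparison, here is a shorter base-R sketch of the lag/lead idea discussed in this thread. The helper name `shift` and its `k` argument are hypothetical and do not come from any of the replies; the sketch assumes a plain numeric vector and has not been checked against Date or factor columns.

```r
# Hypothetical helper (not from the thread): shift a vector by k positions.
# k > 0 gives a lag  (NA padding at the start),
# k < 0 gives a lead (NA padding at the end).
shift <- function(x, k = 1) {
  n <- length(x)
  if (k == 0) return(x)
  # Shifting by the full length or more leaves nothing but NAs.
  if (abs(k) >= n) return(x[rep(NA_integer_, n)])
  if (k > 0) {
    c(rep(NA, k), head(x, n - k))
  } else {
    c(tail(x, n + k), rep(NA, -k))
  }
}

# Small made-up data frame, used only for illustration.
d <- data.frame(var1 = c(1, 2, 3, 4), var2 = c(10, 20, 30, 40))
for (v in c("var1", "var2")) {
  d[[paste0(v, ".lag1")]]  <- shift(d[[v]], 1)
  d[[paste0(v, ".lead1")]] <- shift(d[[v]], -1)
}
d
```

Using the sign of `k` to choose between lag and lead keeps it to one function; whether that is clearer than two separate functions is a matter of taste.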

On Thu, Aug 4, 2011 at 2:36 PM, Daniel Nordlund <djnordlund at frontier.com> wrote:
>
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>> On Behalf Of Dimitri Liakhovitski
>> Sent: Thursday, August 04, 2011 8:24 AM
>> To: r-help
>> Subject: [R] Efficient way of creating a shifted (lagged) variable?
>>
>> Hello!
>>
>> I have a data set:
>> set.seed(123)
>> y<-data.frame(week=seq(as.Date("2010-01-03"),
>>                        as.Date("2011-01-31"),by="week"))
>> y$var1<-c(1,2,3,round(rnorm(54),1))
>> y$var2<-c(10,20,30,round(rnorm(54),1))
>>
>> # All I need is to create lagged variables for var1 and var2. I looked
>> around a bit and found several ways of doing it. They all seem quite
>> complicated, while in SPSS it's just a few letters (like LAG()). Below
>> is what I've written. It works, but maybe there is a very simple way
>> of doing it in R that I could not find?
>> I also need the same for "lead" (the opposite of lag).
>> Any hint is greatly appreciated!
>>
>> ### The function I created:
>> mylag <- function(x,max.lag=1){   # x has to be a 1-column data frame
>>    temp<-as.data.frame(embed(c(rep(NA,max.lag),x[[1]]),
>>                              max.lag+1))[2:(max.lag+1)]
>>    for(i in 1:length(temp)){
>>      names(temp)[i]<-paste(names(x),".lag",i,sep="")
>>     }
>>   return(temp)
>> }
>>
>> ### Running mylag to get my result:
>> myvars<-c("var1","var2")
>> for(i in myvars) {
>>   y<-cbind(y,mylag(y[i],max.lag=2))   # max.lag belongs to mylag(), not cbind()
>> }
>> (y)
>>
>> --
>> Dimitri Liakhovitski
>> marketfusionanalytics.com
>>
>
> Dimitri,
>
> I would first look into the zoo package as has already been suggested.  However, if you haven't already got your solution then here are a couple of functions that might help you get started.  I won't vouch for efficiency.
>
>
> lag.fun <- function(df, x, max.lag=1) {
>  for(i in x) {
>    for(j in 1:max.lag){
>      lagx <- paste(i,'.lag',j,sep='')
>      df[,lagx] <- c(rep(NA,j),df[1:(nrow(df)-j),i])
>    }
>  }
>  df
> }
>
> lead.fun <- function(df, x, max.lead=1) {
>  for(i in x) {
>    for(j in 1:max.lead){
>      leadx <- paste(i,'.lead',j,sep='')
>      df[,leadx] <- c(df[(j+1):(nrow(df)),i],rep(NA,j))
>    }
>  }
>  df
> }
>
> y <- lag.fun(y,myvars,2)
> y <- lead.fun(y,myvars,2)
>
>
> Hope this is helpful,
>
> Dan
>
> Daniel Nordlund
> Bothell, WA USA
>
>
>
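
The `embed()` call inside the `mylag` function quoted above is the least obvious step, so here is a minimal sketch of what it returns on a tiny made-up vector. The names `x` and `m` are only for illustration; nothing here is taken from the replies.

```r
# Illustrative input: four observations, two lags wanted.
x <- c(1, 2, 3, 4)
max.lag <- 2

# Pad the front with max.lag NAs, then let embed() build a matrix whose
# first column is the current value and whose later columns are lag 1, lag 2, ...
m <- embed(c(rep(NA, max.lag), x), max.lag + 1)
m

# Dropping column 1 leaves only the lagged columns, which is what
# the [2:(max.lag+1)] subset in mylag does.
lags <- m[, 2:(max.lag + 1), drop = FALSE]
colnames(lags) <- paste0("x.lag", seq_len(max.lag))
lags
```

Padding with NAs before calling `embed()` is what keeps the result at the same number of rows as the input, so the lagged columns can be bound back onto the data frame.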



-- 
Dimitri Liakhovitski
marketfusionanalytics.com


