[R] lags of a variable, with a factor
Charles Berry
ccberry at ucsd.edu
Sat Aug 24 18:54:17 CEST 2013
Jim Lemon <jim <at> bitwrit.com.au> writes:
>
> On 08/24/2013 04:16 AM, Michael Friendly wrote:
> > For sequential analysis of sequences of events, I want to calculate a
> > series of lagged versions of a (numeric or character) variable. The
> > simple function below does this, but I can't see how to generalize this
> > to the case where there is also a factor variable and I want to
> > calculate lags separately for each level of the factor (by). Can anyone
> > help?
> > ...
[snip]
> > >
> >
> Hi Michael,
> Maybe this will do it.
>
> lags <- function(x, k=1, prefix='lag', by) {
> if(missing(by)) {
> n <- length(x)
> res <- data.frame(lag0=x)
> for (i in 1:k) {
> res <- cbind(res, c(rep(NA, i), x[1:(n-i)]))
> }
> colnames(res) <- paste0(prefix, 0:k)
> return(res)
> }
> else {
> for(levl in levels(by)) {
> nextlags<-lags(x[by==levl,],prefix=prefix)
> rownames(nextlags)<-paste(levl,rownames(nextlags),sep=".")
> if(exist(res)) res<-rbind(res,nextlags)
> else res<-nextlags
> }
> }
> }
>
> Jim
Untested? I get
> lags(mtcars$mpg,2)
   lag0 lag1 lag2
1  21.0   NA   NA
2  21.0 21.0   NA
3  22.8 21.0 21.0
4  21.4 22.8 21.0
5  18.7 21.4 22.8
6  18.1 18.7 21.4
7  14.3 18.1 18.7
[ ... ]
which looks ok and
> lags(mtcars$mpg,2,by=factor(mtcars$cyl))
Error in x[by == levl, ] : incorrect number of dimensions
>
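The error comes from the comma in the subscript: x is a plain vector there, so it has no second dimension to index. A minimal sketch of the difference, using a toy vector and factor that are made up for illustration and are not from the thread:

```r
# Toy data (hypothetical, for illustration only)
x  <- c(10, 20, 30, 40)
by <- factor(c("a", "b", "a", "b"))

# Two-dimensional indexing on a vector signals an error; capture its message
err <- tryCatch(x[by == "a", ], error = function(e) conditionMessage(e))
print(err)

# One-dimensional indexing selects the elements belonging to one level
sel <- x[by == "a"]
print(sel)
```

Dropping the comma removes this particular error, though the by branch of the quoted function would presumably still need other repairs before it returns a result.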
Michael, try this:
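lagframe <- function(x, k = 1, prefix = 'lag', by) {
  # shift a vector down by one position, padding with NA
  lag.one <- function(x) c(NA, head(x, -1))
  # indx[i] is the row that precedes row i (within its 'by' group, if given)
  indx <- if (missing(by))
    lag.one(seq_along(x))
  else {
    spl.by <- split(seq_along(by), by)
    lag.spl.by <- lapply(spl.by, lag.one)
    unsplit(lag.spl.by, by)
  }
  res <- setNames(data.frame(x), paste0(prefix, "0"))
  # seq_len(k) rather than 1:k, so that k = 0 returns lag0 only
  for (i in seq_len(k))
    res[[paste0(prefix, i)]] <- res[[paste0(prefix, i - 1)]][indx]
  res
}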
> lagframe(mtcars$mpg,2)
   lag0 lag1 lag2
1  21.0   NA   NA
2  21.0 21.0   NA
3  22.8 21.0 21.0
4  21.4 22.8 21.0
5  18.7 21.4 22.8
[...]
> cbind( lagframe(mtcars$mpg,2,by=mtcars$cyl), cyl=mtcars$cyl)
   lag0 lag1 lag2 cyl
1  21.0   NA   NA   6
2  21.0 21.0   NA   6
3  22.8   NA   NA   4
4  21.4 21.0 21.0   6
5  18.7   NA   NA   8
6  18.1 21.4 21.0   6
7  14.3 18.7   NA   8
8  24.4 22.8   NA   4
9  22.8 24.4 22.8   4
10 19.2 18.1 21.4   6
11 17.8 19.2 18.1   6
12 16.4 14.3 18.7   8
[...]
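To see why repeated indexing by a single index vector gives the grouped lags, here is a small sketch on toy data (the vector and grouping below are made up for illustration). It rebuilds the index vector the same way as in the function above and, as a sanity check only, compares lag 1 with what ave() gives when the same one-step shift is applied within groups:

```r
# Toy data (hypothetical, for illustration only)
x  <- c(1, 2, 3, 4, 5, 6)
by <- factor(c("a", "b", "a", "b", "a", "b"))

# shift a vector down by one position, padding with NA
lag.one <- function(v) c(NA, head(v, -1))

# Row positions per group, each shifted by one, then put back in row order
indx <- unsplit(lapply(split(seq_along(by), by), lag.one), by)
print(indx)           # NA NA 1 2 3 4: the previous row within the same group

lag1 <- x[indx]       # one application gives lag 1
lag2 <- lag1[indx]    # applying the same index again gives lag 2
print(cbind(x, lag1, lag2))

# Cross-check lag 1 against ave(), which applies a function within groups
stopifnot(identical(lag1, ave(x, by, FUN = lag.one)))
```

The index approach has the side benefit that it works the same way for character x, since only row positions are shifted and the values themselves are never passed through the lagging function.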