[R] lag a data.frame column?

Mark Knecht markknecht at gmail.com
Wed Sep 9 21:57:10 CEST 2009


On Wed, Sep 9, 2009 at 11:09 AM, Achim Zeileis
<Achim.Zeileis at wu-wien.ac.at> wrote:
> On Wed, 9 Sep 2009, Mark Knecht wrote:
>
>> Sometimes it's the simple things...
>>
>> Why doesn't this lag X$x by 3 and place it in X$x1?
>
> It does.
>
>> (i.e., NAs in the first 3 rows and then values showing up...)
>
> Because this is not how the "ts" class handles lags.
>
> What happens is that X$x is transformed to "ts"
>  as.ts(X$x)
> which is now a regular series with frequency 1 starting at 1 and ending at
> 10. If you apply lag(), the data is not modified at all, just the time index
> is shifted
>  lag(as.ts(X$x), 3)
> Thus it does not create any NAs or - even worse - throw away observations
> (which is not necessary because the frequency of the time series is known and
> the time index can be extended).
>
> BTW: You almost surely wanted lag(..., -3). Personally, I also don't find
> this intuitive but it's how things are (as documented on the man page).
>
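
[Editor's sketch, not part of the original exchange] The behaviour described above can be checked with base R alone. The vector 1:10 is an assumed stand-in for X$x, since the original data is not shown in the thread. The sketch prints the time-series attributes before and after lag(), then aligns the original and the shifted series on a shared time index, which is where the NAs come from.

```r
x <- 1:10                 # assumed stand-in for X$x
x_ts <- as.ts(x)          # regular series: start 1, end 10, frequency 1

lead3 <- lag(x_ts, 3)     # values unchanged, time index moved 3 steps earlier
lag3  <- lag(x_ts, -3)    # values unchanged, time index moved 3 steps later

tsp(x_ts)                 # start, end, frequency of the original series
tsp(lead3)
tsp(lag3)

# cbind() on "ts" objects takes the union of the time indices and
# pads with NA wherever a series has no observation
aligned <- cbind(x_ts, lag3)
aligned
```

In the aligned result, the lag3 column is NA for the first three time points and the x_ts column is NA for the last three, so no observation is discarded.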
>> The help page does talk about time series. If lag doesn't work on
>> data.frame columns then what would be the right function to use to lag
>> by a variable amount?
>
> That depends what you want to do. If your data really is a time series, then
> using a time series class (such as "ts", or "zoo" etc.) would probably be
> preferable. This would probably also get you further benefits for data
> processing.
>
> If for some reason you can't do that, it shouldn't be too difficult to write
> a function that does what you want for your personal use
>  mylag <- function(x, k) c(rep(NA, k), x[1:(length(x)-k)])
> which assumes that k is a positive integer and length(x) > k.
>
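
[Editor's sketch, not part of the original exchange] A possible way to apply the helper above to a data.frame column. The data frame X is an assumed stand-in for the poster's data, and mylag2 is a hypothetical variant (not from the thread) that relaxes the stated assumption so that k = 0 and k >= length(x) also return a vector of the original length.

```r
# Helper as given in the message above
mylag <- function(x, k) c(rep(NA, k), x[1:(length(x) - k)])

# Assumed example data standing in for X
X <- data.frame(x = 1:10)
X$x1 <- mylag(X$x, 3)     # first 3 rows NA, then the values of x
X

# Hypothetical variant: seq_len() avoids the 1:0 pitfall, so the
# result always has length(x) elements
mylag2 <- function(x, k) {
  stopifnot(k >= 0, k == round(k))
  n <- length(x)
  c(rep(NA, min(k, n)), x[seq_len(max(n - k, 0))])
}

X$x2 <- mylag2(X$x, 3)
```

Both helpers shift by position only, so they are appropriate when the rows are already in time order with no gaps.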
> Best,
> Z
>
>

Thank you very much for the explanation. It is helpful.

I think the function is probably the best answer for me short-term as
I don't know much about time series - I need to learn - and I have a
lot of data.frames where the function can help. Over time if I learn
about ts and zoo maybe that will make my code a bit better.

Thanks,
Mark



