[R] Subset of time observations where timediff > 60 secs
Karl Ove Hufthammer
karl at huftis.org
Mon Dec 7 16:57:38 CET 2009
Dear list members
I have a rather large vector (part of a data frame) giving the time
(date + time, POSIXct) of observations. The times are irregular (with
both small and large jumps) but increasing, and there are several
million of them.
I now wish to reduce my data set so that I only keep observations that
are at least (for example) 60 seconds apart. Basically, I need (all) the
indices of my time variable where the difference in time is at least
60 seconds.
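To make the requirement concrete, the sketch below spells it out as a plain loop on a small made-up vector. It assumes one reading of the requirement: an observation is kept only if it is at least 60 seconds after the previously kept one. The function name thin_loop, the gap argument and the example times are hypothetical and only for illustration; this is the slow, looping version, not a proposed solution.

```r
# Hypothetical helper: keep an observation only if it is at least `gap`
# seconds after the previously kept one (an assumed reading of the task).
thin_loop <- function(timevar, gap = 60) {
  secs <- as.numeric(timevar)      # POSIXct -> seconds since the epoch
  n <- length(secs)
  if (n == 0) return(integer(0))
  keep <- logical(n)
  keep[1] <- TRUE                  # the first observation is always kept
  last <- secs[1]                  # time of the most recently kept observation
  for (i in seq_len(n)[-1]) {
    if (secs[i] - last >= gap) {
      keep[i] <- TRUE
      last <- secs[i]
    }
  }
  which(keep)                      # indices of the kept observations
}

# Made-up example: offsets in seconds from an arbitrary start time
timevar <- as.POSIXct("2009-12-07 12:00:00", tz = "UTC") +
  c(0, 10, 59, 60, 110, 125, 200)
ind <- thin_loop(timevar)
ind   # 1 4 6 7
```

Here the observation at 110 seconds is dropped because it is only 50 seconds after the kept one at 60 seconds, while the one at 125 seconds is kept.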
I thought this would be a rather simple task, but perhaps I'm tired, for
I couldn't figure out how to do it in an even moderately elegant way (not
looping over all the values, which is quite slow).
This solution seemed sensible:
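secs = as.numeric(timevar)               # seconds since the epoch
x = (secs - secs[1]) %/% 60              # 60-second interval, counted from the first time
ind = c(1, cumsum(rle(x)$lengths) + 1)   # start index of each interval
ind = ind[-length(ind)]                  # the last value is length(x) + 1, so remove it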
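but it doesn't work: it only captures the first time in each 60-second
interval counted from the first time value, and so may include times
that are closer together than 60 seconds (for instance, when the first
time in one interval falls late in it and the first time in the next
interval falls early).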
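A small made-up example shows the problem with fixed 60-second intervals. The offsets below are invented for illustration, and !duplicated() is used here simply as a compact way of picking the first observation in each interval; it is a stand-in, not the rle-based code.

```r
# Made-up offsets in seconds from an arbitrary start time
timevar <- as.POSIXct("2009-12-07 12:00:00", tz = "UTC") + c(0, 110, 125)

secs <- as.numeric(timevar)               # POSIXct -> seconds since the epoch
bin <- (secs - secs[1]) %/% 60            # fixed 60-second intervals: 0, 1, 2
first_in_bin <- which(!duplicated(bin))   # first observation in each interval
gaps <- diff(secs[first_in_bin])          # seconds between the selected times
gaps   # 110 15
```

All three observations are selected because each falls in a different interval, yet the last two are only 15 seconds apart.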
I also considered round.POSIXct and trunc.POSIXct, but these are not
appropriate either, since they also group the times at fixed clock
boundaries rather than measuring the gap between observations.
So, any ideas how to do this in an elegant and efficient way?
--
Karl Ove Hufthammer