[R] Faster way to zero-pad a data frame...?

Gabor Grothendieck ggrothendieck at gmail.com
Tue May 30 22:54:27 CEST 2006


Try this:

Lines <- "time,events
 0,1
 1,30
 5,14
 10,4"

library(zoo)
data1 <- read.zoo(textConnection(Lines), header = TRUE, sep = ",")
data2 <- as.ts(data1)    # regular time series; the missing seconds become NA
data2[is.na(data2)] <- 0 # omit this line if NAs in the extra positions are OK
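If pulling in zoo is not an option, the padding can also be done in base R
with one vectorized assignment in place of the row-by-row loop quoted below;
a minimal sketch (df and padded are illustrative names, not from the original
post), assuming non-negative integer timestamps as in the example:

df <- read.csv(textConnection(Lines))   # same sample data as above

# One slot per second from 0 to max(time), initialised to zero
padded <- numeric(max(df$time) + 1)

# Vectorized fill: timestamps are 0-based, R indices are 1-based, hence + 1
padded[df$time + 1] <- df$events

data2b <- data.frame(time = 0:max(df$time), events = padded)

This replaces the per-row data-frame assignment, which is what makes the
loop slow, with a single indexed assignment.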


On 5/30/06, Pete Cap <peteoutside at yahoo.com> wrote:
> Hello List,
>
>  I am working on creating periodograms from IP network traffic logs using the Fast Fourier Transform.  The FFT requires all the data points to be evenly spaced in the time domain (constant delta-T), so I have a step where I zero-pad the data.
>
>  Lately I've been wondering if there is a faster way to do this.  Here's what I've got:
>
>  * data1 is a data frame consisting of a timestamp, in seconds, from the beginning of the network log, and the number of network events that fell on each timestamp.
>  Example:
>  time,events
>  0,1
>  1,30
>  5,14
>  10,4
>
>  * data2 is the zero-padded data frame.  It has length equal to the greatest value of "time" in data1:
>  time,events
>  1,0
>  2,0
>  3,0
>  4,0
>  5,0
>  6,0
>  7,0
>  8,0
>  9,0
>  10,0
>
>  So I run this for loop:
>  for (i in 1:nrow(data1)) {
>     data2[data1[i, 1], 2] <- data1[i, 2]
>  }
>
>  This loop goes to each row in data1, reads the timestamp, and writes the "events" value to the corresponding row in data2.  The result is:
>  time,events
>  0,1
>  1,30
>  2,0
>  3,0
>  4,0
>  5,14
>  6,0
>  7,0
>  8,0
>  9,0
>  10,4
>
>  For a 24-hour log (86,400 seconds) this can take a while... Any advice on how to speed it up would be appreciated.
>
>  Thanks,
>  Pete Cap
>
>
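For the periodogram itself, the padded series can then be fed straight to
fft(); a minimal sketch under one common scaling convention, using the data2
series from the reply above:

x <- as.numeric(data2)       # zero-padded, evenly spaced series
n <- length(x)
pgram <- Mod(fft(x))^2 / n   # raw periodogram ordinates
freq <- (0:(n - 1)) / n      # frequencies in cycles per second

spec.pgram() in the stats package computes a (tapered, scaled) periodogram
directly if a ready-made version is preferred.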


