[R] importing and filtering time series data
jim holtman
jholtman at gmail.com
Mon May 2 01:39:20 CEST 2011
Here is one approach. In future it would help to provide a reasonable sample of data; I generated some test data below:
> x <- unclass(Sys.time()) # current time as seconds since the epoch
> # create some data
> # each timestamp increases by up to 0.1 seconds (0.05 on average)
> len <- cumsum(runif(100, 0, 0.1))
> dataFile <- data.frame(time = x + len,
+ flag = sample(c("Y", "N"), 100, TRUE),
+ dur = runif(100, 10,1000)
+ )
> write.csv(dataFile, file = 'myData.csv', row.names = FALSE)
>
> # read the data and summarize by 1 second intervals
> input <- read.csv('myData.csv')
> # remove the "N" rows, i.e. keep only "Y"
> input <- subset(input, flag == "Y")
> require(data.table) # I like this for creating summaries
> input <- data.table(input)
> # add column for summary
> input$key <- factor(trunc(input$time))
> input[,
+ list(count = length(time)
+ , latency = mean(dur)
+ , var = var(dur)
+ , '5%' = quantile(dur, prob = 0.05)
+ , '95%' = quantile(dur, prob = 0.95)
+ )
+ , by = key
+ ]
            key count  latency       var       X5.     X95.
[1,] 1304293090     6 558.3471  73765.28 255.09390 872.3692
[2,] 1304293091     8 580.4440 103743.05 132.39461 963.2297
[3,] 1304293092    10 494.1759  62945.55 150.89719 869.8083
[4,] 1304293093    10 557.1942 105834.81 102.53878 941.1442
[5,] 1304293094    17 477.2077 106452.72  35.15032 947.0750
>
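Note that the file format described in the original question has no header row, so a plain read.csv() call would treat the first record as column names. A minimal sketch of reading that format, assuming the three columns are timestamp, flag and duration; the column names `time`, `flag`, `dur` and the third sample row are my own, for illustration only:

```r
# Two sample rows from the original question plus one hypothetical "N" row
txt <- "1304083104.41,Y,668.856249809
1304083104.41,Y,348.143193007
1304083105.10,N,120.5"

# header = FALSE because the file has no column names;
# col.names supplies them (the names are illustrative)
input <- read.csv(text = txt, header = FALSE,
                  col.names = c("time", "flag", "dur"),
                  stringsAsFactors = FALSE)

# keep only the "Y" rows, i.e. skip the "N" rows
input <- subset(input, flag == "Y")

# convert the epoch seconds to a date-time for easier inspection
input$stamp <- as.POSIXct(input$time, origin = "1970-01-01", tz = "UTC")
print(input)
```

For a real file, replace `text = txt` with the file name.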
>
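The table above gives a count for each second; the average number of requests per second asked for in the question is then the mean of those counts. A sketch using base R only, on generated data (the seed, the variable names and the `p05`/`p95` column names are assumptions for illustration, not real measurements):

```r
set.seed(1)  # reproducible illustrative data
dat <- data.frame(time = 1304293090 + cumsum(runif(100, 0, 0.1)),
                  flag = sample(c("Y", "N"), 100, TRUE),
                  dur  = runif(100, 10, 1000))
dat <- dat[dat$flag == "Y", ]   # skip the "N" rows
dat$sec <- trunc(dat$time)      # 1-second bucket for each request

# per-second summary: one row per second that had at least one request
stats <- do.call(rbind, lapply(split(dat$dur, dat$sec), function(d)
  data.frame(count   = length(d),
             latency = mean(d),
             var     = var(d),   # NA when a second has a single request
             p05     = quantile(d, 0.05, names = FALSE),
             p95     = quantile(d, 0.95, names = FALSE))))

# average requests per second over the seconds that had any traffic
avg_rps <- mean(stats$count)
print(stats)
print(avg_rps)
```

Seconds with no requests at all do not appear in the summary, so this average ignores idle seconds. If those should count as zero, divide the total number of requests by the length of the observation window instead.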
On Fri, Apr 29, 2011 at 11:27 AM, Joel Reymont <joelr1 at gmail.com> wrote:
> Folks,
>
> I'm new to R and would like to use it to analyze web server performance data.
>
> I collect the data in this CSV format:
>
> 1304083104.41,Y,668.856249809
> 1304083104.41,Y,348.143193007
>
> The first column is a <seconds.microseconds> timestamp. Rows with N instead of Y need to be skipped. The last column has the same numeric format as the first column, but it is the request duration (latency).
>
> I would like to calculate the average number of requests per second, the mean latency, the variance, and the 5th and 95th percentiles.
>
> What is the best way to accomplish this, starting with importing of time series?
>
> Thanks, Joel
>
> --------------------------------------------------------------------------
> - for hire: mac osx device driver ninja, kernel extensions and usb drivers
> ---------------------+------------+---------------------------------------
> http://wagerlabs.com | @wagerlabs | http://www.linkedin.com/in/joelreymont
> ---------------------+------------+---------------------------------------
>
--
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?