[R] Selecting ranges of dates from a dataframe
Benjamin Stier
benjamin.stier at ub.uni-tuebingen.de
Thu Mar 10 14:23:48 CET 2011
Hello list!
I have a data.frame which looks like this:
> serv
                  datum op.read op.write   read   write
1   2011-01-29 10:00:00       0        0      0       0
2   2011-01-29 10:00:01       0        0      0       0
3   2011-01-29 10:00:02       0        0      0       0
4   2011-01-29 10:00:03       0        4      0  647168
5   2011-01-29 10:00:04       0        0      0       0
6   2011-01-29 10:00:05       0       14      0 1960837
7   2011-01-29 10:00:06       0        0      0       0
...
115 2011-01-30 10:00:54       0        0      0       0
116 2011-01-30 10:00:55       0        0      0       0
117 2011-01-30 10:00:56       0        0      0       0
118 2011-01-30 10:00:57      54        0  29184       0
119 2011-01-30 10:00:58     204        0 122880       0
120 2011-01-30 10:00:59       0        0      0       0
...
I want to compare the total read/write volume of each day. I already have a
solution, but it is pretty slow.
# read the data
serv <- read.delim("cut.inp")
# Reformat the dates from the file
serv$datum <- strptime(serv$datum, "%Y-%m-%d %H:%M:%S")
# select all single days
dates.serv <- unique(strptime(serv$datum, format="%Y-%m-%d"))
# create a data.frame
values <- data.frame(row.names=1, datum=numeric(0), write=numeric(0), read=numeric(0))
for(i in as.character(dates.serv)) {
    # build the first and last second of the day
    searchstart <- as.POSIXlt(paste(i, "00:00:00", sep=" "))
    searchend <- as.POSIXlt(paste(i, "23:59:59", sep=" "))
    # select all rows from that day
    day <- serv[(serv$datum >= searchstart & serv$datum <= searchend),]
    # sum as numeric (double) so large totals do not overflow the integer range
    write <- sum(as.numeric(day$write))
    read <- sum(as.numeric(day$read))
    # append the day's totals to the data.frame
    values <- rbind(values, data.frame(datum=i, write=write, read=read))
}
This is my first try using R for statistics so I'm sure this isn't the best
solution.
The for-loop does its job, but as I said it is really slow. My data covers 21
days at one line per second (roughly 1.8 million rows).
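To see where the time goes, the range selection can be timed in isolation. The sketch below is only an illustration under assumptions: the timestamps are synthetic (one per second for a single day, in UTC), and the names ct, lt, lo and hi are made up for this example. It compares the same selection on a POSIXlt vector, as produced by strptime(), and on a POSIXct vector. Comparisons on POSIXlt values are converted to POSIXct internally each time, which may be one contributor to the slowness.

```r
# Synthetic timestamps (assumption): one per second for one day, in UTC.
start <- as.POSIXct("2011-01-29 00:00:00", tz = "UTC")
ct <- start + 0:86399          # POSIXct: seconds since the epoch
lt <- as.POSIXlt(ct)           # POSIXlt: list of year, month, day, ...

# Hypothetical search range of one hour.
lo <- as.POSIXct("2011-01-29 10:00:00", tz = "UTC")
hi <- as.POSIXct("2011-01-29 10:59:59", tz = "UTC")

# Time the same range selection on both representations.
t.lt <- system.time(sel.lt <- lt >= lo & lt <= hi)
t.ct <- system.time(sel.ct <- ct >= lo & ct <= hi)

# Elapsed seconds for each representation; the numbers depend on the machine.
print(rbind(POSIXlt = t.lt, POSIXct = t.ct)[, "elapsed"])
```

If the POSIXlt timing dominates, storing the column with as.POSIXct() may be worth trying before changing anything else.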
Is there a better way to select the date ranges than a for-loop? The line
where I select all rows for "day" seems to be the most expensive. Any ideas?
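One possible direction, given here as a minimal sketch and not as a tested solution for the full data set, is to group the rows by their date string and sum each group in one pass with rowsum(). The toy data frame below is an assumption that copies four rows from the attached sample; the idea of taking the first 10 characters of the timestamp as the day also assumes that datum is still in its text form, that is, before the strptime() conversion.

```r
# Toy stand-in for `serv`: four rows in the same layout as the sample data.
serv <- data.frame(
    datum = c("2011-01-29 10:00:03", "2011-01-29 10:00:05",
              "2011-01-30 10:00:57", "2011-01-30 10:00:58"),
    read  = c(0, 0, 29184, 122880),
    write = c(647168, 1960837, 0, 0),
    stringsAsFactors = FALSE
)

# The day is the first 10 characters of the timestamp text ("YYYY-MM-DD"),
# so no date parsing is needed just to form the groups.
day <- substr(serv$datum, 1, 10)

# rowsum() adds up each numeric column within each group.
totals <- rowsum(serv[, c("write", "read")], group = day)

# Move the group labels from the row names into a regular column.
values <- data.frame(datum = rownames(totals), totals, row.names = NULL)
print(values)
```

If datum has already been converted to a date-time class, format(serv$datum, "%Y-%m-%d") could supply the grouping vector instead of substr().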
Kind regards,
Benjamin
PS: I attached some sample data, in case you want to try for yourself.
-------------- next part --------------
datum op.read op.write read write
2011-01-29 10:00:00 0 0 0 0
2011-01-29 10:00:01 0 0 0 0
2011-01-29 10:00:02 0 0 0 0
2011-01-29 10:00:03 0 4 0 647168
2011-01-29 10:00:04 0 0 0 0
2011-01-29 10:00:05 0 14 0 1960837
2011-01-29 10:00:06 0 0 0 0
2011-01-29 10:00:07 0 611 0 3533701
2011-01-29 10:00:08 1 0 9728 0
2011-01-29 10:00:09 0 0 0 0
2011-01-29 10:00:10 3 0 13824 0
2011-01-29 10:00:11 1 0 1023 0
2011-01-29 10:00:12 2 1 13824 90112
2011-01-29 10:00:13 0 0 0 0
2011-01-29 10:00:14 0 0 0 0
2011-01-29 10:00:15 0 0 0 0
2011-01-29 10:00:16 0 0 0 0
2011-01-29 10:00:17 0 0 0 0
2011-01-29 10:00:18 0 0 0 0
2011-01-29 10:00:19 0 0 0 0
2011-01-29 10:00:20 0 0 0 0
2011-01-29 10:00:21 0 0 0 0
2011-01-29 10:00:22 0 0 0 0
2011-01-29 10:00:23 0 0 0 0
2011-01-29 10:00:24 0 0 0 0
2011-01-29 10:00:25 0 0 0 0
2011-01-29 10:00:26 0 0 0 0
2011-01-29 10:00:27 0 0 0 0
2011-01-29 10:00:28 0 0 0 0
2011-01-29 10:00:29 0 0 0 0
2011-01-29 10:00:30 0 0 0 0
2011-01-29 10:00:31 0 0 0 0
2011-01-29 10:00:32 0 0 0 0
2011-01-29 10:00:33 0 0 0 0
2011-01-29 10:00:34 0 0 0 0
2011-01-29 10:00:35 0 0 0 0
2011-01-29 10:00:36 0 0 0 0
2011-01-29 10:00:37 0 651 0 3397386
2011-01-29 10:00:38 0 0 0 0
2011-01-29 10:00:39 0 0 0 0
2011-01-29 10:00:40 0 0 0 0
2011-01-29 10:00:41 0 0 0 0
2011-01-29 10:00:42 0 0 0 0
2011-01-29 10:00:43 0 0 0 0
2011-01-29 10:00:44 0 0 0 0
2011-01-29 10:00:45 0 0 0 0
2011-01-29 10:00:46 0 0 0 0
2011-01-29 10:00:47 0 0 0 0
2011-01-29 10:00:48 0 0 0 0
2011-01-29 10:00:49 0 0 0 0
2011-01-29 10:00:50 0 0 0 0
2011-01-29 10:00:51 0 0 0 0
2011-01-29 10:00:52 0 0 0 0
2011-01-29 10:00:53 8 0 20480 0
2011-01-29 10:00:54 42 0 63488 0
2011-01-29 10:00:55 58 4 721920 655360
2011-01-29 10:00:56 16 3 29696 524288
2011-01-29 10:00:57 0 0 0 131072
2011-01-29 10:00:58 17 0 27648 0
2011-01-29 10:00:59 26 5 119808 786432
2011-01-30 10:00:00 0 0 0 0
2011-01-30 10:00:01 0 0 2560 0
2011-01-30 10:00:02 0 0 0 0
2011-01-30 10:00:03 0 0 0 0
2011-01-30 10:00:04 0 0 0 0
2011-01-30 10:00:05 0 0 0 0
2011-01-30 10:00:06 0 0 0 0
2011-01-30 10:00:07 0 0 0 0
2011-01-30 10:00:08 0 0 0 0
2011-01-30 10:00:09 0 0 0 0
2011-01-30 10:00:10 0 0 0 0
2011-01-30 10:00:11 0 0 0 0
2011-01-30 10:00:12 0 0 0 0
2011-01-30 10:00:13 0 433 0 1279262
2011-01-30 10:00:14 0 5 0 49152
2011-01-30 10:00:15 0 0 0 0
2011-01-30 10:00:16 0 0 0 0
2011-01-30 10:00:17 0 0 0 0
2011-01-30 10:00:18 0 0 0 0
2011-01-30 10:00:19 0 0 0 0
2011-01-30 10:00:20 0 0 0 0
2011-01-30 10:00:21 0 0 0 0
2011-01-30 10:00:22 0 0 0 0
2011-01-30 10:00:23 0 0 0 0
2011-01-30 10:00:24 0 0 0 0
2011-01-30 10:00:25 0 4 1023 327680
2011-01-30 10:00:26 10 0 36352 0
2011-01-30 10:00:27 1 0 6144 0
2011-01-30 10:00:28 21 0 52736 0
2011-01-30 10:00:29 0 0 0 0
2011-01-30 10:00:30 0 0 0 0
2011-01-30 10:00:31 0 0 0 0
2011-01-30 10:00:32 25 0 86016 0
2011-01-30 10:00:33 0 0 0 0
2011-01-30 10:00:34 0 0 0 0
2011-01-30 10:00:35 0 0 0 0
2011-01-30 10:00:36 0 0 0 0
2011-01-30 10:00:37 0 0 0 0
2011-01-30 10:00:38 0 0 0 0
2011-01-30 10:00:39 0 0 0 0
2011-01-30 10:00:40 3 0 7168 0
2011-01-30 10:00:41 0 0 0 0
2011-01-30 10:00:42 0 0 0 0
2011-01-30 10:00:43 95 204 359424 992256
2011-01-30 10:00:44 121 364 381952 1572864
2011-01-30 10:00:45 0 0 0 0
2011-01-30 10:00:46 0 0 1023 0
2011-01-30 10:00:47 0 0 0 0
2011-01-30 10:00:48 0 0 0 0
2011-01-30 10:00:49 0 0 0 0
2011-01-30 10:00:50 0 0 0 0
2011-01-30 10:00:51 0 0 0 0
2011-01-30 10:00:52 0 3 3072 413696
2011-01-30 10:00:53 0 0 0 0
2011-01-30 10:00:54 0 0 0 0
2011-01-30 10:00:55 0 0 0 0
2011-01-30 10:00:56 0 0 0 0
2011-01-30 10:00:57 54 0 29184 0
2011-01-30 10:00:58 204 0 122880 0
2011-01-30 10:00:59 0 0 0 0