[R] Preparing data for display
Stavros Macrakis
macrakis at alum.mit.edu
Mon Nov 10 22:28:38 CET 2008
I have a dataset of about 10^6 rows, each consisting of a timestamp,
several factors, a string, some integers, and some floats.
I'd like to graph this data in various ways, some straightforward
(how many events per week over the past year for each of 4 values
of some factor) and some less so. I've managed to do this
by brute force, but I'd like to learn how to do it in more elegant,
more R-like code.
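For concreteness, here is a minimal sketch of the weekly-count case. The data is synthetic, and the names events, when, and kind are made up for illustration; they are not the real column names. It assumes the timestamps are POSIXct and uses cut() on dates plus table() to get one row per week and one column per factor level.

```r
# Synthetic stand-in for the real data; all names here are hypothetical.
set.seed(1)
n <- 1000
lv <- c("a", "b", "c", "d")
events <- data.frame(
  when = as.POSIXct("2008-01-01", tz = "UTC") + runif(n, 0, 300 * 86400),
  kind = factor(sample(lv, n, replace = TRUE), levels = lv)
)

# Bucket each timestamp into its week; the result is a factor
# with one level per week, labelled by the week's start date.
events$week <- cut(as.Date(events$when), breaks = "week")

# Cross-tabulate: one row per week, one column per factor level.
counts <- table(events$week, events$kind)

# Draw one line per factor level against the week start dates.
weeks <- as.Date(rownames(counts))
matplot(weeks, unclass(counts), type = "l", lty = 1, xaxt = "n",
        xlab = "week", ylab = "events per week")
axis.Date(1, at = weeks[seq(1, length(weeks), by = 4)])
legend("topright", legend = colnames(counts),
       col = seq_len(ncol(counts)), lty = 1)
```

Because table() works on factors, weeks with no events still appear as rows with zero counts.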
Consider for example the following, which graphs the 25th, 50th, and
75th percentile values per day of data$x:
perc <- function(code, data)
{   # select the part of the data with factor value `code`
    slice <- data[data$factor == code, ];
    # calc quartiles for each day
    quarts <- tapply(slice$x,
                     slice$day,
                     function(x) quantile(x, c(.25, .50, .75)));
    # returns a tagged list of tagged vectors
    #   list("2008-10-07" = c("25%" = .05, "50%" = .47, ... ), ...)
    # convert to a data frame -- is there some mapping function to do this?
    fr <- data.frame( day   = to.time(names(quarts)),  # strings back to dates (!)
                      "25%" = sapply(quarts, function(x) x[[1]] ),  # !!
                      "50%" = sapply(quarts, function(x) x[[2]] ),
                      "75%" = sapply(quarts, function(x) x[[3]] ) );
    # columns are now labelled "X25." etc. (!)
    # overlay the three quantile columns (2:4) on a shared y range
    for (i in 2:4) {
        plot( fr$day, fr[[i]], type="l",
              ylim = c( 0, max(pmax(fr[[2]], fr[[3]], fr[[4]])) ));
        par(new=TRUE);
    }
    par(new=FALSE);
}
This works, but is pretty ugly in a variety of ways. What is the
right way to do this?
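One possible direction, offered as a sketch under assumptions rather than a definitive answer: stack the per-day quantile vectors into a matrix with do.call(rbind, ...), then let matplot() draw all columns in one call. The data below is synthetic, and slice, day, and x stand in for the real columns; as.Date() replaces to.time() on the assumption that the day labels are ISO dates.

```r
# Synthetic stand-in for one slice of the data; names are hypothetical.
set.seed(1)
slice <- data.frame(
  day = rep(seq(as.Date("2008-10-01"), by = "day", length.out = 30), each = 50),
  x   = runif(30 * 50)
)

# tapply gives a list of named quantile vectors, one per day;
# do.call(rbind, ...) stacks them into a matrix with one row per day
# and columns named "25%", "50%", "75%".
quarts <- do.call(rbind,
                  tapply(slice$x, slice$day, quantile,
                         probs = c(.25, .50, .75)))

# The row names are the day labels; convert them back to dates once.
days <- as.Date(rownames(quarts))

# matplot draws every column against the same x in one call,
# so no loop or par(new = TRUE) is needed.
matplot(days, quarts, type = "l", lty = 1, xaxt = "n",
        ylim = c(0, max(quarts)), xlab = "day", ylab = "x")
axis.Date(1, at = days[seq(1, length(days), by = 7)])
legend("topright", legend = colnames(quarts),
       col = seq_len(ncol(quarts)), lty = 1)

# If a data frame is wanted, check.names = FALSE keeps the "25%"
# style column names instead of mangling them to "X25.".
fr <- data.frame(day = days, quarts, check.names = FALSE)
```

The matrix form also makes the y range trivial, since max(quarts) covers all three columns at once.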
Thanks,
-s