[R] Transforming simulation data which is spread across many files into a barplot
Bert Gunter
gunter.berton at gene.com
Fri Jun 11 21:02:55 CEST 2010
Ouch! Lousy plot. Instead, plot the 50 (mean sent, mean received) pairs as a
y vs x scatterplot to see the relationship.
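
A minimal base R sketch of that scatterplot, assuming the per-file means have
already been collected into a data frame with one row per log file. The data
frame name avg, its column names, and the numbers in it are placeholders for
illustration only, not values from the actual simulation.

```r
# Illustrative values only: one row per log file, holding the per-file means.
avg <- data.frame(
  file     = c("1000.log", "1100.log", "1200.log"),
  sent     = c(257.8, 301.2, 198.4),
  received = c(2395.2, 2810.5, 1904.0)
)

# Mean received (y) against mean sent (x), one point per file.
plot(avg$sent, avg$received,
     xlab = "Mean sent", ylab = "Mean received",
     main = "Per-file means")

# Optionally label each point with the file it came from.
text(avg$sent, avg$received, labels = avg$file, pos = 3, cex = 0.7)
```

With 50 files the labels will probably overlap, so the text() call may be
better left out.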
Bert Gunter
Genentech Nonclinical Biostatistics
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Hadley Wickham
Sent: Friday, June 11, 2010 11:53 AM
To: Ian Bentley
Cc: r-help at r-project.org
Subject: Re: [R] Transforming simulation data which is spread across
many files into a barplot
On Fri, Jun 11, 2010 at 1:32 PM, Ian Bentley <ian.bentley at gmail.com> wrote:
> I'm an R newbie, and I'm just trying to use some of its graphing
> capabilities, but I'm a bit stuck - basically in massaging the already
> available data into a format R likes.
>
> I have a simulation environment which produces logs, which represent a
> number of different things. I then run a Python script on this data to
> put it in a nicer format. Essentially, the Python script reduces the
> number of files by two orders of magnitude.
>
> What I'm left with is a number of files, each of which has two columns
> of data.
> The files look something like this:
> --1000.log--
> Sent Received
> 405.0 3832.0
> 176.0 1742.0
> 176.0 1766.0
> 176.0 1240.0
> 356.0 3396.0
> ...
>
> This file - called 1000.log - represents a data point at 1000. What I'd
> like to do is to use a loop, to read in 50 or so of these files, and then
> produce a stacked barplot. Ideally, the stacked barplot would have 1 bar
> per file, and two stacks per bar. The first stack would be the mean of
> the sent, and the second would be the mean of the received.
>
> I've used a loop to read files in R before, something like this ---
>
> for (i in 1:50){
> tmpFile <- paste(base, i*100, ".log", sep="")
> tmp <- read.table(tmpFile)
> }
>
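# Load data
library(plyr)
paths <- dir(base, pattern = "\\.log$", full.names = TRUE)
names(paths) <- basename(paths)
# ldply() applies read.table to each path and row-binds the results;
# the names of paths end up in the .id column.
df <- ldply(paths, read.table, header = TRUE)

# Compute averages (the files' header row gives columns Sent and Received):
avg <- ddply(df, ".id", summarise,
  sent = mean(Sent),
  received = mean(Received)
)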
You can read more about plyr at http://had.co.nz/plyr.
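
The code above stops at the per-file averages. One way the stacked barplot
from the original question might be drawn, sketched here in base R only (no
plyr) so that it runs on its own. The temporary directory and the contents of
the two example files are made up for illustration; in practice base would
point at the real log directory.

```r
# Write two small example files (made-up contents, same two-column layout
# as the logs described in the question) into a temporary directory.
base <- file.path(tempdir(), "logs")
dir.create(base, showWarnings = FALSE)
writeLines(c("Sent Received", "405.0 3832.0", "176.0 1742.0"),
           file.path(base, "1000.log"))
writeLines(c("Sent Received", "176.0 1240.0", "356.0 3396.0"),
           file.path(base, "1100.log"))

paths <- dir(base, pattern = "\\.log$", full.names = TRUE)

# One column per file; the rows are the mean of Sent and the mean of Received.
means <- sapply(paths, function(p) colMeans(read.table(p, header = TRUE)))
colnames(means) <- basename(paths)

# Given a matrix, barplot() draws one bar per column and stacks the rows
# within each bar (beside = FALSE is the default).
barplot(means, legend.text = rownames(means),
        xlab = "Log file", ylab = "Mean count")
```

Passing beside = TRUE to barplot() would give side-by-side bars instead of
stacked ones.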
Hadley
--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/