[R] Transforming simulation data which is spread across many files into a barplot
Bert Gunter
gunter.berton at gene.com
Fri Jun 11 22:27:07 CEST 2010
So two time series? Fair enough. But less is more. Plot them as separate
series of points connected by lines, different colors for the two different
series. Or as two trellis plots. You may also wish to overlay a smooth to
help the reader see the "trend" (e.g., via a loess or other nonparametric
smooth, or perhaps just a fitted line).
The only part of a bar that conveys information is the top. The rest of the
fill is "chartjunk" (Tufte's term) and distracts.
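
As a rough illustration of the advice above, here is a minimal base-R sketch.
The data are simulated and the names complexity, sent and received are made up
for the example; they are not the poster's actual results.

```r
# Hypothetical data: 50 complexity levels with simulated per-file means.
set.seed(1)
complexity <- seq(100, 5000, by = 100)
sent       <- 0.05 * complexity + rnorm(50, sd = 15)
received   <- 0.50 * complexity + rnorm(50, sd = 150)

# Two series as points connected by lines, one color each.
plot(complexity, received, type = "b", col = "red",
     ylim = range(sent, received),
     xlab = "Complexity", ylab = "Mean messages per run")
lines(complexity, sent, type = "b", col = "blue")

# Overlay a loess smooth on each series to show the trend.
fit_sent     <- loess(sent ~ complexity)
fit_received <- loess(received ~ complexity)
lines(complexity, fitted(fit_sent), col = "blue", lwd = 2, lty = 2)
lines(complexity, fitted(fit_received), col = "red", lwd = 2, lty = 2)

legend("topleft", legend = c("received", "sent"),
       col = c("red", "blue"), lty = 1, pch = 1)
```

If the two series differ a lot in scale, two panels (for example with
par(mfrow = c(2, 1))) may read better than one shared axis.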
Bert Gunter
Genentech Nonclinical Biostatistics
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Ian Bentley
Sent: Friday, June 11, 2010 12:15 PM
To: Bert Gunter
Cc: r-help at r-project.org; Hadley Wickham
Subject: Re: [R] Transforming simulation data which is spread
across many files into a barplot
I'm not trying to see the relation between sent and received, but rather to
show how these grow across the increasing complexity of the 50 data points.
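
To look at growth across complexity, each file first has to be reduced to one
row keyed by its complexity value. The following is only a sketch in base R,
assuming the file name (e.g. 1000.log) encodes the complexity and the columns
are headed Sent and Received as in the sample quoted below. The temporary
directory and the demo values are invented so that the example runs on its own.

```r
# Hypothetical setup: write three small log files so the sketch is runnable.
base <- file.path(tempdir(), "logs")
dir.create(base, showWarnings = FALSE)
for (n in c(1000, 1100, 1200)) {
  demo <- data.frame(Sent     = c(1, 2, 3) * n / 100,
                     Received = c(10, 20, 30) * n / 100)
  write.table(demo, file.path(base, paste0(n, ".log")),
              row.names = FALSE, quote = FALSE)
}

# One row of means per file; complexity is parsed from the file name.
paths <- list.files(base, pattern = "\\.log$", full.names = TRUE)
means <- do.call(rbind, lapply(paths, function(p) {
  d <- read.table(p, header = TRUE)
  data.frame(complexity = as.numeric(sub("\\.log$", "", basename(p))),
             sent       = mean(d$Sent),
             received   = mean(d$Received))
}))

# Sort numerically so the rows follow increasing complexity.
means <- means[order(means$complexity), ]
```

The resulting data frame has one row per file and can be passed directly to
plot() or lines() with complexity on the x axis.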
On 11 June 2010 15:02, Bert Gunter <gunter.berton at gene.com> wrote:
> Ouch! Lousy plot. Instead, plot the 50 (mean sent, mean received) pairs as
> a y vs x scatterplot to see the relationship.
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
>
>
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Hadley Wickham
> Sent: Friday, June 11, 2010 11:53 AM
> To: Ian Bentley
> Cc: r-help at r-project.org
> Subject: Re: [R] Transforming simulation data which is spread across
> many files into a barplot
>
> On Fri, Jun 11, 2010 at 1:32 PM, Ian Bentley <ian.bentley at gmail.com>
> wrote:
> > I'm an R newbie, and I'm just trying to use some of its graphing
> > capabilities, but I'm a bit stuck - basically in massaging the already
> > available data into a format R likes.
> >
> > I have a simulation environment which produces logs, which represent a
> > number of different things. I then run a Python script on this data to
> > put it in a nicer format. Essentially, the Python script reduces the
> > number of files by two orders of magnitude.
> >
> > What I'm left with is a number of files, each of which has two columns
> > of data in it.
> > The files look something like this:
> > --1000.log--
> > Sent Received
> > 405.0 3832.0
> > 176.0 1742.0
> > 176.0 1766.0
> > 176.0 1240.0
> > 356.0 3396.0
> > ...
> >
> > This file - called 1000.log - represents a data point at 1000. What I'd
> > like to do is to use a loop to read in 50 or so of these files, and then
> > produce a stacked barplot. Ideally, the stacked barplot would have 1 bar
> > per file, and two stacks per bar. The first stack would be the mean of
> > the sent, and the second would be the mean of the received.
> >
> > I've used a loop to read files in R before, something like this:
> >
> > for (i in 1:50) {
> >   tmpFile <- paste(base, i*100, ".log", sep="")
> >   tmp <- read.table(tmpFile)
> > }
> >
>
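> # Load data
> library(plyr)
>
> paths <- dir(base, pattern = "\\.log$", full.names = TRUE)
> names(paths) <- basename(paths)
>
> # paths is a named vector, so use ldply (list in, data frame out);
> # the file names end up in the ".id" column
> df <- ldply(paths, read.table, header = TRUE)
>
> # Compute averages (column names follow the file header: Sent, Received):
> avg <- ddply(df, ".id", summarise,
>   sent = mean(Sent),
>   received = mean(Received)
> )
>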
> You can read more about plyr at http://had.co.nz/plyr.
>
> Hadley
>
> --
> Assistant Professor / Dobelman Family Junior Chair
> Department of Statistics / Rice University
> http://had.co.nz/
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
--
Ian Bentley
M.Sc. Candidate
Queen's University
Kingston, Ontario