[R] getting data into correct format for summarizing ... reshape, aggregate, or...

stephen sefick ssefick at gmail.com
Mon Sep 15 18:14:40 CEST 2008

I would like to reformat this data frame into something from which I
can produce some descriptive statistics.  I have been playing around
with the reshape package, but maybe that is not the best way to
proceed.  I would like to use RiverMile and constituent as the grouping
variables to get the summary statistics:

198a    198b
mean    mean
sd      sd
...     ...

etc. for all of these.
I have tried reshape and aggregate and I am sure that I am missing something...

Below is a naive attempt at making a data frame with the columns in
the correct class.  This can be improved too.  There are NAs in the
real data set, but I didn't know how to randomly intersperse NAs in a
created matrix.  I hope this makes sense.  If it doesn't, I will go
back to the drawing board and try to clarify this.

value <- rnorm(30)
# three river miles, ten observations each
RiverMile <- rep(c(215L, 202L, 198L), each = 10)
# within each river mile: five "a" values, then five "b" values
constituent <- rep(rep(c("a", "b"), each = 5), times = 3)
# data.frame() keeps each column's class; cbind() would coerce all three
# vectors into one numeric matrix and turn "a"/"b" into the codes 1/2
df.1 <- data.frame(RiverMile = RiverMile,
                   constituent = factor(constituent),
                   value = value)
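
For what it is worth, here is a rough sketch of the two things I am after: randomly interspersing NA in the made-up values, and then getting one column of statistics per RiverMile/constituent combination. It uses only base R. The seed, the number of NAs (4), and the names na.rows, group and stats are arbitrary choices for illustration, and this is only one possible approach, not necessarily the best one.

```r
# Sketch only: same layout as the example data, rebuilt here so the
# block runs on its own.
set.seed(42)  # assumed seed, only so the example is reproducible
df.1 <- data.frame(
  RiverMile   = rep(c(215L, 202L, 198L), each = 10),
  constituent = factor(rep(rep(c("a", "b"), each = 5), times = 3)),
  value       = rnorm(30)
)

# Randomly intersperse NA: draw 4 distinct row positions (the count is
# arbitrary) and set the value there to NA.
na.rows <- sample(nrow(df.1), 4)
df.1$value[na.rows] <- NA

# One label per RiverMile/constituent combination, e.g. "198a", "198b".
group <- paste0(df.1$RiverMile, df.1$constituent)

# Split the values by group and compute the statistics for each piece.
# sapply() binds the named result vectors into a matrix with the
# statistics in rows and the groups in columns.
stats <- sapply(split(df.1$value, group), function(x) {
  c(mean = mean(x, na.rm = TRUE),   # na.rm = TRUE skips the missing values
    sd   = sd(x, na.rm = TRUE),
    n    = sum(!is.na(x)))          # number of non-missing observations
})

print(round(stats, 2))
```

If a long layout (one row per group) is acceptable instead, something like aggregate(value ~ RiverMile + constituent, data = df.1, FUN = mean) should give the group means, since the formula interface drops NA rows by default.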

Stephen Sefick
Research Scientist
Southeastern Natural Sciences Academy

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods. We are mammals, and have not exhausted the
annoying little problems of being mammals.

	-K. Mullis
