[R] extracting and summarizing data from data.frame with table-like output
Kim Milferstedt
milferst at uiuc.edu
Wed Feb 1 23:12:29 CET 2006
Hello,
I would like to summarize and extract statistics (means, standard
errors, etc.) from a data set that comes as a large table. This table
needs to be sorted according to certain categories (in the example
below "day", "angle", "distance" and "location"). I would like to
have the output in a table similar to the original data, but now with
the mean (or standard error, etc.) of all individual measurements of
"res.1" at any "location" in one column, the means for "res.2" at any
"location" in a second column, etc.
1. How can I get the means that I extract from my data set into a
format similar to the initial data and not as a matrix as I get it
with tapply()?
2. How can I calculate the means for many variables at once and not
just for one as in tapply()?
3. How does R know what data to use with the function "order()" or "tapply()"?
Thanks for your help,
Kim
Here is an example:
## The code below creates an artificial data set "jj" that resembles
## my real data.
res.1 <- c(rbinom(36, size=50, prob=0.6))
res.2 <- c(rbinom(36, size=20, prob=0.4))
day <- rep(rep(1:3,rep(6,3)),2)
angle <- rep(1:3, 12)
distance <- rep(rep(1:2,rep(3,2)),6)
location <- rep(1:2,c(18,18))
jj <- cbind(res.1,res.2,day,angle,distance, location)
## I order the data
pp <- order(day, angle, distance)
jj[pp,]
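
On the question of how order() knows which data to use: in the call above it sorts by the free-standing vectors "day", "angle" and "distance" in the workspace, not by the columns of "jj" (which cbind() builds as a matrix). A minimal sketch of one alternative, assuming the data are kept in a data frame instead; the names "jj.df" and "jj.sorted" and the seed are illustrative assumptions, not part of the original example:

```r
# Assumption: the example data are rebuilt as a data frame called `jj.df`
# (a hypothetical name) so that this sketch is self-contained.
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
jj.df <- data.frame(
  res.1    = rbinom(36, size = 50, prob = 0.6),
  res.2    = rbinom(36, size = 20, prob = 0.4),
  day      = rep(rep(1:3, rep(6, 3)), 2),
  angle    = rep(1:3, 12),
  distance = rep(rep(1:2, rep(3, 2)), 6),
  location = rep(1:2, c(18, 18))
)

# with() evaluates order() using the columns of jj.df, so the sort keys
# come from the data frame itself rather than from separate vectors.
pp <- with(jj.df, order(day, angle, distance))
jj.sorted <- jj.df[pp, ]
head(jj.sorted)
```

Keeping the sort keys inside the data frame avoids the risk that the separate vectors and the table drift out of sync.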
## Now I calculate the mean for "res.1" over the variable "location"
ss <- tapply(res.1,list(day, angle, distance, location), mean)
ss
## The result "ss" is printed as four matrices, but I want table-like
## output, possibly also with the means for res.2.
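
One possible way to get table-like output for several variables at once is aggregate(), which returns a data frame with one row per combination of the grouping variables. The sketch below is only an illustration under assumptions: the data are rebuilt as a data frame "jj.df", the helper "stderr.mean" and the result names are hypothetical, and "location" is left out of the grouping so that each mean is taken over the two locations.

```r
# Assumption: same artificial data as above, stored as a data frame
# `jj.df` (hypothetical name) so that the sketch is self-contained.
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
jj.df <- data.frame(
  res.1    = rbinom(36, size = 50, prob = 0.6),
  res.2    = rbinom(36, size = 20, prob = 0.4),
  day      = rep(rep(1:3, rep(6, 3)), 2),
  angle    = rep(1:3, 12),
  distance = rep(rep(1:2, rep(3, 2)), 6),
  location = rep(1:2, c(18, 18))
)

# Hypothetical helper: standard error of the mean.
stderr.mean <- function(x) sd(x) / sqrt(length(x))

# Grouping variables; location is omitted, so values are pooled over it.
groups <- jj.df[c("day", "angle", "distance")]

# One row per group, one column per response variable.
means <- aggregate(jj.df[c("res.1", "res.2")], by = groups, FUN = mean)
ses   <- aggregate(jj.df[c("res.1", "res.2")], by = groups, FUN = stderr.mean)

# Put means and standard errors side by side in a single table.
result <- merge(means, ses, by = c("day", "angle", "distance"),
                suffixes = c(".mean", ".se"))
result
```

With this layout "result" has one row for each day/angle/distance combination, and the columns "res.1.mean", "res.2.mean", "res.1.se" and "res.2.se" hold the summaries. Adding "location" to "groups" would instead give one row per cell of the original tapply() call.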
__________________________________________
Kim Milferstedt
University of Illinois at Urbana-Champaign
Department of Civil and Environmental Engineering
4125 Newmark Civil Engineering Building
205 North Mathews Avenue MC 250
Urbana, IL 61801
USA
phone: (001) 217 333-9663
fax: (001) 217 333-6968
email: milferst at uiuc.edu
http://cee.uiuc.edu/research/morgenroth/index.asp