[R] summarizing replicates with multiple treatments

Levi Waldron leviwaldron at gmail.com
Tue Mar 4 23:24:54 CET 2008


I have a dataframe with several different treatment variables, and
would like to calculate the mean and standard deviation of the
replicates for each combination of day and treatment.  It seems like
it should be easy, but I've only managed to do it for one treatment at
a time using subset and tapply.  Here is an example dataset:

> `exampledata` <-
structure(list(day = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L), treat = structure(c(1L, 1L,
1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L
), .Label = c("a", "b"), class = "factor"), replicate = c(1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L), height = c(1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 2.1, 2.2, 2.3,
2.4, 2.5, 2.6, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6), weight = c(11.1,
11.2, 11.3, 11.4, 11.5, 11.6, 12.1, 12.2, 12.3, 12.4, 12.5, 12.6,
13.1, 13.2, 13.3, 13.4, 13.5, 13.6)), .Names = c("day", "treat",
"replicate", "height", "weight"), class = "data.frame", row.names = c(NA,
-18L))

> exampledata
   day treat replicate height weight
1    1     a         1    1.1   11.1
2    1     a         2    1.2   11.2
3    1     a         3    1.3   11.3
4    1     b         1    1.4   11.4
5    1     b         2    1.5   11.5
6    1     b         3    1.6   11.6
7    2     a         1    2.1   12.1
8    2     a         2    2.2   12.2
9    2     a         3    2.3   12.3
10   2     b         1    2.4   12.4
11   2     b         2    2.5   12.5
12   2     b         3    2.6   12.6
13   3     a         1    3.1   13.1
14   3     a         2    3.2   13.2
15   3     a         3    3.3   13.3
16   3     b         1    3.4   13.4
17   3     b         2    3.5   13.5
18   3     b         3    3.6   13.6

I would like to combine the replicates and get a dataframe like:

day treat height.mean height.sd weight.mean weight.sd
  1     a         1.2       0.1        11.2       0.1
  1     b         1.5       0.1        11.5       0.1
  2     a         2.2       0.1        12.2       0.1
  2     b         2.5       0.1        12.5       0.1
  3     a         3.2       0.1        13.2       0.1
  3     b         3.5       0.1        13.5       0.1

or two dataframes, one with means and the other with standard deviations.

Thus far I have been doing it a piece at a time, like below (extra
verbose since tapply doesn't accept the data= argument!), but would
like to do it for all the measurement columns and all the treatments
in one go.  Thanks!

> tapply(exampledata[exampledata$treat=="a",]$height,exampledata[exampledata$treat=="a",]$day,mean)
  1   2   3
1.2 2.2 3.2
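
For reference, here is a rough sketch of how the whole summary might be built in one pass with base R. It is only one possible approach, not a definitive answer. It assumes a reasonably recent R in which aggregate() has a formula interface, and the names height_means, means, sds and summary_df are made up for illustration. The data are rebuilt compactly first so that the block runs on its own.

```r
# Rebuild the example data from the post in a compact form.
# In the posted data, weight is always height + 10.
exampledata <- data.frame(
  day       = rep(1:3, each = 6),
  treat     = factor(rep(rep(c("a", "b"), each = 3), times = 3)),
  replicate = rep(1:3, times = 6),
  height    = c(1.1, 1.2, 1.3, 1.4, 1.5, 1.6,
                2.1, 2.2, 2.3, 2.4, 2.5, 2.6,
                3.1, 3.2, 3.3, 3.4, 3.5, 3.6)
)
exampledata$weight <- exampledata$height + 10

# One measurement, both grouping factors at once: tapply() takes a list
# of factors and returns a day x treat matrix, so no subsetting is needed.
# with() stands in for the missing data= argument.
height_means <- with(exampledata,
                     tapply(height, list(day = day, treat = treat), mean))

# All measurement columns at once: one data frame of means, one of sds.
means <- aggregate(cbind(height, weight) ~ day + treat,
                   data = exampledata, FUN = mean)
sds   <- aggregate(cbind(height, weight) ~ day + treat,
                   data = exampledata, FUN = sd)

# Join the two on the grouping columns; the suffixes turn the shared
# column names into height.mean, height.sd, weight.mean, weight.sd.
summary_df <- merge(means, sds, by = c("day", "treat"),
                    suffixes = c(".mean", ".sd"))

# Sort by day and treatment and put the columns in the desired order.
summary_df <- summary_df[order(summary_df$day, summary_df$treat),
                         c("day", "treat",
                           "height.mean", "height.sd",
                           "weight.mean", "weight.sd")]
```

Here means and sds are the two separate dataframes, and summary_df is the combined layout. More measurement columns could be added inside cbind(), and further grouping variables on the right-hand side of the formula.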
>


