[R] How to calculate means for multiple variables in samples with different sizes

jim holtman jholtman at gmail.com
Fri Mar 11 11:51:51 CET 2011


Use the 'data.table' package and compute the means by group:

> x <- read.table(textConnection("sample    replicate    height    weight    age
+ A    1.00    12.0    0.64    6.00
+ A    2.00    12.2    0.38    6.00
+ A    3.00    12.4    0.49    6.00
+ B    1.00    12.7    0.65    4.00
+ B    2.00    12.8    0.78    5.00
+ C    1.00    11.9    0.45    6.00
+ C    2.00    11.84    0.44    2.00
+ C    3.00    11.43    0.32    3.00
+ C    4.00    10.24    0.84    4.00
+ D    1.00    14.2    0.54    2.00
+ D    2.00    15.67    0.67    7.00
+ D    3.00    15.11    0.81    7.00"), header = TRUE)
> closeAllConnections()
> require(data.table)
> x.dt <- data.table(x)  # convert
> x.dt[, list(height = mean(height)
+            , weight = mean(weight)
+            , age = mean(age)
+            ), by = sample]
     sample   height    weight      age
[1,]      A 12.20000 0.5033333 6.000000
[2,]      B 12.75000 0.7150000 4.500000
[3,]      C 11.35250 0.5125000 3.750000
[4,]      D 14.99333 0.6733333 5.333333
>
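
With 600 measurement columns, typing each mean() by hand is not practical. A minimal sketch of one way to generalize the call above, assuming every column other than 'sample' and 'replicate' is numeric and should be averaged (the name 'measure.cols' is only illustrative, and the toy data is rebuilt inline so the block runs on its own):

```r
library(data.table)

# Same toy data as in the session above, rebuilt so this sketch is self-contained.
x <- data.frame(
  sample    = c("A", "A", "A", "B", "B", "C", "C", "C", "C", "D", "D", "D"),
  replicate = c(1, 2, 3, 1, 2, 1, 2, 3, 4, 1, 2, 3),
  height    = c(12.0, 12.2, 12.4, 12.7, 12.8, 11.9, 11.84, 11.43, 10.24,
                14.2, 15.67, 15.11),
  weight    = c(0.64, 0.38, 0.49, 0.65, 0.78, 0.45, 0.44, 0.32, 0.84,
                0.54, 0.67, 0.81),
  age       = c(6, 6, 6, 4, 5, 6, 2, 3, 4, 2, 7, 7)
)
x.dt <- as.data.table(x)

# Assumption: everything except the grouping key and the replicate index
# is a measurement column to be averaged.
measure.cols <- setdiff(names(x.dt), c("sample", "replicate"))

# .SD is the per-group subset restricted to .SDcols; lapply() applies mean()
# to each of those columns. Unequal group sizes need no special handling.
means <- x.dt[, lapply(.SD, mean), by = sample, .SDcols = measure.cols]
print(means)
```

If some replicates contain missing values, passing na.rm = TRUE through lapply(.SD, mean, na.rm = TRUE) would skip them; whether that is appropriate depends on the data.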


On Fri, Mar 11, 2011 at 4:32 AM, Aline Santos <alinexss at gmail.com> wrote:
> Hello R-helpers:
>
> I have data like this:
>
> sample    replicate    height    weight    age
> A    1.00    12.0    0.64    6.00
> A    2.00    12.2    0.38    6.00
> A    3.00    12.4    0.49    6.00
> B    1.00    12.7    0.65    4.00
> B    2.00    12.8    0.78    5.00
> C    1.00    11.9    0.45    6.00
> C    2.00    11.84    0.44    2.00
> C    3.00    11.43    0.32    3.00
> C    4.00    10.24    0.84    4.00
> D    1.00    14.2    0.54    2.00
> D    2.00    15.67    0.67    7.00
> D    3.00    15.11    0.81    7.00
>
> Now, how can I calculate the mean for each condition (height, weight, age)
> in each sample, considering the samples have different numbers of replicates?
>
>
> The final matrix should look like:
>
> sample    height    weight    age
> A    12.20    0.50    6.00
> B     12.75      0.72      4.50
> C     11.35      0.51      3.75
> D     14.99      0.67      5.33
>
> This is a simplified version of my dataset, which consists of 100 samples
> (unequally distributed in 530 replicates) for 600 different conditions.
>
> I appreciate all the help.
>
> A.S.
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?


