[R] data manipulation: getting mean value every 5 rows

Tony Plate tplate at blackmesacapital.com
Tue Jul 29 00:12:31 CEST 2003


 > x <- read.table(file("clipboard"), header=TRUE)
 > # add an extra field to define groups of 5 sequential rows
 > x[,"code"] <- rep(seq(len=nrow(x)/5), each=5)
 > x
    temp line cage   number code
1    18   18    1 6678.630    1
2    18   18    1 7774.458    1
3    18   18    1 7845.902    1
4    18   18    1 9483.578    1
5    18   18    1 8983.555    1
6    18   18    1 9181.052    2
7    18   18    1 9458.696    2
8    18   18    1 8138.616    2
9    18   18    1 7981.994    2
10   18   18    1 7556.491    2
11   18   18    1 7672.137    3
12   18   18    1 6607.776    3
13   18   18    1 8383.650    3
14   18   18    1 7129.852    3
15   18   18    1 8536.667    3
16   18   18    2 8287.800    4
17   18   18    2 7924.470    4
18   18   18    2 7928.474    4
19   18   18    2 7363.157    4
20   18   18    2 7952.593    4
 > aggregate(x[,"number",drop=FALSE], x[,c("temp", "line", "cage", "code")], mean)
   temp line cage code   number
1   18   18    1    1 8153.225
2   18   18    1    2 8463.370
3   18   18    1    3 7666.016
4   18   18    2    4 7891.299
 > # result has an additional column named "code" -- easily eliminated
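
For anyone who wants to try this without the clipboard, the session above can be approximated by the sketch below. It rebuilds the 20 example rows inline and drops the helper column at the end. Two things here are assumptions rather than part of the session above: the name "res" is made up for illustration, and the block index uses ceiling() instead of rep() so that a row count that is not a multiple of 5 still works (the last block is then just shorter).

```r
# Rebuild the 20 example rows inline so the sketch runs without the clipboard.
x <- data.frame(
  temp = 18,
  line = 18,
  cage = rep(c(1, 2), times = c(15, 5)),
  number = c(6678.630, 7774.458, 7845.902, 9483.578, 8983.555,
             9181.052, 9458.696, 8138.616, 7981.994, 7556.491,
             7672.137, 6607.776, 8383.650, 7129.852, 8536.667,
             8287.800, 7924.470, 7928.474, 7363.157, 7952.593)
)

# Block index: rows 1-5 get 1, rows 6-10 get 2, and so on.
# Assumption: ceiling() is used instead of rep(..., each=5) so that a
# trailing block with fewer than 5 rows does not cause a length mismatch.
x$code <- ceiling(seq_len(nrow(x)) / 5)

# Average "number" within each block; the other columns act as group keys.
# "res" is a hypothetical name chosen for this sketch.
res <- aggregate(x["number"], x[c("temp", "line", "cage", "code")], mean)

# Put the blocks back in their original order, then drop the helper column.
res <- res[order(res$code), ]
res$code <- NULL
rownames(res) <- NULL
res
```

Note that because temp, line and cage are grouping columns too, a block of 5 rows in which one of them changes part-way would come out as two rows rather than one. In the example data every block has constant values in those columns, so this does not arise.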

At Monday 10:47 PM 7/28/2003 +0100, you wrote:
>Dear All,
>
>I would like to ask you how to accomplish a little tricky data
>manipulation. I have a large dataset, looking something like:
>
>temp    line    cage    number
>18      18      1       6678.63
>18      18      1       7774.458
>18      18      1       7845.902
>18      18      1       9483.578
>18      18      1       8983.555
>18      18      1       9181.052
>18      18      1       9458.696
>18      18      1       8138.616
>18      18      1       7981.994
>18      18      1       7556.491
>18      18      1       7672.137
>18      18      1       6607.776
>18      18      1       8383.65
>18      18      1       7129.852
>18      18      1       8536.667
>18      18      2       8287.8
>18      18      2       7924.47
>18      18      2       7928.474
>18      18      2       7363.157
>18      18      2       7952.593
>.....
>
>I would like to create a dataframe where I get the mean values, 5 rows at a
>time, of columns "number", while keeping the value in the other columns
>fixed to the values found in the first of the 5 rows (or whatever, it's the
>same for the 5 rows) so that the above would be "shrunk" to:
>
>temp    line    cage    number
>18      18      1       8153.2246
>18      18      1       8463.3698
>18      18      1       7666.0164
>18      18      2       7891.2988
>
>Any hints?
>
>Regards,
>
>Federico Calboli
>
>=========================
>
>Federico C.F. Calboli
>
>Department of Biology
>University College London
>Room 327
>Darwin Building
>Gower Street
>London
>WC1E 6BT
>
>Tel: (+44) 020 7679 4395
>Fax (+44) 020 7679 7096
>f.calboli at ucl.ac.uk
>
>______________________________________________
>R-help at stat.math.ethz.ch mailing list
>https://www.stat.math.ethz.ch/mailman/listinfo/r-help

Tony Plate   tplate at acm.org



