[R] Sum every n (4) observations by group

Avi Gross @v|gro@@ @end|ng |rom ver|zon@net
Sun Dec 19 21:09:53 CET 2021

```Milu,

Your data seems to be very consistent in that each value of ID has eight
rows. You seem to want to just sum every four so that fits:

ID Date       Value
1   A 4140 0.000207232
2   A 4141 0.000240141
3   A 4142 0.000271414
4   A 4143 0.000258384
5   A 4144 0.000243640
6   A 4145 0.000271480
7   A 4146 0.000280585
8   A 4147 0.000289691
9   B 4140 0.000298797
10  B 4141 0.000307903
11  B 4142 0.000317008
12  B 4143 0.000326114
13  B 4144 0.000335220
14  B 4145 0.000344326
15  B 4146 0.000353431
16  B 4147 0.000362537
17  C 4140 0.000371643
18  C 4141 0.000380749
19  C 4142 0.000389854
20  C 4143 0.000398960
21  C 4144 0.000408066
22  C 4145 0.000417172
23  C 4146 0.000426277
24  C 4147 0.000435383

There are many ways to do what you want, some more general than others, but
one trivial way is to add a column that contains 24 numbers ranging from 1
to 6 like this assuming mydf holds the above:

Here is an example of such a vector:

rep(1:(nrow(mydf)/4), each=4)
[1] 1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4 5 5 5 5 6 6 6 6

So you can add a column like:

> mydf\$fours <- rep(1:(nrow(mydf)/4), each=4)
> mydf
ID Date       Value fours
1   A 4140 0.000207232     1
2   A 4141 0.000240141     1
3   A 4142 0.000271414     1
4   A 4143 0.000258384     1
5   A 4144 0.000243640     2
6   A 4145 0.000271480     2
7   A 4146 0.000280585     2
8   A 4147 0.000289691     2
9   B 4140 0.000298797     3
10  B 4141 0.000307903     3
11  B 4142 0.000317008     3
12  B 4143 0.000326114     3
13  B 4144 0.000335220     4
14  B 4145 0.000344326     4
15  B 4146 0.000353431     4
16  B 4147 0.000362537     4
17  C 4140 0.000371643     5
18  C 4141 0.000380749     5
19  C 4142 0.000389854     5
20  C 4143 0.000398960     5
21  C 4144 0.000408066     6
22  C 4145 0.000417172     6
23  C 4146 0.000426277     6
24  C 4147 0.000435383     6

You now use grouping any way you want to apply a function and in this case
you want a sum. I like to use the tidyverse functions so will show that as
in:

mydf %>%
group_by(ID, fours) %>%
summarize(sums=sum(Value), n=n())

I threw in the extra column in case your data sometimes does not have 4 at
the end of a group or beginning of next. Here is the output:

# A tibble: 6 x 4
# Groups:   ID [3]
ID    fours     sums     n
<chr> <int>    <dbl> <int>
1 A         1 0.000977     4
2 A         2 0.00109      4
3 B         3 0.00125      4
4 B         4 0.00140      4
5 C         5 0.00154      4
6 C         6 0.00169      4

Of course there are all kinds of ways to do this in standard R, including
trivial ones like looping over indices starting at 1 and taking four at a
time and getting the Value data for mydf\$Value[N] + mydf\$Value[N+1] ...

-----Original Message-----
From: R-help <r-help-bounces using r-project.org> On Behalf Of Miluji Sb
Sent: Sunday, December 19, 2021 1:32 PM
To: r-help mailing list <r-help using r-project.org>
Subject: [R] Sum every n (4) observations by group

Dear all,

I have a dataset (below) by ID and time sequence. I would like to sum every
four observations by ID.

I am confused how to combine the two conditions. Any help will be highly
appreciated. Thank you!

Best.

Milu

## Dataset
structure(list(ID = c("A", "A", "A", "A", "A", "A", "A", "A", "B", "B", "B",
"B", "B", "B", "B", "B", "C", "C", "C", "C", "C", "C", "C", "C"), Date =
c(4140L, 4141L, 4142L, 4143L, 4144L, 4145L, 4146L, 4147L, 4140L, 4141L,
4142L, 4143L, 4144L, 4145L, 4146L, 4147L, 4140L, 4141L, 4142L, 4143L, 4144L,
4145L, 4146L, 4147L ), Value = c(0.000207232, 0.000240141, 0.000271414,
0.000258384, 0.00024364, 0.00027148, 0.000280585, 0.000289691, 0.000298797,
0.000307903, 0.000317008, 0.000326114, 0.00033522, 0.000344326, 0.000353431,
0.000362537, 0.000371643, 0.000380749, 0.000389854, 0.00039896, 0.000408066,
0.000417172, 0.000426277, 0.000435383 )), class = "data.frame", row.names =
c(NA, -24L))

[[alternative HTML version deleted]]

______________________________________________
R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help