[R] Sum every n (4) observations by group

Avi Gross @v|gro@@ @end|ng |rom ver|zon@net
Sun Dec 19 21:09:53 CET 2021


Milu,

Your data seems to be very consistent in that each value of ID has eight
rows. You seem to want to just sum every four so that fits:

   ID Date       Value
1   A 4140 0.000207232
2   A 4141 0.000240141
3   A 4142 0.000271414
4   A 4143 0.000258384
5   A 4144 0.000243640
6   A 4145 0.000271480
7   A 4146 0.000280585
8   A 4147 0.000289691
9   B 4140 0.000298797
10  B 4141 0.000307903
11  B 4142 0.000317008
12  B 4143 0.000326114
13  B 4144 0.000335220
14  B 4145 0.000344326
15  B 4146 0.000353431
16  B 4147 0.000362537
17  C 4140 0.000371643
18  C 4141 0.000380749
19  C 4142 0.000389854
20  C 4143 0.000398960
21  C 4144 0.000408066
22  C 4145 0.000417172
23  C 4146 0.000426277
24  C 4147 0.000435383

There are many ways to do what you want, some more general than others, but
one trivial way is to add a column that contains 24 numbers ranging from 1
to 6 like this assuming mydf holds the above:

Here is an example of such a vector:

rep(1:(nrow(mydf)/4), each=4)
 [1] 1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4 5 5 5 5 6 6 6 6

So you can add a column like:

> mydf$fours <- rep(1:(nrow(mydf)/4), each=4)
> mydf
   ID Date       Value fours
1   A 4140 0.000207232     1
2   A 4141 0.000240141     1
3   A 4142 0.000271414     1
4   A 4143 0.000258384     1
5   A 4144 0.000243640     2
6   A 4145 0.000271480     2
7   A 4146 0.000280585     2
8   A 4147 0.000289691     2
9   B 4140 0.000298797     3
10  B 4141 0.000307903     3
11  B 4142 0.000317008     3
12  B 4143 0.000326114     3
13  B 4144 0.000335220     4
14  B 4145 0.000344326     4
15  B 4146 0.000353431     4
16  B 4147 0.000362537     4
17  C 4140 0.000371643     5
18  C 4141 0.000380749     5
19  C 4142 0.000389854     5
20  C 4143 0.000398960     5
21  C 4144 0.000408066     6
22  C 4145 0.000417172     6
23  C 4146 0.000426277     6
24  C 4147 0.000435383     6

You now use grouping any way you want to apply a function and in this case
you want a sum. I like to use the tidyverse functions so will show that as
in:

mydf %>%
  group_by(ID, fours) %>%
  summarize(sums=sum(Value), n=n())

I threw in the extra column in case your data sometimes does not have 4 at
the end of a group or beginning of next. Here is the output:

# A tibble: 6 x 4
# Groups:   ID [3]
ID    fours     sums     n
<chr> <int>    <dbl> <int>
  1 A         1 0.000977     4
2 A         2 0.00109      4
3 B         3 0.00125      4
4 B         4 0.00140      4
5 C         5 0.00154      4
6 C         6 0.00169      4

Of course there are all kinds of ways to do this in standard R, including
trivial ones like looping over indices starting at 1 and taking four at a
time and getting the Value data for mydf$Value[N] + mydf$Value[N+1] ...



-----Original Message-----
From: R-help <r-help-bounces using r-project.org> On Behalf Of Miluji Sb
Sent: Sunday, December 19, 2021 1:32 PM
To: r-help mailing list <r-help using r-project.org>
Subject: [R] Sum every n (4) observations by group

Dear all,

I have a dataset (below) by ID and time sequence. I would like to sum every
four observations by ID.

I am confused how to combine the two conditions. Any help will be highly
appreciated. Thank you!

Best.

Milu

## Dataset
structure(list(ID = c("A", "A", "A", "A", "A", "A", "A", "A", "B", "B", "B",
"B", "B", "B", "B", "B", "C", "C", "C", "C", "C", "C", "C", "C"), Date =
c(4140L, 4141L, 4142L, 4143L, 4144L, 4145L, 4146L, 4147L, 4140L, 4141L,
4142L, 4143L, 4144L, 4145L, 4146L, 4147L, 4140L, 4141L, 4142L, 4143L, 4144L,
4145L, 4146L, 4147L ), Value = c(0.000207232, 0.000240141, 0.000271414,
0.000258384, 0.00024364, 0.00027148, 0.000280585, 0.000289691, 0.000298797,
0.000307903, 0.000317008, 0.000326114, 0.00033522, 0.000344326, 0.000353431,
0.000362537, 0.000371643, 0.000380749, 0.000389854, 0.00039896, 0.000408066,
0.000417172, 0.000426277, 0.000435383 )), class = "data.frame", row.names =
c(NA, -24L))

	[[alternative HTML version deleted]]

______________________________________________
R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list