[R] Sum data according to date in sequence

roslinazairimah zakaria ro@||n@ump @end|ng |rom gm@||@com
Sun Nov 5 00:05:40 CET 2023


Hi all,
Thank you very much.
I learn a lot from your suggested solution.



On Sun, Nov 5, 2023 at 12:56 AM Rui Barradas <ruipbarradas using sapo.pt> wrote:

> Às 01:49 de 03/11/2023, roslinazairimah zakaria escreveu:
> > Hi all,
> >
> > This is the data:
> >
> >> dput(head(dt1,20))structure(list(StationName = c("PALO ALTO CA /
> CAMBRIDGE #1",
> > "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
> > "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
> > "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
> > "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
> > "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
> > "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
> > "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
> > "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
> > "PALO ALTO CA / CAMBRIDGE #1", "PALO ALTO CA / CAMBRIDGE #1",
> > "PALO ALTO CA / CAMBRIDGE #1"), date = c("1/14/2016", "1/14/2016",
> > "1/14/2016", "1/15/2016", "1/15/2016", "1/15/2016", "1/15/2016",
> > "1/16/2016", "1/16/2016", "1/16/2016", "1/16/2016", "1/16/2016",
> > "1/16/2016", "1/16/2016", "1/17/2016", "1/17/2016", "1/17/2016",
> > "1/17/2016", "1/17/2016", "1/18/2016"), time = c("12:09", "19:50",
> > "20:22", "8:25", "14:23", "18:17", "21:46", "10:19", "12:12",
> > "14:12", "16:22", "19:16", "19:19", "20:24", "9:54", "12:16",
> > "13:53", "19:03", "22:00", "8:58"), EnergykWh = c(4.680496, 6.272414,
> > 1.032782, 11.004884, 10.096824, 6.658797, 4.808874, 1.469384,
> > 2.996239, 0.303222, 4.988339, 8.131804, 0.117156, 3.285669, 1.175608,
> > 3.677487, 1.068393, 8.820755, 8.138583, 9.0575)), row.names = c(NA,
> > 20L), class = "data.frame")
> >
> >
> > I would like to sum EnergykW data by the date. E.g. all values for
> > EnergykWh on 1/14/2016
> >
> >
> > On Fri, Nov 3, 2023 at 8:10 AM jim holtman <jholtman using gmail.com> wrote:
> >
> >> How about send a 'dput' of some sample data.  My guess is that your date
> >> is 'character' and not 'Date'.
> >>
> >> Thanks
> >>
> >> Jim Holtman
> >> *Data Munger Guru*
> >>
> >>
> >> *What is the problem that you are trying to solve?Tell me what you want
> to
> >> do, not how you want to do it.*
> >>
> >>
> >> On Thu, Nov 2, 2023 at 4:24 PM roslinazairimah zakaria <
> >> roslinaump using gmail.com> wrote:
> >>
> >>> Dear all,
> >>>
> >>> I have this set of data. I would like to sum the EnergykWh according
> date
> >>> sequences.
> >>>
> >>>> head(dt1,20)                   StationName      date  time EnergykWh
> >>> 1  PALO ALTO CA / CAMBRIDGE #1 1/14/2016 12:09  4.680496
> >>> 2  PALO ALTO CA / CAMBRIDGE #1 1/14/2016 19:50  6.272414
> >>> 3  PALO ALTO CA / CAMBRIDGE #1 1/14/2016 20:22  1.032782
> >>> 4  PALO ALTO CA / CAMBRIDGE #1 1/15/2016  8:25 11.004884
> >>> 5  PALO ALTO CA / CAMBRIDGE #1 1/15/2016 14:23 10.096824
> >>> 6  PALO ALTO CA / CAMBRIDGE #1 1/15/2016 18:17  6.658797
> >>> 7  PALO ALTO CA / CAMBRIDGE #1 1/15/2016 21:46  4.808874
> >>> 8  PALO ALTO CA / CAMBRIDGE #1 1/16/2016 10:19  1.469384
> >>> 9  PALO ALTO CA / CAMBRIDGE #1 1/16/2016 12:12  2.996239
> >>> 10 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 14:12  0.303222
> >>> 11 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 16:22  4.988339
> >>> 12 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 19:16  8.131804
> >>> 13 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 19:19  0.117156
> >>> 14 PALO ALTO CA / CAMBRIDGE #1 1/16/2016 20:24  3.285669
> >>> 15 PALO ALTO CA / CAMBRIDGE #1 1/17/2016  9:54  1.175608
> >>> 16 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 12:16  3.677487
> >>> 17 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 13:53  1.068393
> >>> 18 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 19:03  8.820755
> >>> 19 PALO ALTO CA / CAMBRIDGE #1 1/17/2016 22:00  8.138583
> >>> 20 PALO ALTO CA / CAMBRIDGE #1 1/18/2016  8:58  9.057500
> >>>
> >>> I have tried this:
> >>> library(dplyr)
> >>> sums <- dt1 %>%
> >>>    group_by(date) %>%
> >>>    summarise(EnergykWh = sum(EnergykWh))
> >>>
> >>> head(sums,20)
> >>>
> >>> The date is not by daily sequence but by year sequence.
> >>>
> >>>> head(sums,20)# A tibble: 20 × 2
> >>>     date      EnergykWh
> >>>     <chr>         <dbl> 1 1/1/2017     25.3   2 1/1/2018     61.0   3
> >>> 1/1/2019      0.627 4 1/1/2020     10.7   5 1/10/2017    69.4   6
> >>> 1/10/2018    54.5   7 1/10/2019    49.1   8 1/10/2020    45.9   9
> >>> 1/11/2017    73.9  10 1/11/2018    53.3  11 1/11/2019    93.5  12
> >>> 1/11/2020    66.7  13 1/12/2017    78.6  14 1/12/2018    42.2  15
> >>> 1/12/2019    22.7  16 1/12/2020    80.9  17 1/13/2017    85.6  18
> >>> 1/13/2018    46.4  19 1/13/2019    40.0  20 1/13/2020   121.
> >>>
> >>>
> >>>
> >>> Thank you very much for any help given.
> >>>
> >>>
> >>> --
> >>> *Roslinazairimah Zakaria*
> >>> *Tel: +609-5492370; Fax. No.+609-5492766*
> >>>
> >>> *Email: roslinazairimah using ump.edu.my <roslinazairimah using ump.edu.my>;
> >>> roslinaump using gmail.com <roslinaump using gmail.com>*
> >>> Faculty of Industrial Sciences & Technology
> >>> University Malaysia Pahang
> >>> Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia
> >>>
> >>>          [[alternative HTML version deleted]]
> >>>
> >>> ______________________________________________
> >>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide
> >>> http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> >>>
> >>
> >
> Hello,
>
> Here are two solutions.
>
> 1. Base R
>
> Though I don't coerce the date column to class "Date", it seems to work.
>
>
> aggregate(EnergykWh ~ date, dt1, sum)
> #>        date EnergykWh
> #> 1 1/14/2016  11.98569
> #> 2 1/15/2016  32.56938
> #> 3 1/16/2016  21.29181
> #> 4 1/17/2016  22.88083
> #> 5 1/18/2016   9.05750
>
>
> 2. Package dplyr.
> First column date is coerced from class "character" to class "Date".
> Then the grouped sums are computed.
>
>
> suppressPackageStartupMessages(
>    library(dplyr)
> )
>
> dt1 %>%
>    mutate(date = as.Date(date, "%m/%d/%Y")) %>%
>    summarise(EnergykWh = sum(EnergykWh), .by = date)
> #>         date EnergykWh
> #> 1 2016-01-14  11.98569
> #> 2 2016-01-15  32.56938
> #> 3 2016-01-16  21.29181
> #> 4 2016-01-17  22.88083
> #> 5 2016-01-18   9.05750
>
>
> As you can see, the results are the same.
>
> Also, this exact problem is one of the most asked on StackOverflow.
> Maybe you could try searching there for a solution. My code above is
> also exactly the code in [1], though I had already this answer written.
> I only checked after :(.
>
>
> [1]
>
> https://stackoverflow.com/questions/61548758/r-how-sum-values-by-group-by-date
>
>
> Hope this helps,
>
> Rui Barradas
>
>
>
> --
> Este e-mail foi analisado pelo software antivírus AVG para verificar a
> presença de vírus.
> www.avg.com
>


-- 
*Roslinazairimah Zakaria*
*Tel: +609-5492370; Fax. No.+609-5492766*

*Email: roslinazairimah using ump.edu.my <roslinazairimah using ump.edu.my>;
roslinaump using gmail.com <roslinaump using gmail.com>*
Faculty of Industrial Sciences & Technology
University Malaysia Pahang
Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia

	[[alternative HTML version deleted]]



More information about the R-help mailing list