[R] Extract time and state of charge (Start and End) and Count

Rui Barradas ru|pb@rr@d@@ @end|ng |rom @@po@pt
Tue Jul 19 19:03:10 CEST 2022


Hello,

There was a bug in the way I copied&pasted your data to my R session, 
hence the NA's.

Here is a tidyverse way of doing what you want. Its output matches the 
expected output in your last post. The column names don't start at zero 
because there was no Starting_SoC_of_12 equal to 0.


library(tidyverse)

result <- dt_2014 %>%
   mutate(Hour = lubridate::hour(BatteryChargeStartDate)) %>%
   group_by(Hour, Starting_SoC_of_12, Ending_SoC_of_12) %>%
   mutate(Count = n()) %>%
   ungroup() %>%
   arrange(Hour, Starting_SoC_of_12, Ending_SoC_of_12) %>%
   pivot_wider(
     id_cols = Hour,
     names_from = c(Starting_SoC_of_12, Ending_SoC_of_12),
     names_sep = "-",
     values_from = Count,
     values_fill = 0L
   )

i <- str_order(names(result)[-1], numeric = TRUE)
result <- cbind(result[1], result[-1][i])
result
#  Hour 1-11 2-10 2-11 4-4 4-12 5-8 8-12
#1    7    0    0    0   0    0   1    0
#2    8    0    0    0   0    1   0    1
#3   15    0    0    0   1    0   0    0
#4   16    1    1    0   0    0   0    0
#5   18    0    0    1   0    0   0    1
#6   21    0    0    0   0    1   0    1


Hope this helps,

Rui Barradas


Às 16:10 de 19/07/2022, roslinazairimah zakaria escreveu:
> Hi Rui,
> I try to run your code, but all data became NA. Not sure why...
> # these columns need to be fixed
> cols <- c("BatteryChargeStartDate", "BatteryChargeStopDate")
> dt[cols] <- lapply(dt[cols], \(x) sub("\n", " ", x))
> 
> # use package lubridate to coerce to a datetime class
> library(lubridate)
> 
> dt <- lapply(dt, lubridate::dmy_hm)
> dt
> dt[cols] <- lapply(dt[cols], lubridate::dmy_hm)
> 
> h <- lubridate::hour(dt[["BatteryChargeStartDate"]])
> aggregate(Starting_SoC_of_12 ~ h, dt, length)
> 
> 
> $BCStartTime
>     [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> NA NA NA NA NA NA NA NA NA
>    [33] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> NA NA NA NA NA NA NA NA NA
>    [65] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> NA NA NA NA NA NA NA NA NA
>    [97] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> NA NA NA NA NA NA NA NA NA
>   [129] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> NA NA NA NA NA NA NA NA NA
>   [161] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> NA NA NA NA NA NA NA NA NA
>   [193] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> NA NA NA NA NA NA NA NA NA
>   [225] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> NA NA NA NA NA NA NA NA NA
>   [257] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> NA NA NA NA NA NA NA NA NA
>   [289] NA NA NA NA NA NA NA NA
> 
>   [929] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> NA NA NA NA NA NA NA NA NA
>   [961] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> NA NA NA NA NA NA NA NA NA
>   [993] NA NA NA NA NA NA NA NA
>   [ reached 'max' / getOption("max.print") -- omitted 34418 entries ]
> 
>> dt[cols] <- lapply(dt[cols], lubridate::dmy_hm)
> Warning messages:
> 1: All formats failed to parse. No formats found.
> 2: All formats failed to parse. No formats found.
>>
>> h <- lubridate::hour(dt[["BatteryChargeStartDate"]])
>> aggregate(Starting_SoC_of_12 ~ h, dt, length)
> Error in aggregate.data.frame(lhs, mf[-1L], FUN = FUN, ...) :
>    no rows to aggregate
> 
> On Tue, Jul 19, 2022 at 4:53 PM roslinazairimah zakaria <
> roslinaump using gmail.com> wrote:
> 
>> Hi Rui,
>>
>> Yes, I would like to count for each hour, how many in the state of charge
>> start 0 and SOC 12, then SOC 1 and SOC 12 and so on.
>>
>> Thank you for your help.
>>
>> On Tue, Jul 19, 2022 at 1:11 AM Rui Barradas <ruipbarradas using sapo.pt> wrote:
>>
>>> Hello,
>>>
>>> I'm not sure I understand the problem. Do you want counts of how many
>>> rows are there per hour?
>>>
>>>
>>> # these columns need to be fixed
>>> cols <- c("BatteryChargeStartDate", "BatteryChargeStopDate")
>>> dt_2014[cols] <- lapply(dt_2014[cols], \(x) sub("\n", " ", x))
>>> # use package lubridate to coerce to a datetime class
>>> dt_2014[cols] <- lapply(dt_2014[cols], lubridate::dmy_hm)
>>>
>>> h <- lubridate::hour(dt_2014[["BatteryChargeStartDate"]])
>>> aggregate(Starting_SoC_of_12 ~ h, dt_2014, length)
>>>
>>>
>>>
>>> It would be better if you post the expected output corresponding to the
>>> posted data set.
>>>
>>> Hope this helps,
>>>
>>> Rui Barradas
>>>
>>> Às 05:04 de 18/07/2022, roslinazairimah zakaria escreveu:
>>>> Dear all,
>>>>
>>>> I have data of Battery Electric vehicle (BEV). I would like to extract
>>> data
>>>> from every hour starting from 0.00 to 0.59, 1:00-1:59 for SOC(state of
>>>> charge) start to end.
>>>>
>>>> Some examples:
>>>> I can extract data from SOC=0 and SOC=12
>>>> dt_2014[which(dt_2014$Starting_SoC_of_12==0 &
>>>> dt_2014$Ending_SoC_of_12==12),]
>>>>
>>>> I can extract data from SOC=1 and SOC=12
>>>> dt_2014[which(dt_2014$Starting_SoC_of_12==1 &
>>>> dt_2014$Ending_SoC_of_12==12),]
>>>>
>>>> and I would like to further categorise the data by hour and count how
>>> many
>>>> cars from 0 state charge to 12 state charge at in that particular hour.
>>>>
>>>> Thank you so much for any help given.
>>>>
>>>> Some data
>>>>> dput(dt_2014[1:10,])
>>>> structure(list(ï..CarID = c("GC10", "GC10", "GC10", "GC10", "GC10",
>>>> "GC10", "GC10", "GC10", "GC10", "GC10"), BatteryChargeStartDate =
>>>> c("16/2/2014 16:05",
>>>> "16/2/2014 18:20", "17/2/2014 8:10", "18/2/2014 7:41", "18/2/2014
>>> 15:36",
>>>> "18/2/2014 16:36", "18/2/2014 21:26", "19/2/2014 8:57", "19/2/2014
>>> 21:08",
>>>> "20/2/2014 18:11"), BCStartTime = c("16:05", "18:20", "8:10",
>>>> "7:41", "15:36", "16:36", "21:26", "8:57", "21:08", "18:11"),
>>>>       Year = c(2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
>>>>       2014L, 2014L, 2014L), Month = c(2L, 2L, 2L, 2L, 2L, 2L, 2L,
>>>>       2L, 2L, 2L), Day = c(16L, 16L, 17L, 18L, 18L, 18L, 18L, 19L,
>>>>       19L, 20L), BatteryChargeStopDate = c("16/2/2014 17:05", "16/2/2014
>>>> 19:00",
>>>>       "17/2/2014 15:57", "18/2/2014 9:52", "18/2/2014 15:39", "18/2/2014
>>>> 17:36",
>>>>       "19/2/2014 1:55", "19/2/2014 14:25", "20/2/2014 5:17", "20/2/2014
>>> 23:20"
>>>>       ), BCStopTime = c("17:05", "19:00", "15:57", "9:52", "15:39",
>>>>       "17:36", "1:55", "14:25", "5:17", "23:20"), Year2 = c(2014L,
>>>>       2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L
>>>>       ), Month2 = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), Day2 =
>>> c(16L,
>>>>       16L, 17L, 18L, 18L, 18L, 19L, 19L, 20L, 20L), Starting_SoC_of_12 =
>>>> c(1L,
>>>>       2L, 4L, 5L, 4L, 2L, 8L, 8L, 4L, 8L), Ending_SoC_of_12 = c(11L,
>>>>       11L, 12L, 8L, 4L, 10L, 12L, 12L, 12L, 12L)), row.names = c(NA,
>>>> 10L), class = "data.frame")
>>>>
>>>>
>>>
>>
>>
>> --
>> *Roslinazairimah Zakaria*
>> *Tel: +609-5492370; Fax. No.+609-5492766*
>>
>> *Email: roslinazairimah using ump.edu.my <roslinazairimah using ump.edu.my>;
>> roslinaump using gmail.com <roslinaump using gmail.com>*
>> Faculty of Industrial Sciences & Technology
>> University Malaysia Pahang
>> Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia
>>
> 
>



More information about the R-help mailing list