[R] Extract time and state of charge (Start and End) and Count
Rui Barradas
ru|pb@rr@d@@ @end|ng |rom @@po@pt
Tue Jul 19 19:03:10 CEST 2022
Hello,
There was a bug in the way I copied&pasted your data to my R session,
hence the NA's.
Here is a tidyverse way of doing what you want. Its output matches the
expected output in your last post. The column names don't start at zero
because there was no Starting_SoC_of_12 equal to 0.
library(tidyverse)
result <- dt_2014 %>%
mutate(Hour = lubridate::hour(BatteryChargeStartDate)) %>%
group_by(Hour, Starting_SoC_of_12, Ending_SoC_of_12) %>%
mutate(Count = n()) %>%
ungroup() %>%
arrange(Hour, Starting_SoC_of_12, Ending_SoC_of_12) %>%
pivot_wider(
id_cols = Hour,
names_from = c(Starting_SoC_of_12, Ending_SoC_of_12),
names_sep = "-",
values_from = Count,
values_fill = 0L
)
i <- str_order(names(result)[-1], numeric = TRUE)
result <- cbind(result[1], result[-1][i])
result
# Hour 1-11 2-10 2-11 4-4 4-12 5-8 8-12
#1 7 0 0 0 0 0 1 0
#2 8 0 0 0 0 1 0 1
#3 15 0 0 0 1 0 0 0
#4 16 1 1 0 0 0 0 0
#5 18 0 0 1 0 0 0 1
#6 21 0 0 0 0 1 0 1
Hope this helps,
Rui Barradas
Às 16:10 de 19/07/2022, roslinazairimah zakaria escreveu:
> Hi Rui,
> I try to run your code, but all data became NA. Not sure why...
> # these columns need to be fixed
> cols <- c("BatteryChargeStartDate", "BatteryChargeStopDate")
> dt[cols] <- lapply(dt[cols], \(x) sub("\n", " ", x))
>
> # use package lubridate to coerce to a datetime class
> library(lubridate)
>
> dt <- lapply(dt, lubridate::dmy_hm)
> dt
> dt[cols] <- lapply(dt[cols], lubridate::dmy_hm)
>
> h <- lubridate::hour(dt[["BatteryChargeStartDate"]])
> aggregate(Starting_SoC_of_12 ~ h, dt, length)
>
>
> $BCStartTime
> [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> NA NA NA NA NA NA NA NA NA
> [33] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> NA NA NA NA NA NA NA NA NA
> [65] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> NA NA NA NA NA NA NA NA NA
> [97] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> NA NA NA NA NA NA NA NA NA
> [129] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> NA NA NA NA NA NA NA NA NA
> [161] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> NA NA NA NA NA NA NA NA NA
> [193] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> NA NA NA NA NA NA NA NA NA
> [225] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> NA NA NA NA NA NA NA NA NA
> [257] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> NA NA NA NA NA NA NA NA NA
> [289] NA NA NA NA NA NA NA NA
>
> [929] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> NA NA NA NA NA NA NA NA NA
> [961] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
> NA NA NA NA NA NA NA NA NA
> [993] NA NA NA NA NA NA NA NA
> [ reached 'max' / getOption("max.print") -- omitted 34418 entries ]
>
>> dt[cols] <- lapply(dt[cols], lubridate::dmy_hm)
> Warning messages:
> 1: All formats failed to parse. No formats found.
> 2: All formats failed to parse. No formats found.
>>
>> h <- lubridate::hour(dt[["BatteryChargeStartDate"]])
>> aggregate(Starting_SoC_of_12 ~ h, dt, length)
> Error in aggregate.data.frame(lhs, mf[-1L], FUN = FUN, ...) :
> no rows to aggregate
>
> On Tue, Jul 19, 2022 at 4:53 PM roslinazairimah zakaria <
> roslinaump using gmail.com> wrote:
>
>> Hi Rui,
>>
>> Yes, I would like to count for each hour, how many in the state of charge
>> start 0 and SOC 12, then SOC 1 and SOC 12 and so on.
>>
>> Thank you for your help.
>>
>> On Tue, Jul 19, 2022 at 1:11 AM Rui Barradas <ruipbarradas using sapo.pt> wrote:
>>
>>> Hello,
>>>
>>> I'm not sure I understand the problem. Do you want counts of how many
>>> rows are there per hour?
>>>
>>>
>>> # these columns need to be fixed
>>> cols <- c("BatteryChargeStartDate", "BatteryChargeStopDate")
>>> dt_2014[cols] <- lapply(dt_2014[cols], \(x) sub("\n", " ", x))
>>> # use package lubridate to coerce to a datetime class
>>> dt_2014[cols] <- lapply(dt_2014[cols], lubridate::dmy_hm)
>>>
>>> h <- lubridate::hour(dt_2014[["BatteryChargeStartDate"]])
>>> aggregate(Starting_SoC_of_12 ~ h, dt_2014, length)
>>>
>>>
>>>
>>> It would be better if you post the expected output corresponding to the
>>> posted data set.
>>>
>>> Hope this helps,
>>>
>>> Rui Barradas
>>>
>>> Às 05:04 de 18/07/2022, roslinazairimah zakaria escreveu:
>>>> Dear all,
>>>>
>>>> I have data of Battery Electric vehicle (BEV). I would like to extract
>>> data
>>>> from every hour starting from 0.00 to 0.59, 1:00-1:59 for SOC(state of
>>>> charge) start to end.
>>>>
>>>> Some examples:
>>>> I can extract data from SOC=0 and SOC=12
>>>> dt_2014[which(dt_2014$Starting_SoC_of_12==0 &
>>>> dt_2014$Ending_SoC_of_12==12),]
>>>>
>>>> I can extract data from SOC=1 and SOC=12
>>>> dt_2014[which(dt_2014$Starting_SoC_of_12==1 &
>>>> dt_2014$Ending_SoC_of_12==12),]
>>>>
>>>> and I would like to further categorise the data by hour and count how
>>> many
>>>> cars from 0 state charge to 12 state charge at in that particular hour.
>>>>
>>>> Thank you so much for any help given.
>>>>
>>>> Some data
>>>>> dput(dt_2014[1:10,])
>>>> structure(list(ï..CarID = c("GC10", "GC10", "GC10", "GC10", "GC10",
>>>> "GC10", "GC10", "GC10", "GC10", "GC10"), BatteryChargeStartDate =
>>>> c("16/2/2014 16:05",
>>>> "16/2/2014 18:20", "17/2/2014 8:10", "18/2/2014 7:41", "18/2/2014
>>> 15:36",
>>>> "18/2/2014 16:36", "18/2/2014 21:26", "19/2/2014 8:57", "19/2/2014
>>> 21:08",
>>>> "20/2/2014 18:11"), BCStartTime = c("16:05", "18:20", "8:10",
>>>> "7:41", "15:36", "16:36", "21:26", "8:57", "21:08", "18:11"),
>>>> Year = c(2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L,
>>>> 2014L, 2014L, 2014L), Month = c(2L, 2L, 2L, 2L, 2L, 2L, 2L,
>>>> 2L, 2L, 2L), Day = c(16L, 16L, 17L, 18L, 18L, 18L, 18L, 19L,
>>>> 19L, 20L), BatteryChargeStopDate = c("16/2/2014 17:05", "16/2/2014
>>>> 19:00",
>>>> "17/2/2014 15:57", "18/2/2014 9:52", "18/2/2014 15:39", "18/2/2014
>>>> 17:36",
>>>> "19/2/2014 1:55", "19/2/2014 14:25", "20/2/2014 5:17", "20/2/2014
>>> 23:20"
>>>> ), BCStopTime = c("17:05", "19:00", "15:57", "9:52", "15:39",
>>>> "17:36", "1:55", "14:25", "5:17", "23:20"), Year2 = c(2014L,
>>>> 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L, 2014L
>>>> ), Month2 = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), Day2 =
>>> c(16L,
>>>> 16L, 17L, 18L, 18L, 18L, 19L, 19L, 20L, 20L), Starting_SoC_of_12 =
>>>> c(1L,
>>>> 2L, 4L, 5L, 4L, 2L, 8L, 8L, 4L, 8L), Ending_SoC_of_12 = c(11L,
>>>> 11L, 12L, 8L, 4L, 10L, 12L, 12L, 12L, 12L)), row.names = c(NA,
>>>> 10L), class = "data.frame")
>>>>
>>>>
>>>
>>
>>
>> --
>> *Roslinazairimah Zakaria*
>> *Tel: +609-5492370; Fax. No.+609-5492766*
>>
>> *Email: roslinazairimah using ump.edu.my <roslinazairimah using ump.edu.my>;
>> roslinaump using gmail.com <roslinaump using gmail.com>*
>> Faculty of Industrial Sciences & Technology
>> University Malaysia Pahang
>> Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia
>>
>
>
More information about the R-help
mailing list