# [R] change frequency of wind data correctly

Ben Tupper btupper at bigelow.org
Sun Dec 6 23:51:41 CET 2020

```
Hi,

Perhaps this might work for you.  It uses findInterval() and a
simple look-up table of times to do the grouping.  I made it return NA
for the mean when a group has fewer than three observations.

Cheers,
Ben

n <- 144
x <- data.frame(
  datetime = seq(from = as.POSIXct("2018-02-01 00:00:00", tz = "UTC"),
                 by = "10 min",
                 length.out = n),
  vmax = sample(10:50, n, replace = TRUE)
)

# add one second to each break so that 00 sorts with the preceding
# 40 and 50 (group 40, 50, 00) and the other group is 10, 20, 30
lut <- seq(from = x$datetime[1],
           to = x$datetime[n],
           by = "30 min") + 1

x$interval <- findInterval(x$datetime, lut)
x

y <- aggregate(vmax ~ interval, data = x,
               FUN = function(v) {
                 # v holds the vmax values of one interval
                 if (length(v) < 3) {
                   r <- NA
                 } else {
                   r <- mean(v)
                 }
                 r
               })
y

On Sun, Dec 6, 2020 at 1:59 PM Stefano Sofia
<stefano.sofia using regione.marche.it> wrote:
>
> Hi Jim.
> I studied and implemented your solution in detail. The idea is great, but after a careful review I came to the conclusion that unfortunately it does not work correctly: for the "am" side (10, 20, 30 minutes) it works well because the hour is exactly the same, while for the "pm" side (40, 50, 00) it does not, because the hour of the 40 and 50 minute observations differs from the hour of the 00 observation (which belongs to the following hour). Am I wrong?
> I tried to fix it while keeping the simple structure of the algorithm, but with no success.
>
> Any hint for that?
>
> Stefano
>
>
>          (oo)
> --oOO--( )--OOo----------------
> Stefano Sofia PhD
> Civil Protection - Marche Region
> Meteo Section
> Snow Section
> Via del Colle Ameno 5
> 60126 Torrette di Ancona, Ancona
> Uff: 071 806 7743
> E-mail: stefano.sofia using regione.marche.it
> ---Oo---------oO----------------
>
> ________________________________________
> From: Jim Lemon [drjimlemon using gmail.com]
> Sent: Thursday, 3 December 2020 4:41
> To: Stefano Sofia
> Cc: r-help mailing list
> Subject: Re: [R] change frequency of wind data correctly
>
> Hi again,
> Didn't realize that the example didn't even span a full day.
>
> ssdf<-read.table(text="date_POSIX time_POSIX vmax
>  2018-02-01 00:00:00 27
>  2018-02-01 00:10:00 41
>  2018-02-01 00:20:00 46
>  2018-02-01 00:30:00 39
>  2018-02-01 00:40:00 34
>  2018-02-01 00:50:00 32
>  2018-02-01 01:00:00 37
>  2018-02-01 01:10:00 31
>  2018-02-01 01:20:00 26
>  2018-02-01 01:30:00 29
>  2018-02-01 01:40:00 24
>  2018-02-01 01:50:00 35",
>  header=TRUE,stringsAsFactors=FALSE)
> # extract the hour
> ssdf$hour<-
>  as.numeric(unlist(lapply(strsplit(ssdf$time_POSIX,":"),"[",1)))
> # extract the minutes from the time field
> ssdf$mins<-
>  as.numeric(unlist(lapply(strsplit(ssdf$time_POSIX,":"),"[",2)))
> # create an AM/PM variable
> ssdf$ampm<-ifelse(ssdf$mins > 0 & ssdf$mins <= 30,"am","pm")
> # drop first row
> ssdf<-ssdf[-1,]
> means<-aggregate(vmax~hour+ampm,ssdf,mean)
>
> This does a full day. To do more, add the date_POSIX field to the
> aggregate command. If you have the date and time in one field you'll
> have to split that. That will distinguish the AM/PM means in each day
> as well as hour.
>
> Jim
>
> On Thu, Dec 3, 2020 at 2:10 PM Jim Lemon <drjimlemon using gmail.com> wrote:
> >
> > Hi Stefano,
> > I read in your date-time as two separate fields for convenience. You
> > can split your single field at the space to get the same result.
> >
> > ssdf<-read.table(text="date_POSIX time_POSIX vmax
> >  2018-02-01 00:00:00 27
> >  2018-02-01 00:10:00 41
> >  2018-02-01 00:20:00 46
> >  2018-02-01 00:30:00 39
> >  2018-02-01 00:40:00 34
> >  2018-02-01 00:50:00 32",
> >  header=TRUE,stringsAsFactors=FALSE)
> > # get the time of day as seconds from the time field
> > ssdf$seconds<-as.numeric(strptime(ssdf$time_POSIX,"%H:%M:%S"))
> > # subtract whatever current date strptime guesses for the date
> > ssdf$seconds<-ssdf$seconds-min(ssdf$seconds)
> > # create an AM/PM variable
> > ssdf$ampm<-ifelse(ssdf$seconds > 0 & ssdf$seconds <= 1800,"am","pm")
> > means<-aggregate(vmax~ampm,ssdf,mean)
> >
> > Jim
> >
> > On Thu, Dec 3, 2020 at 4:55 AM Stefano Sofia
> > <stefano.sofia using regione.marche.it> wrote:
> > >
> > > Dear list users,
> > > I have three years of wind data at a 10-minute frequency. For simplicity let me use only max wind speed.
> > > I need to reduce the frequency to 30 minutes: at 00 (taking the mean of the data at 40, 50 and 00 minutes) and at 30 (taking the mean of the data at 10, 20 and 30 minutes) of each hour.
> > >
> > > The simple code reported here runs, but the column "interval" groups data forward, not backward:
> > >
> > > init_day <- as.POSIXct("2018-02-01-00-00", format="%Y-%m-%d-%H-%M", tz="Etc/GMT-1")
> > > fin_day <- as.POSIXct("2018-02-01-02-00", format="%Y-%m-%d-%H-%M", tz="Etc/GMT-1")
> > > mydf <- data.frame(data_POSIX=seq(init_day, fin_day, by="10 mins"))
> > > mydf$vmax <- round(rnorm(13, 35, 10))
> > > mydf$interval <- cut(mydf$data_POSIX, breaks="30 min")
> > > means <- aggregate(vmax ~ interval, mydf, mean)
> > >
> > >     data_POSIX                  vmax  interval
> > > 1  2018-02-01 00:00:00     27     2018-02-01 00:00:00
> > > 2  2018-02-01 00:10:00     41     2018-02-01 00:00:00
> > > 3  2018-02-01 00:20:00     46     2018-02-01 00:00:00
> > > 4  2018-02-01 00:30:00     39     2018-02-01 00:30:00
> > > 5  2018-02-01 00:40:00     34     2018-02-01 00:30:00
> > > 6  2018-02-01 00:50:00     32     2018-02-01 00:30:00
> > > ...
> > >
> > > I should work with
> > >
> > >     data_POSIX                  vmax  interval
> > > 1  2018-02-01 00:00:00     27     2018-02-01 00:00:00
> > > 2  2018-02-01 00:10:00     41     2018-02-01 00:30:00
> > > 3  2018-02-01 00:20:00     46     2018-02-01 00:30:00
> > > 4  2018-02-01 00:30:00     39     2018-02-01 00:30:00
> > > 5  2018-02-01 00:40:00     34     2018-02-01 00:00:00
> > > 6  2018-02-01 00:50:00     32     2018-02-01 00:00:00
> > > ...
> > >
> > >
> > > Is there a way to modify this code to group the data correctly? (I would prefer using only the base package)
> > >
> > > Thank you for your help
> > > Stefano
> > >
> > >
> > > ________________________________
> > >
> > > IMPORTANT NOTICE: This e-mail message is intended to be received only by persons entitled to receive the confidential information it may contain. E-mail messages to clients of Regione Marche may contain information that is confidential and legally privileged. Please do not read, copy, forward, or store this message unless you are an intended recipient of it. If you have received this message in error, please forward it to the sender and delete it completely from your computer system. Pursuant to art. 6 of DGR n. 1394/2008, please note that, in cases of necessity and urgency, the reply to this e-mail message may be viewed by persons other than the recipient.
> > > ______________________________________________
> > > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > >  https://urlsand.esvalabs.com/?u=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-help&e=52342f8a&h=d46bc785&f=y&p=y
> > > and provide commented, minimal, self-contained, reproducible code.

--
Ben Tupper
Bigelow Laboratory for Ocean Science
East Boothbay, Maine
http://www.bigelow.org/
https://eco.bigelow.org

```
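The backward grouping asked for in the thread (10, 20, 30 labelled with :30; 40, 50, 00 labelled with the following :00) can also be sketched with plain arithmetic on the timestamps. This is not the method posted above, only an illustrative base-R alternative: each time is rounded up to the next multiple of 1800 seconds. The name `secs` and the seeded example data are assumptions made for the sketch, and the approach assumes a time zone whose UTC offset is a whole multiple of 30 minutes, so that multiples of 1800 seconds fall on :00 and :30.

```r
# Hypothetical example data mirroring the thread: 10-minute observations.
set.seed(1)
init_day <- as.POSIXct("2018-02-01 00:00", tz = "Etc/GMT-1")
fin_day  <- as.POSIXct("2018-02-01 02:00", tz = "Etc/GMT-1")
mydf <- data.frame(data_POSIX = seq(init_day, fin_day, by = "10 min"))
mydf$vmax <- round(rnorm(nrow(mydf), 35, 10))

# Round each timestamp UP to the next multiple of 1800 s (30 min).
# Times already on :00 or :30 are left unchanged, so they close their group.
secs <- as.numeric(mydf$data_POSIX)
mydf$interval <- as.POSIXct(ceiling(secs / 1800) * 1800,
                            origin = "1970-01-01", tz = "Etc/GMT-1")

# Mean per interval; NA when a group has fewer than three observations.
means <- aggregate(vmax ~ interval, data = mydf,
                   FUN = function(v) if (length(v) < 3) NA else mean(v))
means
```

Here the `interval` column carries the closing time of each 30-minute window, so it can be used directly as the timestamp of the reduced series. The first window contains only the 00:00 observation and therefore yields NA.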