# [R] change frequency of wind data correctly

```
Hi,

Perhaps this might work for you.  It leverages findInterval() and a
simple look-up table of times to do the grouping.  I made it return NA
for the mean when a group has fewer than three observations.

Cheers,
Ben

n <- 144
x <- data.frame(
  datetime = seq(from = as.POSIXct("2018-02-01 00:00:00", tz = "UTC"),
                 by = "10 min",
                 length.out = n),
  vmax = sample(10:50, n, replace = TRUE)
)

# add one second to each break so that 00 sorts with 40, 50, 00
# and the other grouping is 10, 20, 30
lut <- seq(from = x$datetime[1],
           to = x$datetime[n],
           by = "30 min") + 1

x$interval <- findInterval(x$datetime, lut)
x

# mean per interval, NA when fewer than three observations
y <- aggregate(vmax ~ interval, data = x,
               FUN = function(v) {
                 if (length(v) < 3) {
                   r <- NA
                 } else {
                   r <- mean(v)
                 }
                 r
               })
y

>
> Hi Jim.
> I studied and implemented your solution in detail. The idea is great, but after a careful review I came to the conclusion that unfortunately it does not work correctly: for the "am" side (10, 20, 30 minutes) it works well because the hour is exactly the same, while for the "pm" side (40, 50, 00) it does not, because the hour of the 40 and 50 minute readings differs from the hour of the 00 reading (which belongs to the following hour). Am I wrong?
> I tried to fix it while keeping the simple structure of the algorithm, but without success.
>
> Any hint for that?
>
> Stefano
>
> Hi again,
> Didn't realize that the example didn't even span a full day.
>
> ssdf<-read.table(text="date_POSIX time_POSIX vmax
>  2018-02-01 00:00:00 27
>  2018-02-01 00:10:00 41
>  2018-02-01 00:20:00 46
>  2018-02-01 00:30:00 39
>  2018-02-01 00:40:00 34
>  2018-02-01 00:50:00 32
>  2018-02-01 01:00:00 37
>  2018-02-01 01:10:00 31
>  2018-02-01 01:20:00 26
>  2018-02-01 01:30:00 29
>  2018-02-01 01:40:00 24
>  2018-02-01 01:50:00 35",
>  header=TRUE,stringsAsFactors=FALSE)
> # extract the hour
> ssdf$hour<-
>  as.numeric(unlist(lapply(strsplit(ssdf$time_POSIX,":"),"[",1)))
> # extract the minutes
> ssdf$mins<-
>  as.numeric(unlist(lapply(strsplit(ssdf$time_POSIX,":"),"[",2)))
> # create an AM/PM variable
> ssdf$ampm<-ifelse(ssdf$mins > 0 & ssdf$mins <= 30,"am","pm")
> # drop first row
> ssdf<-ssdf[-1,]
> means<-aggregate(vmax~hour+ampm,ssdf,mean)
>
> This does a full day. To do more, add the date_POSIX field to the
> aggregate command. If you have the date and time in one field you'll
> have to split that. That will distinguish the AM/PM means in each day
> as well as hour.
>
> Jim
>
> >
> > Hi Stefano,
> > I read in your date-time as two separate fields for convenience. You
> > can split your single field at the space to get the same result.
> >
> > ssdf<-read.table(text="date_POSIX time_POSIX vmax
> >  2018-02-01 00:00:00 27
> >  2018-02-01 00:10:00 41
> >  2018-02-01 00:20:00 46
> >  2018-02-01 00:30:00 39
> >  2018-02-01 00:40:00 34
> >  2018-02-01 00:50:00 32",
> >  header=TRUE,stringsAsFactors=FALSE)
> > # get the time of day as seconds from the time field
> > ssdf$seconds<-as.numeric(strptime(ssdf$time_POSIX,"%H:%M:%S"))
> > # subtract whatever current date strptime guesses for the date
> > ssdf$seconds<-ssdf$seconds-min(ssdf$seconds)
> > # create an AM/PM variable
> > ssdf$ampm<-ifelse(ssdf$seconds > 0 & ssdf$seconds <= 1800,"am","pm")
> > means<-aggregate(vmax~ampm,ssdf,mean)
> >
> > Jim
> >
> > >
> > > Dear list users,
> > > I have wind data at 10-minute frequency (three years of data). For simplicity let me use only max wind speed.
> > > I need to reduce the frequency to 30 minutes, at 00 (taking the mean of the data at 40, 50 and 00 minutes) and at 30 (taking the mean of the data at 10, 20 and 30 minutes) of each hour.
> > >
> > > The simple code reported here works, but the column "interval" groups data forward, not backward:
> > >
> > > init_day <- as.POSIXct("2018-02-01-00-00", format="%Y-%m-%d-%H-%M", tz="Etc/GMT-1")
> > > fin_day <- as.POSIXct("2018-02-01-02-00", format="%Y-%m-%d-%H-%M", tz="Etc/GMT-1")
> > > mydf <- data.frame(data_POSIX=seq(init_day, fin_day, by="10 mins"))
> > > mydf$vmax <- round(rnorm(13, 35, 10))
> > > mydf$interval <- cut(mydf$data_POSIX, breaks="30 min")
> > > means <- aggregate(vmax ~ interval, mydf, mean)
> > >
> > >     data_POSIX                  vmax  interval
> > > 1  2018-02-01 00:00:00     27     2018-02-01 00:00:00
> > > 2  2018-02-01 00:10:00     41     2018-02-01 00:00:00
> > > 3  2018-02-01 00:20:00     46     2018-02-01 00:00:00
> > > 4  2018-02-01 00:30:00     39     2018-02-01 00:30:00
> > > 5  2018-02-01 00:40:00     34     2018-02-01 00:30:00
> > > 6  2018-02-01 00:50:00     32     2018-02-01 00:30:00
> > > ...
> > >
> > > I need to work with
> > >
> > >     data_POSIX                  vmax  interval
> > > 1  2018-02-01 00:00:00     27     2018-02-01 00:00:00
> > > 2  2018-02-01 00:10:00     41     2018-02-01 00:30:00
> > > 3  2018-02-01 00:20:00     46     2018-02-01 00:30:00
> > > 4  2018-02-01 00:30:00     39     2018-02-01 00:30:00
> > > 5  2018-02-01 00:40:00     34     2018-02-01 00:00:00
> > > 6  2018-02-01 00:50:00     32     2018-02-01 00:00:00
> > > ...
> > >
> > >
> > > Is there a way to modify this code to group data correctly? (I would prefer using only the base package)
> > >
> > > Thank you for your help
> > > Stefano
```
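
The hour-boundary problem raised in the thread (the 00 reading belongs with the 40 and 50 readings of the previous hour) can also be sidestepped by labelling each reading with the end of its 30-minute window. The following is only a minimal base-R sketch, not the method of any poster above. It assumes the timestamps fall exactly on 10-minute marks and that the time zone offset is a whole number of half hours. The data frame `x`, the column `interval_end`, and the random `vmax` values are made up for illustration.

```r
# Hypothetical example data: 10-minute readings over two days (random values).
set.seed(1)
n <- 2 * 144
x <- data.frame(
  datetime = seq(from = as.POSIXct("2018-02-01 00:00:00", tz = "UTC"),
                 by = "10 min", length.out = n),
  vmax = sample(10:50, n, replace = TRUE)
)

# Round each timestamp UP to the next half-hour boundary (1800 seconds).
# 00:10, 00:20, 00:30 -> 00:30; 00:40, 00:50, 01:00 -> 01:00.
secs <- as.numeric(x$datetime)
x$interval_end <- as.POSIXct(ceiling(secs / 1800) * 1800,
                             origin = "1970-01-01", tz = "UTC")

# Mean per window; NA when fewer than three readings are present.
y <- aggregate(vmax ~ interval_end, data = x,
               FUN = function(v) if (length(v) < 3) NA else mean(v))
head(y)
```

Because the label is a full date-time rather than an hour plus an am/pm flag, the same code should carry over unchanged to multi-day or multi-year series. The first and last windows here come out as NA since they hold fewer than three readings.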