`runner`: an R package for running operations

The package contains standard running (also known as rolling) functions with additional options such as varying window size, lagging, handling of missing values, and windows that depend on dates. `runner` also provides rolling streak and rolling which, which go beyond the range of functions already implemented in other R packages. The package can be used to manipulate and aggregate time series or longitudinal data.

Install the package from GitHub or from CRAN.

```
# devtools::install_github("gogonzo/runner")
install.packages("runner")
```

The `runner` package provides functions applied on running windows. The most universal function is `runner::runner`, which lets the user apply any R function `f` on a running window. In the example below, a 4-month correlation is calculated, lagged by 1 month.

```
library(runner)

# 20 observations spread over one year
x <- data.frame(
  date = seq.Date(Sys.Date(), Sys.Date() + 365, length.out = 20),
  a = rnorm(20),
  b = rnorm(20)
)

# 4-month correlation between a and b, lagged by 1 month
runner(
  x,
  lag = "1 months",
  k = "4 months",
  idx = x$date,
  f = function(x) {
    cor(x$a, x$b)
  }
)
```

There are different kinds of running windows, and all of them are implemented in `runner`.

To illustrate what running windows are, consider windows of length `k = 4`: for each of the 15 elements of a vector, the window contains the current element and up to 3 preceding elements.

`k` denotes the number of elements in a window. If `k` is a single value, the window size is constant for all elements of `x`. For a varying window size, specify `k` as an integer vector with `length(k) == length(x)`, where each element of `k` defines the window length. If `k` is empty, the window is cumulative (like `base::cumsum`). The example below illustrates a window of `k = 4` for the 10th element of vector `x`.

`runner(1:15, k = 4)`
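
The three ways of specifying `k` described above can be compared side by side. This is a minimal sketch, assuming `sum` as the applied function and an arbitrary, illustrative vector of window lengths (`k_vec`):

```r
library(runner)

x <- 1:15

# Constant window: every window holds up to 4 elements.
const_k <- runner(x, k = 4, f = sum)

# Varying window: one window length per element of x (illustrative values).
k_vec <- rep(c(2L, 4L, 6L), times = 5)
vary_k <- runner(x, k = k_vec, f = sum)

# Empty k: the window is cumulative, so the result should match cumsum(x).
cum_k <- runner(x, f = sum)

const_k[10] # window 7:10, sum is 34
vary_k[10]  # k_vec[10] is 2, window 9:10, sum is 19
```

Windows at the start of the vector hold fewer than `k` elements and are still evaluated, which is why `vary_k[3]` sums only `1:3` even though `k_vec[3]` is 6.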

`lag` denotes how many observations the windows are lagged by. If `lag` is a single value, then it is constant for all elements of `x`. For a varying lag, specify `lag` as an integer vector with `length(lag) == length(x)`, where each element of `lag` defines the lag of the corresponding window. The default is `lag = 0`. The example below illustrates a window of `k = 4` lagged by `lag = 2` for the 10th element of vector `x`. The lag can also be negative, which shifts the window forward instead of backward.

```
runner(
1:15,
k = 4,
lag = 2
)
```
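
To make the effect of `lag` concrete, the sketch below compares a positive, a negative, and a varying lag on the same vector. It assumes `sum` as the applied function, and the alternating lag vector `lag_vec` is purely illustrative:

```r
library(runner)

x <- 1:15

# Positive lag: the window for element 10 ends two positions earlier (5:8).
lagged <- runner(x, k = 4, lag = 2, f = sum)

# Negative lag: the window for element 10 is shifted forward (9:12).
leading <- runner(x, k = 4, lag = -2, f = sum)

# Varying lag: one lag per element of x (illustrative values 0, 2, 0, 2, ...).
lag_vec <- rep(c(0L, 2L), length.out = length(x))
vary_lag <- runner(x, k = 4, lag = lag_vec, f = sum)

lagged[10]  # 5 + 6 + 7 + 8 = 26
leading[10] # 9 + 10 + 11 + 12 = 42
```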

Sometimes data points in a dataset are not equally spaced (missing weekends, holidays, other missing observations), so the window size should vary to keep the expected time frame. If the `idx` argument is specified, running functions are applied on windows that depend on dates. `idx` should have the same length as `x` and be of class `Date` or `integer`. `idx` can be combined with a varying window size, in which case `k` denotes the number of periods in the window, different for each data point. The example below illustrates a window of size `k = 5` lagged by `lag = 1`. The range of each window is shown in parentheses.

```
idx <- Sys.Date() + c(4, 6, 7, 13, 17, 18, 18, 21, 27, 31, 37, 42, 44, 47, 48)

runner(
  x = 1:15,
  k = "5 days",
  lag = "1 days",
  idx = idx
)
```

By default, `runner` returns a vector of the same size as `x`, unless a vector of any size is passed to the `at` argument. Each element of `at` is an index at which `runner` evaluates the function. The example below shows the output of `runner` for `at = c(18, 27, 48, 31)`, which gives windows in the ranges enclosed in square brackets. The range for `at = 27` is `[22, 26]`, which contains none of the available indices.

```
idx <- c(4, 6, 7, 13, 17, 18, 18, 21, 27, 31, 37, 42, 44, 47, 48)

runner(
  x = idx,
  k = 5,
  lag = 1,
  idx = idx,
  at = c(18, 27, 48, 31)
)
```
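
The window ranges quoted above follow from `k` and `lag`. Below is a minimal base-R sketch, assuming the window evaluated at index value `at` covers `[at - lag - k + 1, at - lag]`, which reproduces the `[22, 26]` range given for `at = 27`. The helpers `window_range` and `in_window` are hypothetical names used only for illustration; they are not part of `runner` and do not describe its internals.

```r
idx <- c(4, 6, 7, 13, 17, 18, 18, 21, 27, 31, 37, 42, 44, 47, 48)

# Hypothetical helper: closed range covered by the window evaluated at `at`.
window_range <- function(at, k, lag = 0) {
  c(lower = at - lag - k + 1, upper = at - lag)
}

# Hypothetical helper: positions of idx that fall inside that range.
in_window <- function(idx, at, k, lag = 0) {
  r <- window_range(at, k, lag)
  which(idx >= r[["lower"]] & idx <= r[["upper"]])
}

window_range(27, k = 5, lag = 1)   # lower = 22, upper = 26
in_window(idx, 27, k = 5, lag = 1) # integer(0): no index lies in [22, 26]
in_window(idx, 18, k = 5, lag = 1) # positions 4 and 5 (idx values 13 and 17)
```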

`NA` padding

Using `runner`, one can also specify `na_pad = TRUE`, which returns `NA` for any window that is partially out of range, meaning that there are not enough observations to fill the window. By default `na_pad = FALSE`, which means that incomplete windows are calculated anyway. `na_pad` applies both to normal cumulative windows and to windows depending on date. In the example below, two windows exceed the range given by `idx`, so these windows are empty for `na_pad = TRUE`. If the user sets `na_pad = FALSE`, the first window will be empty (no single element within `[-1, 3]`) and the last window will return the elements within the matching `idx`.

```
idx <- c(4, 6, 7, 13, 17, 18, 18, 21, 27, 31, 37, 42, 44, 47, 48)

runner(
  x = idx,
  k = 5,
  lag = 1,
  idx = idx,
  at = c(4, 18, 48, 51),
  na_pad = TRUE
)
```
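
The difference between the two `na_pad` settings is easiest to see on a short vector without `idx`. A minimal sketch, assuming `sum` as the applied function:

```r
library(runner)

x <- 1:5

# Default: incomplete windows at the start are evaluated on what is available.
unpadded <- runner(x, k = 3, f = sum)

# na_pad = TRUE: windows with fewer than k observations yield NA.
padded <- runner(x, k = 3, f = sum, na_pad = TRUE)

unpadded # 1 3 6 9 12
padded   # NA NA 6 9 12
```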

`data.frame`

The user can also pass a `data.frame` to the `x` argument and apply functions that involve multiple columns. In the example below, we calculate the beta parameter of an `lm` model on 1, 2, …, n observations respectively. On the plot, one can observe how the `lm` parameter adapts as the number of observations increases.

```
date <- Sys.Date() + cumsum(sample(1:3, 40, replace = TRUE)) # unequally spaced time series
x <- cumsum(rnorm(40))
y <- 30 * x + rnorm(40)

df <- data.frame(date, y, x)

# slope of y ~ x estimated on each 10-day window
slope <- runner(
  df,
  k = 10,
  idx = "date",
  f = function(x) {
    coefficients(lm(y ~ x, data = x))[2]
  }
)

plot(slope)
abline(h = 30, col = "blue")
```

The `runner` function can also compute windows in parallel mode. The function doesn't initialize the parallel cluster automatically; the user has to create it beforehand and pass it to `runner` through the `cl` argument.

```
library(parallel)

# fork clusters are not available on Windows
numCores <- detectCores()
cl <- makeForkCluster(numCores)

runner(
  x = df,
  k = 10,
  idx = "date",
  f = function(x) sum(x$x),
  cl = cl
)

stopCluster(cl)
```

With `runner`, one can use any R function, but some are optimized for speed. These functions are:

- aggregating functions - `length_run`, `min_run`, `max_run`, `minmax_run`, `sum_run`, `mean_run`, `streak_run`
- utility functions - `fill_run`, `lag_run`, `which_run`
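
As a rough illustration of the built-in functions listed above, the sketch below checks that `sum_run` agrees with the generic `runner(..., f = sum)` call, and shows `streak_run` and `fill_run` on small inputs. The input vectors are made up for the example, and only default arguments other than `k` are used:

```r
library(runner)

x <- c(1, 3, 2, 5, 4, 6)

# Optimized running sum versus the generic runner() call.
fast <- sum_run(x, k = 3)
generic <- runner(x, k = 3, f = sum)
all.equal(as.numeric(fast), as.numeric(generic))

# Length of the current streak of identical consecutive values.
streaks <- streak_run(c("a", "a", "b", "b", "b")) # 1 2 1 2 3

# Replace missing values with the last non-missing value;
# with default arguments the leading NA stays NA.
filled <- fill_run(c(NA, 1, NA, NA, 2)) # NA 1 1 1 2
```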