[R] inefficient for loop, is there a better way?

Wed Dec 13 03:26:23 CET 2017

One way of doing it with data.table. It seems to scale up pretty well.
It takes 4 seconds on my computer with ts <- 1:1e6.

library(data.table)
per <- 7
elev1 <- 0.6
elev2 <- 0.85

ts <- 1:1000

examp <- data.table(ts=ts, stage=sin(ts))
examp[, `:=`(days_abv_0.6_in_last_7  = apply(do.call('cbind',
shift(stage, 1:per)), 1, function(x) sum(x > elev1)),
                   days_abv_0.85_in_last_7 = apply(do.call('cbind',
shift(stage, 1:per)), 1, function(x) sum(x > elev2)))]

On 13 December 2017 at 14:36, Morway, Eric <emorway at usgs.gov> wrote:
> The code below is a small reproducible example of a much larger problem.
> While the script below works, it is really slow on the true dataset with
> many more rows and columns.  I'm hoping to get the same result to examp,
> but with significant time savings.
>
> The example below is setting up a data.frame for an ensuing regression
> analysis.  The purpose of the script below is to appends columns to 'examp'
> that contain values corresponding to the total number of days in the
> previous 7 ('per') above some stage ('elev1' or 'elev2').  Is there a
> faster method that leverages existing R functionality?  I feel like the
> hack below is pretty clunky and can be sped up on the true dataset.  I
> would like to run a more efficient script many times adjusting the value of
> 'per'.
>
> ts <- 1:1000
> examp <- data.frame(ts=ts, stage=sin(ts))
>
> hi1 <- list()
> hi2 <- list()
> per <- 7
> elev1 <- 0.6
> elev2 <- 0.85
> for(i in per:nrow(examp)){
>     examp_per <- examp[seq(i - (per - 1), i, by=1),]
>     stg_hi_cond1 <- subset(examp_per, examp_per$stage > elev1)
>     stg_hi_cond2 <- subset(examp_per, examp_per$stage > elev2)
>
>     hi1 <- c(hi1, nrow(stg_hi_cond1))
>     hi2 <- c(hi2, nrow(stg_hi_cond2))
> }
> examp$days_abv_0.6_in_last_7   <- c(rep(NA, times=per-1), unlist(hi1))
> examp$days_abv_0.85_in_last_7  <- c(rep(NA, times=per-1), unlist(hi2))
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Yvan Richard, PhD
Environmental data scientist

Physical address: Level 4, 158 Victoria St, Te Aro, Wellington, New Zealand
Postal address: PO Box 27535, Wellington 6141, New Zealand
Phone: 022 643 7881