[R] How to make the "apply" faster
Debasish Pai Mazumder
pai1981 at gmail.com
Sun Jul 10 22:38:14 CEST 2016
Thanks for your response. It is faster than before but still very slow. Any
other suggestions?
-Deb
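
One direction that avoids apply() entirely, sketched here as an assumption rather than something proposed in the thread: a spell of two or more consecutive time steps at or above the threshold starts exactly where the value is TRUE at t and t+1 but not at t-1, so the spell starts can be counted with whole-array logical operations and rowSums(). The helper name count_spells2 is hypothetical; it assumes time is the third dimension, there are no NAs, and the spell length is fixed at 2.

```r
# Hypothetical helper (not from the thread): count spells of length >= 2
# along the 3rd (time) dimension of a 4-D array without calling apply().
count_spells2 <- function(x, threshold = 70) {
  stopifnot(length(dim(x)) == 4, !anyNA(x))
  d  <- dim(x)
  nt <- d[3]
  if (nt < 2) return(array(0, dim = d[-3]))
  b <- x >= threshold
  cur <- b[, , -nt, , drop = FALSE]   # time steps 1 .. nt-1
  nxt <- b[, , -1,  , drop = FALSE]   # time steps 2 .. nt
  # prev holds the value at t-1, with FALSE padded before the first step
  prev <- b[, , c(1, seq_len(nt - 2)), , drop = FALSE]
  prev[, , 1, ] <- FALSE
  starts <- cur & nxt & !prev         # TRUE where a spell of length >= 2 begins
  # move time to the last dimension and sum over it
  rowSums(aperm(starts, c(1, 2, 4, 3)), dims = 3)
}

# Check against an apply()/rle() reference on a small random array
set.seed(42)
x <- array(runif(3 * 4 * 30 * 2, min = 0, max = 100), dim = c(3, 4, 30, 2))
ref <- apply(x, c(1, 2, 4), function(y) {
  r <- rle(y >= 70)
  sum(r$lengths[r$values] >= 2)
})
vec <- count_spells2(x, 70)
spells_agree <- identical(dim(ref), dim(vec)) && all(ref == vec)
```

Whether this is actually faster depends on the array size and available memory, since it builds several logical copies of the array; it is worth timing with system.time() on the real data.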
On Sun, Jul 10, 2016 at 2:13 PM, William Dunlap <wdunlap at tibco.com> wrote:
> There is no need to test that a logical equals TRUE: 'logicalVector==TRUE'
> is the same as just 'logicalVector'.
>
> There is no need to convert logical vectors to numeric, since rle() works
> on both types.
>
> There is no need to use length(subset(x, logicalVector)) to count how many
> elements in logicalVector are TRUE, just use sum(logicalVector).
>
> There is no need to make a variable, 'ans', then immediately return it.
>
> Hence your
>
> b[b == TRUE] = 1
> y <- rle(b)
> ans <- length(subset(y$lengths[y$values == 1], y$lengths[y$values ==
> 1] >= 2))
> return(ans)
>
> could be replaced by
>
> y <- rle(b)
> sum(y$lengths[y$values] >= 2)
>
> This gives some speedup, mainly for long vectors, but I find it more
> understandable.
> E.g., if f1 is your original function and f2 has the above replacement I
> get:
> > d <- -sin(1:10000+sqrt(1:4))
> > system.time(for(i in 1:10000)f1(d,.3))
> user system elapsed
> 5.19 0.00 5.19
> > system.time(for(i in 1:10000)f2(d,.3))
> user system elapsed
> 3.65 0.00 3.65
> > c(f1(d,.3), f2(d,.3))
> [1] 1492 1492
> > length(d)
> [1] 10000
>
> If it were my function, I would also get rid of the part that deals with
> the threshold and direction of the inequality and tell the user to use
> f(data <= 0.3) instead of f(data, .3, "below"). I would also make the
> spell length an argument instead of fixing it at 2. E.g.
>
> > f3 <- function (condition, spellLength = 2)
>   {
>       stopifnot(is.logical(condition), !anyNA(condition))
>       y <- rle(condition)
>       sum(y$lengths[y$values] >= spellLength)
>   }
> > f3( d >= .3 )
> [1] 1492
>
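
As a sanity check on the rewrite above, the sketch below compares the original spell counter with f3 on the same test vector, for both directions of the inequality. f1 is simply the original fun.spell.deb body laid out on separate lines (using the label from the timings above); the only additions are the two comparison variables, whose names are made up for this example.

```r
# f1: the original function from the thread, one statement per line
f1 <- function(data, threshold = 1, direction = c("above", "below")) {
  if (missing(direction)) direction <- "above"
  if (direction == "below") b <- (data <= threshold) else b <- (data >= threshold)
  b[b == TRUE] <- 1
  y <- rle(b)
  ans <- length(subset(y$lengths[y$values == 1], y$lengths[y$values == 1] >= 2))
  return(ans)
}

# f3: the simplified version, taking a logical condition directly
f3 <- function(condition, spellLength = 2) {
  stopifnot(is.logical(condition), !anyNA(condition))
  y <- rle(condition)
  sum(y$lengths[y$values] >= spellLength)
}

d <- -sin(1:10000 + sqrt(1:4))

# The threshold/direction arguments become part of the condition passed in
same_above <- f1(d, .3) == f3(d >= .3)
same_below <- f1(d, .3, "below") == f3(d <= .3)
```

With f3 the original call would become something like apply(x, c(1,2,4), function(y) f3(y >= 70)), assuming x contains no NAs.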
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Sun, Jul 10, 2016 at 11:58 AM, Debasish Pai Mazumder <pai1981 at gmail.com> wrote:
>
>> Hi Everyone,
>> Thanks for your help. It works. I have a similar problem when I am
>> calculating the number of spells (definition: a period of two or more
>> days where x exceeds 70). I am calculating it in a similar way:
>>
>> new = apply(x,c(1,2,4),FUN=function(y) {fun.spell.deb(y, 70)})
>>
>> where fun.spell.deb.R:
>>
>> ## Calculate spell duration
>> fun.spell.deb <- function(data, threshold = 1,
>>                           direction = c("above", "below")){
>>   #coln <- grep(weather, names(data))
>>   # var <- data[,8]
>>   if(missing(direction)) {direction <- "above"}
>>   if(direction=="below") {b <- (data <= threshold)} else {b <- (data >= threshold)}
>>   b[b==TRUE] = 1
>>   y <-rle(b)
>>   ans <-length(subset((y$lengths[y$values==1]), (y$lengths[y$values==1])>=2))
>>   return(ans)
>> }
>>
>> Do you have any idea how to make the "apply" faster here?
>>
>> -Deb
>>
>>
>> On Sat, Jul 9, 2016 at 3:46 PM, Charles C. Berry <ccberry at ucsd.edu>
>> wrote:
>>
>> > On Sat, 9 Jul 2016, Debasish Pai Mazumder wrote:
>> >
>> > I have a 4-dimensional array x(lat,lon,time,var)
>> >>
>> >> I am using "apply" to calculate over time
>> >> new = apply(x,c(1,2,4),FUN=function(y) {length(which(y>=70))})
>> >>
>> >> This is very slow. Is there any way to make it faster?
>> >>
>> >
>> > If dim(x)[3] << prod(dim(x)[-3]),
>> >
>> > new <- Reduce("+",lapply(1:dim(x)[3],function(z) x[,,z,]>=70))
>> >
>> > will be faster.
>> >
>> > However, if you can follow Peter Langfelder's suggestion to use rowSums,
>> > that would be best. Even using rowSums(aperm(x,c(1,2,4,3))>=70,dims=3) and
>> > paying the price of aperm() might be better.
>> >
>> > Chuck
>> >
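
The two alternatives above can be checked against the original apply() call on a small array. This is only a sketch: the array dimensions and random data are made up for illustration, and the rowSums() call applies the >= 70 comparison to the permuted array so that time ends up as the last dimension and is summed over.

```r
# Small made-up array with dimensions (lat, lon, time, var)
set.seed(1)
x <- array(runif(3 * 4 * 10 * 2, min = 0, max = 100), dim = c(3, 4, 10, 2))

# Original approach: count values >= 70 over the time dimension
by_apply <- apply(x, c(1, 2, 4), function(y) length(which(y >= 70)))

# Alternative 1: add up one logical slice per time step
by_reduce <- Reduce("+", lapply(seq_len(dim(x)[3]), function(z) x[, , z, ] >= 70))

# Alternative 2: move time to the last dimension, then sum over it
by_rowsums <- rowSums(aperm(x, c(1, 2, 4, 3)) >= 70, dims = 3)

agree <- all(by_apply == by_reduce) && all(by_apply == by_rowsums)
```

All three results have dimensions (lat, lon, var); which one is fastest on the real data is best settled with system.time().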
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>