[R] [External] Somewhat disconcerting behavior of seq.int()

Bert Gunter
Tue May 3 17:23:42 CEST 2022


I resolved the problem by reinstalling R; see the timings below. I have
no clue what the cause may have been (only an ignorant wild guess that
is not worth sharing).

Thanks again to all for your help.
Bert

> s1 <- seq.int(2, 1e5, by =1)
> s2 = as.integer(s1)
>
> microbenchmark( v1 <- s1 %% 2, times = 50)
Unit: microseconds
        expr     min      lq     mean   median      uq     max neval
 v1 <- s1%%2 396.839 410.943 433.6234 432.3245 457.068 491.057    50
> microbenchmark( v2 <- s2 %% 2L, times = 50)
Unit: microseconds
         expr     min      lq     mean  median      uq    max neval
 v2 <- s2%%2L 145.837 150.019 159.5441 162.032 164.943 177.12    50
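
The two timings above compare %% on the vector returned by seq.int()
with %% on its integer coercion. A minimal base-R sketch for checking
which storage type each object has is given here. The names s_dbl,
s_int, v_dbl and v_int are illustrative and were not part of the
original session. As far as I know, ?seq advises against relying on
whether seq.int() returns "integer" or "double", so the sketch prints
the types rather than assuming them.

```r
s_dbl <- seq.int(2, 1e5, by = 1)   # same call as s1 above
s_int <- as.integer(s_dbl)         # same coercion as s2 above

# Report the storage types instead of assuming them.
cat("s_dbl:", typeof(s_dbl), " s_int:", typeof(s_int), "\n")

v_dbl <- s_dbl %% 2    # divisor is a double, so the result is double
v_int <- s_int %% 2L   # integer vector and integer divisor, result is integer

cat("v_dbl:", typeof(v_dbl), " v_int:", typeof(v_int), "\n")

# The remainders agree in value even when the storage types differ.
stopifnot(all(v_dbl == v_int))
```

Only the storage type differs between the two results, which is
consistent with the timing gap coming from double versus integer
arithmetic rather than from different values.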


sieve1 <- function(m){
   if(m < 2) return(NULL)
   a <- floor(sqrt(m))
   pr <- Recall(a)
   s <- seq.int(2, to = m) ## Only difference here
   for( i in pr) s <- s[as.logical(s %% i)]
   c(pr,s)
}

sieve2 <- function(m){
   if(m < 2) return(NULL)
   a <- floor(sqrt(m))
   pr <- Recall(a)
   s <-seq.int(2L, to = m, by =1) ## Only difference here
   for( i in pr) s <- s[as.logical(s %% i)]
   c(pr,s)
}

> microbenchmark(l1 <- sieve1(1e5), times =50)
Unit: milliseconds
                expr     min       lq     mean  median       uq      max
 l1 <- sieve1(1e+05) 3.69533 4.068307 5.679122 4.28327 7.561425 10.07493
 neval
    50
> microbenchmark(l2 <- sieve2(1e5), times =50)
Unit: milliseconds
                expr      min       lq     mean   median       uq
 l2 <- sieve2(1e+05) 5.367679 6.128229 8.013111 8.940788 9.430246
      max neval
 11.52822    50
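
In the original post quoted below, str() shows an int result for sieve1
and a num result for sieve2. One way to make the storage type explicit,
whichever form of the seq.int() call is used, is to coerce once before
the loop. The following is only a sketch under that assumption: the name
sieve_int is hypothetical, it calls itself directly instead of using
Recall(), and it is not a claim about what caused the slowdown that
disappeared after reinstalling R.

```r
sieve_int <- function(m) {
  # No primes below 2; return an empty integer vector so the type is stable.
  if (m < 2) return(integer(0))
  # Primes up to floor(sqrt(m)), found recursively.
  pr <- sieve_int(floor(sqrt(m)))
  # Force integer storage regardless of what seq.int() returns.
  s <- as.integer(seq.int(2, to = m))
  # Drop every multiple of each small prime (including the prime itself,
  # which is added back through pr below).
  for (i in pr) s <- s[s %% i != 0L]
  c(pr, s)
}

p <- sieve_int(1e5)
str(p)
```

With this version the result type no longer depends on how seq.int()
was called, so it can be timed with microbenchmark() in the same way as
sieve1 and sieve2.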

On Mon, May 2, 2022 at 8:53 PM <luke-tierney using uiowa.edu> wrote:
>
> Something is very different about your system. On my Linux system I get
>
> > microbenchmark(l1 <- sieve1(1e5), times =50)
> Unit: milliseconds
>                  expr     min       lq     mean   median       uq     max neval
>   l1 <- sieve1(1e+05) 5.04615 5.350576 6.967507 5.787626 7.323502 28.3085    50
> > microbenchmark(l2 <- sieve2(1e5), times =50)
> Unit: milliseconds
>                  expr      min       lq     mean   median      uq      max neval
>   l2 <- sieve2(1e+05) 14.58763 15.79368 17.00738 16.29299 17.0723 30.57338    50
>
> Similar on an Intel Mac.
>
> Best,
>
> luke
>
> On Tue, 3 May 2022, Bert Gunter wrote:
>
> > ** Disconcerting to me, anyway; perhaps not to others**
> > (Apologies if this has been discussed before. I was a bit nonplussed by
> > it, but maybe I'm just clueless.) Anyway:
> >
> > Here are two almost identical versions of the Sieve of Eratosthenes.
> > The difference between them is only in the call to seq.int() that is
> > highlighted:
> >
> > sieve1 <- function(m){
> >   if(m < 2) return(NULL)
> >   a <- floor(sqrt(m))
> >   pr <- Recall(a)
> > ####################
> >   s <- seq.int(2, to = m) ## Only difference here
> > ######################
> >   for( i in pr) s <- s[as.logical(s %% i)]
> >   c(pr,s)
> > }
> >
> > sieve2 <- function(m){
> >   if(m < 2) return(NULL)
> >   a <- floor(sqrt(m))
> >   pr <- Recall(a)
> > ####################
> >   s <- seq.int(2, to = m, by =1) ## Only difference here
> > #######################
> >   for( i in pr) s <- s[as.logical(s %% i)]
> >   c(pr,s)
> > }
> >
> > However, execution time is *quite* different.
> >
> > library(microbenchmark)
> >
> >> microbenchmark(l1 <- sieve1(1e5), times =50)
> > Unit: milliseconds
> >                expr      min       lq     mean  median       uq      max
> > l1 <- sieve1(1e+05) 3.957084 3.997959 4.732045 4.01698 4.184918 7.627751
> > neval
> >    50
> >
> >> microbenchmark(l2 <- sieve2(1e5), times =50)
> > Unit: milliseconds
> >                expr      min      lq     mean   median       uq      max
> > l2 <- sieve2(1e+05) 681.6209 682.555 683.8279 682.9368 685.2253 687.9464
> > neval
> >    50
> >
> > Now note that:
> >> identical(l1, l2)
> > [1] FALSE
> >
> > ## Because:
> >> str(l1)
> > int [1:9592] 2 3 5 7 11 13 17 19 23 29 ...
> >
> >> str(l2)
> > num [1:9592] 2 3 5 7 11 13 17 19 23 29 ...
> >
> > I therefore assume that seq.int(), an internal generic, is dispatching
> > to a method that uses integer arithmetic for sieve1 and floating point
> > for sieve2. Is this correct? If not, what do I fail to understand? And
> > is this indeed the source of the large difference in execution time?
> >
> > Further, ?seq.int says:
> > "The interpretation of the unnamed arguments of seq and seq.int is not
> > standard, and it is recommended always to name the arguments when
> > programming."
> >
> > The above suggests that this advice should perhaps be qualified,
> > and/or that adding some comments to the Help file regarding this
> > behavior might be useful to naïfs like me.
> >
> > In case it makes a difference (and it might!):
> >
> >> sessionInfo()
> > R version 4.2.0 (2022-04-22)
> > Platform: x86_64-apple-darwin17.0 (64-bit)
> > Running under: macOS Monterey 12.3.1
> >
> > Matrix products: default
> > LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
> >
> > locale:
> > [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> >
> > attached base packages:
> > [1] stats     graphics  grDevices utils     datasets  methods   base
> >
> > other attached packages:
> > [1] microbenchmark_1.4.9
> >
> > loaded via a namespace (and not attached):
> > [1] compiler_4.2.0 tools_4.2.0
> >
> >
> > Thanks for any enlightenment and again apologies if I am plowing old ground.
> >
> > Best to all,
> >
> > Bert Gunter
> >
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa                  Phone:             319-335-3386
> Department of Statistics and        Fax:               319-335-3017
>     Actuarial Science
> 241 Schaeffer Hall                  email:   luke-tierney using uiowa.edu
> Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu


