[R] [External] Somewhat disconcerting behavior of seq.int()

luke-tierney using uiowa.edu
Tue May 3 05:52:56 CEST 2022


Something is very different about your system. On my Linux system I get

> microbenchmark(l1 <- sieve1(1e5), times =50)
Unit: milliseconds
                 expr     min       lq     mean   median       uq     max neval
  l1 <- sieve1(1e+05) 5.04615 5.350576 6.967507 5.787626 7.323502 28.3085    50
> microbenchmark(l2 <- sieve2(1e5), times =50)
Unit: milliseconds
                 expr      min       lq     mean   median      uq      max neval
  l2 <- sieve2(1e+05) 14.58763 15.79368 17.00738 16.29299 17.0723 30.57338    50

Similar on an Intel Mac.
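
For what it's worth, the type difference described in the quoted message can be checked directly, without running the sieves. The lines below are only a minimal sketch: the names m, s1 and s2 are made up for illustration, and the check says nothing about where the timing difference comes from.

```r
# Compare the storage type of the two seq.int() calls used in the
# quoted sieve1() and sieve2(). m, s1 and s2 are illustrative names.
m <- 1e5
s1 <- seq.int(2, to = m)          # call used in sieve1
s2 <- seq.int(2, to = m, by = 1)  # call used in sieve2

typeof(s1)  # "integer", consistent with str(l1) in the quoted output
typeof(s2)  # "double", consistent with str(l2) in the quoted output

# identical() compares storage type as well as values, so it is FALSE
# for these two vectors even though the numbers agree element by element.
identical(s1, s2)
all(s1 == s2)
```

If the values rather than the storage type are what matter, all(l1 == l2) or all.equal(l1, l2) is the more forgiving comparison.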

Best,

luke

On Tue, 3 May 2022, Bert Gunter wrote:

> ** Disconcerting to me, anyway; perhaps not to others**
> (Apologies if this has been discussed before. I was a bit nonplussed by
> it, but maybe I'm just clueless.) Anyway:
>
> Here are two almost identical versions of the Sieve of Eratosthenes.
> The difference between them is only in the call to seq.int() that is
> highlighted
>
> sieve1 <- function(m){
>   if(m < 2) return(NULL)
>   a <- floor(sqrt(m))
>   pr <- Recall(a)
> ####################
>   s <- seq.int(2, to = m) ## Only difference here
> ######################
>   for( i in pr) s <- s[as.logical(s %% i)]
>   c(pr,s)
> }
>
> sieve2 <- function(m){
>   if(m < 2) return(NULL)
>   a <- floor(sqrt(m))
>   pr <- Recall(a)
> ####################
>   s <- seq.int(2, to = m, by =1) ## Only difference here
> #######################
>   for( i in pr) s <- s[as.logical(s %% i)]
>   c(pr,s)
> }
>
> However, execution time is *quite* different.
>
> library(microbenchmark)
>
>> microbenchmark(l1 <- sieve1(1e5), times =50)
> Unit: milliseconds
>                expr      min       lq     mean  median       uq      max neval
> l1 <- sieve1(1e+05) 3.957084 3.997959 4.732045 4.01698 4.184918 7.627751    50
>
>> microbenchmark(l2 <- sieve2(1e5), times =50)
> Unit: milliseconds
>                expr      min      lq     mean   median       uq      max neval
> l2 <- sieve2(1e+05) 681.6209 682.555 683.8279 682.9368 685.2253 687.9464    50
>
> Now note that:
>> identical(l1, l2)
> [1] FALSE
>
> ## Because:
>> str(l1)
> int [1:9592] 2 3 5 7 11 13 17 19 23 29 ...
>
>> str(l2)
> num [1:9592] 2 3 5 7 11 13 17 19 23 29 ...
>
> I therefore assume that seq.int(), an internal generic, is dispatching
> to a method that uses integer arithmetic for sieve1 and floating point
> for sieve2. Is this correct? If not, what do I fail to understand? And
> is this indeed the source of the large difference in execution time?
>
> Further, ?seq.int says:
> "The interpretation of the unnamed arguments of seq and seq.int is not
> standard, and it is recommended always to name the arguments when
> programming."
>
> The above suggests that maybe this advice should be qualified, and/or
> adding some comments to the Help file regarding this behavior might be
> useful to naïfs like me.
>
> In case it makes a difference (and it might!):
>
>> sessionInfo()
> R version 4.2.0 (2022-04-22)
> Platform: x86_64-apple-darwin17.0 (64-bit)
> Running under: macOS Monterey 12.3.1
>
> Matrix products: default
> LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] microbenchmark_1.4.9
>
> loaded via a namespace (and not attached):
> [1] compiler_4.2.0 tools_4.2.0
>
>
> Thanks for any enlightenment and again apologies if I am plowing old ground.
>
> Best to all,
>
> Bert Gunter
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney using uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu

