[R] [External] Somewhat disconcerting behavior of seq.int()
Bert Gunter
bgunter@4567 @end|ng |rom gm@||@com
Tue May 3 17:23:42 CEST 2022
I resolved the problem by reinstalling R. See below. No clue as to
what may have been the cause (just an ignorant wild guess that is not
worth sharing).
Thanks again to all for your help.
Bert
> s1 <- seq.int(2, 1e5, by =1)
> s2 = as.integer(s1)
>
> microbenchmark( v1 <- s1 %% 2, times = 50)
Unit: microseconds
expr min lq mean median uq max neval
v1 <- s1%%2 396.839 410.943 433.6234 432.3245 457.068 491.057 50
> microbenchmark( v2 <- s2 %% 2L, times = 50)
Unit: microseconds
expr min lq mean median uq max neval
v2 <- s2%%2L 145.837 150.019 159.5441 162.032 164.943 177.12 50
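As a minimal sketch (base R only, nothing beyond the objects defined above), the storage types behind these two timings can be checked directly. The variable names r1 and r2 are introduced here purely for illustration.

```r
# Same objects as in the transcript above.
s1 <- seq.int(2, 1e5, by = 1)   # 'by' is a double, so the result is double
s2 <- as.integer(s1)            # explicit conversion to integer storage

typeof(s1)
typeof(s2)

# %% keeps the storage type of its operands:
r1 <- s1 %% 2     # double %% double   -> double
r2 <- s2 %% 2L    # integer %% integer -> integer
typeof(r1)
typeof(r2)

# The values agree; only the storage type differs.
all(r1 == r2)
```

So the two benchmarks compare double arithmetic against integer arithmetic on otherwise equal values.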
sieve1 <- function(m){
  if(m < 2) return(NULL)
  a <- floor(sqrt(m))
  pr <- Recall(a)
  s <- seq.int(2, to = m) ## Only difference here
  for( i in pr) s <- s[as.logical(s %% i)]
  c(pr,s)
}
sieve2 <- function(m){
  if(m < 2) return(NULL)
  a <- floor(sqrt(m))
  pr <- Recall(a)
  s <- seq.int(2L, to = m, by =1) ## Only difference here
  for( i in pr) s <- s[as.logical(s %% i)]
  c(pr,s)
}
> microbenchmark(l1 <- sieve1(1e5), times =50)
Unit: milliseconds
                expr     min       lq     mean  median       uq      max neval
 l1 <- sieve1(1e+05) 3.69533 4.068307 5.679122 4.28327 7.561425 10.07493    50
> microbenchmark(l2 <- sieve2(1e5), times =50)
Unit: milliseconds
                expr      min       lq     mean   median       uq      max neval
 l2 <- sieve2(1e+05) 5.367679 6.128229 8.013111 8.940788 9.430246 11.52822    50
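For anyone who wants the result type to be the same regardless of how seq.int() is called, here is a hedged sketch of a variant that converts to integer explicitly. The name sieve3 is hypothetical (it is not from the thread), and this is not offered as an explanation of the original timing gap, which went away after reinstalling R.

```r
# Hypothetical variant: same algorithm as sieve1/sieve2 above, but the
# candidate vector is forced to integer storage explicitly.
sieve3 <- function(m) {
  if (m < 2) return(NULL)
  a <- floor(sqrt(m))
  pr <- Recall(a)                               # primes up to sqrt(m)
  s <- as.integer(seq.int(2, to = m, by = 1))   # explicit integer storage
  for (i in pr) s <- s[as.logical(s %% i)]      # drop multiples of each prime
  c(pr, s)
}

p <- sieve3(30)
p
typeof(p)
```

Since ?seq.int says programmers should not rely on whether the result is integer or double, the explicit as.integer() makes the intent visible in the code.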
On Mon, May 2, 2022 at 8:53 PM <luke-tierney using uiowa.edu> wrote:
>
> Something is very different about your system. On my Linux system I get
>
> > microbenchmark(l1 <- sieve1(1e5), times =50)
> Unit: milliseconds
> expr min lq mean median uq max neval
> l1 <- sieve1(1e+05) 5.04615 5.350576 6.967507 5.787626 7.323502 28.3085 50
> > microbenchmark(l2 <- sieve2(1e5), times =50)
> Unit: milliseconds
> expr min lq mean median uq max neval
> l2 <- sieve2(1e+05) 14.58763 15.79368 17.00738 16.29299 17.0723 30.57338 50
>
> Similar on an Intel Mac.
>
> Best,
>
> luke
>
> On Tue, 3 May 2022, Bert Gunter wrote:
>
> > ** Disconcerting to me, anyway; perhaps not to others**
> > (Apologies if this has been discussed before. I was a bit nonplussed by
> > it, but maybe I'm just clueless.) Anyway:
> >
> > Here are two almost identical versions of the Sieve of Eratosthenes.
> > The difference between them is only in the call to seq.int() that is
> > highlighted
> >
> > sieve1 <- function(m){
> >   if(m < 2) return(NULL)
> >   a <- floor(sqrt(m))
> >   pr <- Recall(a)
> >   ####################
> >   s <- seq.int(2, to = m) ## Only difference here
> >   ######################
> >   for( i in pr) s <- s[as.logical(s %% i)]
> >   c(pr,s)
> > }
> >
> > sieve2 <- function(m){
> >   if(m < 2) return(NULL)
> >   a <- floor(sqrt(m))
> >   pr <- Recall(a)
> >   ####################
> >   s <- seq.int(2, to = m, by =1) ## Only difference here
> >   #######################
> >   for( i in pr) s <- s[as.logical(s %% i)]
> >   c(pr,s)
> > }
> >
> > However, execution time is *quite* different.
> >
> > library(microbenchmark)
> >
> >> microbenchmark(l1 <- sieve1(1e5), times =50)
> > Unit: milliseconds
> >                 expr      min       lq     mean  median       uq      max neval
> >  l1 <- sieve1(1e+05) 3.957084 3.997959 4.732045 4.01698 4.184918 7.627751    50
> >
> >> microbenchmark(l2 <- sieve2(1e5), times =50)
> > Unit: milliseconds
> >                 expr      min      lq     mean   median       uq      max neval
> >  l2 <- sieve2(1e+05) 681.6209 682.555 683.8279 682.9368 685.2253 687.9464    50
> >
> > Now note that:
> >> identical(l1, l2)
> > [1] FALSE
> >
> > ## Because:
> >> str(l1)
> > int [1:9592] 2 3 5 7 11 13 17 19 23 29 ...
> >
> >> str(l2)
> > num [1:9592] 2 3 5 7 11 13 17 19 23 29 ...
> >
> > I therefore assume that seq.int(), an internal generic, is dispatching
> > to a method that uses integer arithmetic for sieve1 and floating point
> > for sieve2. Is this correct? If not, what do I fail to understand? And
> > is this indeed the source of the large difference in execution time?
> >
> > Further, ?seq.int says:
> > "The interpretation of the unnamed arguments of seq and seq.int is not
> > standard, and it is recommended always to name the arguments when
> > programming."
> >
> > The above suggests that maybe this advice should be qualified, and/or
> > adding some comments to the Help file regarding this behavior might be
> > useful to naïfs like me.
> >
> > In case it makes a difference (and it might!):
> >
> >> sessionInfo()
> > R version 4.2.0 (2022-04-22)
> > Platform: x86_64-apple-darwin17.0 (64-bit)
> > Running under: macOS Monterey 12.3.1
> >
> > Matrix products: default
> > LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
> >
> > locale:
> > [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> >
> > attached base packages:
> > [1] stats graphics grDevices utils datasets methods base
> >
> > other attached packages:
> > [1] microbenchmark_1.4.9
> >
> > loaded via a namespace (and not attached):
> > [1] compiler_4.2.0 tools_4.2.0
> >
> >
> > Thanks for any enlightenment and again apologies if I am plowing old ground.
> >
> > Best to all,
> >
> > Bert Gunter
> >
> >
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa Phone: 319-335-3386
> Department of Statistics and Fax: 319-335-3017
> Actuarial Science
> 241 Schaeffer Hall email: luke-tierney using uiowa.edu
> Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu