[R] Somewhat disconcerting behavior of seq.int()
Bert Gunter
bgunter@4567 @end|ng |rom gm@||@com
Tue May 3 03:45:40 CEST 2022
**Disconcerting to me, anyway; perhaps not to others**
(Apologies if this has been discussed before. I was a bit nonplussed by
it, but maybe I'm just clueless.) Anyway:
Here are two almost identical versions of the Sieve of Eratosthenes.
The only difference between them is the highlighted call to seq.int():
sieve1 <- function(m){
   if(m < 2) return(NULL)
   a <- floor(sqrt(m))
   pr <- Recall(a)
   ####################
   s <- seq.int(2, to = m)  ## Only difference here
   ######################
   for( i in pr) s <- s[as.logical(s %% i)]
   c(pr,s)
}
sieve2 <- function(m){
   if(m < 2) return(NULL)
   a <- floor(sqrt(m))
   pr <- Recall(a)
   ####################
   s <- seq.int(2, to = m, by = 1)  ## Only difference here
   #######################
   for( i in pr) s <- s[as.logical(s %% i)]
   c(pr,s)
}
However, execution time is *quite* different.
> library(microbenchmark)
> microbenchmark(l1 <- sieve1(1e5), times = 50)
Unit: milliseconds
                expr      min       lq     mean  median       uq      max neval
 l1 <- sieve1(1e+05) 3.957084 3.997959 4.732045 4.01698 4.184918 7.627751    50
> microbenchmark(l2 <- sieve2(1e5), times = 50)
Unit: milliseconds
                expr      min      lq     mean   median       uq      max neval
 l2 <- sieve2(1e+05) 681.6209 682.555 683.8279 682.9368 685.2253 687.9464    50
Now note that:
> identical(l1, l2)
[1] FALSE
## Because:
> str(l1)
int [1:9592] 2 3 5 7 11 13 17 19 23 29 ...
> str(l2)
num [1:9592] 2 3 5 7 11 13 17 19 23 29 ...
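The type difference can be reproduced without the sieve at all. The following is only a small sketch; the names m, s1, s2 and s3 are mine, and the as.integer() coercion at the end is one possible workaround, not something the help page prescribes:

```r
m <- 10

s1 <- seq.int(2, to = m)          # call used in sieve1
s2 <- seq.int(2, to = m, by = 1)  # call used in sieve2

typeof(s1)   # storage type of the sieve1 sequence
typeof(s2)   # storage type of the sieve2 sequence

# The values agree even though the storage types differ
all(s1 == s2)

# Possible workaround (an assumption on my part): coerce explicitly
s3 <- as.integer(seq.int(2, to = m, by = 1))
identical(s1, s3)
```

If the types come out as in the str() output above, coercing the result in sieve2 should make l1 and l2 identical.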
I therefore assume that seq.int(), an internal generic, is dispatching
to a method that uses integer arithmetic for sieve1 and floating point
for sieve2. Is this correct? If not, what do I fail to understand? And
is this indeed the source of the large difference in execution time?
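One way to check the timing question in isolation would be to time %% alone on an integer vector and on a double vector holding the same values. This is a rough sketch, not a careful benchmark; n, the divisor 7 and the repetition count of 20 are arbitrary choices of mine, and the timings will vary by machine and R version:

```r
n <- 1e6
s_int <- seq.int(2, to = n)   # same form of call as in sieve1
s_dbl <- as.double(s_int)     # same values, stored as double

# Repeat each operation a few times so the elapsed time is measurable
t_int <- system.time(for (k in 1:20) r_int <- s_int %% 7L)
t_dbl <- system.time(for (k in 1:20) r_dbl <- s_dbl %% 7)

# Elapsed seconds for the integer and double versions
c(integer = t_int[["elapsed"]], double = t_dbl[["elapsed"]])
```

If the double version is much slower here too, that would point to %% on doubles, rather than seq.int() itself, as the source of the difference.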
Further, ?seq.int says:
"The interpretation of the unnamed arguments of seq and seq.int is not
standard, and it is recommended always to name the arguments when
programming."
The above suggests that this advice should perhaps be qualified, or
that some comments in the Help file about this behavior might be
useful to naïfs like me.
In case it makes a difference (and it might!):
> sessionInfo()
R version 4.2.0 (2022-04-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.3.1
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] microbenchmark_1.4.9
loaded via a namespace (and not attached):
[1] compiler_4.2.0 tools_4.2.0
Thanks for any enlightenment and again apologies if I am plowing old ground.
Best to all,
Bert Gunter