[R] NAs produced by integer overflow, but only some time ...
Stefan Th. Gries
@tgrie@ @ending from gm@il@com
Wed May 9 04:54:26 CEST 2018
I have a problem with an integer overflow that I cannot understand.
I have a character vector curr.lemmas with the following properties:
length(curr.lemmas) # 61224
length(unique(curr.lemmas)) # 2652
That vector is the input to the following function:
yules.k1 <- function(input) {
   m1 <- length(input); temp <- table(table(input))
   m2 <- sum("*"(temp, as.numeric(names(temp))^2))
   return(10000*(m2-m1) / (m1*m1))   # m1*m1 is where the warning arises
}
When I run this, I get the following output:
[1] NA
Warning message:
In m1 * m1 : NAs produced by integer overflow
But when I change the function by replacing only m1*m1 with m1^2, it works:
yules.k2 <- function(input) {
   m1 <- length(input); temp <- table(table(input))
   m2 <- sum("*"(temp, as.numeric(names(temp))^2))
   return(10000*(m2-m1) / (m1^2))   # only change: m1^2 instead of m1*m1
}
yules.k2(curr.lemmas) # -> 157.261
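One check that may help narrow this down is to look at the storage type of each operand. The sketch below is only an illustration: it uses a stand-in vector of the same length (the name stand.in is made up here and is not my data) and relies on nothing beyond base R.

```r
# Stand-in for curr.lemmas: any vector of length 61224 will do for this check
stand.in <- character(61224)
m1 <- length(stand.in)

typeof(m1)                    # "integer": length() returns an integer here
typeof(m1^2)                  # "double": ^ always returns a double
.Machine$integer.max          # 2147483647, the largest value an R integer can hold
m1^2 > .Machine$integer.max   # TRUE: the square does not fit into an integer
```

If the types come out like this, the two functions differ in whether the squared length is computed as an integer or as a double.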
I am using RStudio 1.1.447, and here is my sessionInfo():
######################
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 18.3
Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.4.4 backports_1.1.2 magrittr_1.5 rprojroot_1.3-2 htmltools_0.3.6 tools_3.4.4 yaml_2.1.19 Rcpp_0.12.16 stringi_1.2.2
[10] rmarkdown_1.9 knitr_1.20 stringr_1.3.0 digest_0.6.15 evaluate_0.10.1
######################
What is even more puzzling is that one time I ran R in the console of
Geany and this happened:
> m1
[1] 61224
> 61224*61224
[1] 3748378176
> 61224^2
[1] 3748378176
> m1*m1
[1] NA
Warning message:
In m1 * m1 : NAs produced by integer overflow
> m1^2
[1] 3748378176
That is, the multiplication worked with the numbers but not the
numeric vectors; the above is literally copied from the console. Why
is that happening?
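The difference between the typed-in numbers and m1 can be reproduced without my data. This is a minimal sketch, assuming that m1 is stored as an integer (as length() returns it) while a typed number such as 61224 is a double. The L suffix marks an integer literal, the variable names are made up for illustration, and the warning is suppressed only to keep the output short.

```r
m1 <- 61224L                                     # integer, like the result of length()

# integer * integer stays integer; the result exceeds .Machine$integer.max, so R returns NA
int.product <- suppressWarnings(m1 * m1)

# converting to double first avoids the integer overflow
dbl.product <- as.numeric(m1) * as.numeric(m1)

# plain numeric literals are already doubles, so this never overflows as an integer
lit.product <- 61224 * 61224

is.na(int.product)   # TRUE
dbl.product          # 3748378176
lit.product          # 3748378176
```

Under that assumption, wrapping the length in as.numeric() inside the function would be one way to keep m1*m1 from overflowing.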
Any help would be much appreciated!
STG
--
Stefan Th. Gries
----------------------------------
Univ. of California, Santa Barbara
http://tinyurl.com/stgries