I have a problem with integer overflow that I cannot understand. I have a character vector curr.lemmas with the following properties:

length(curr.lemmas)         # 61224
length(unique(curr.lemmas)) # 2652

That vector is the input to the following function:

yules.k1 <- function(input) {
   m1 <- length(input); temp <- table(table(input))
   m2 <- sum("*"(temp, as.numeric(names(temp))^2))
   return(10000*(m2-m1) / (m1*m1))
}

When I run this, I get the following output:

[1] NA
Warning message:
In m1 * m1 : NAs produced by integer overflow

But when I change the function to this one by just replacing m1*m1 by m1^2 ...

yules.k2 <- function(input) {
   m1 <- length(input); temp <- table(table(input))
   m2 <- sum("*"(temp, as.numeric(names(temp))^2))
   return(10000*(m2-m1) / (m1^2))
}
yules.k2(curr.lemmas) # -> 157.261

I am using RStudio 1.1.447 and here's my sessionInfo

######################
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 18.3

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
 [1] compiler_3.4.4  backports_1.1.2 magrittr_1.5    rprojroot_1.3-2 htmltools_0.3.6 tools_3.4.4     yaml_2.1.19     Rcpp_0.12.16    stringi_1.2.2
[10] rmarkdown_1.9   knitr_1.20      stringr_1.3.0   digest_0.6.15   evaluate_0.10.1
######################

What is even more puzzling is that one time I ran R in the console of Geany and this happened:

> m1
[1] 61224
> 61224*61224
[1] 3748378176
> 61224^2
[1] 3748378176
> m1*m1
[1] NA
Warning message:
In m1 * m1 : NAs produced by integer overflow
> m1^2
[1] 3748378176

That is, the multiplication worked with the numbers but not the numeric vectors; the above is literally copied from the console. Why is that happening?

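
For anyone who wants to reproduce this without my data, here is a minimal sketch. It rests on an assumption I have not confirmed as the cause: that the difference lies in the storage type, since length() returns an integer whereas a number typed at the prompt is a double. The names prod_int, prod_dbl and pow_int are only illustrative and do not appear in my functions.

```r
# Stand-in for length(curr.lemmas); the L suffix makes the value an integer,
# which is the type length() returns for a vector of this size.
m1 <- 61224L

typeof(m1)             # "integer"
typeof(61224)          # "double": a literal typed without the L suffix
.Machine$integer.max   # 2147483647, the largest value an integer can hold

# integer * integer stays integer, and 61224^2 exceeds integer.max,
# so the result is NA (the overflow warning is suppressed here)
prod_int <- suppressWarnings(m1 * m1)

# converting one operand to double first keeps the product in double arithmetic
prod_dbl <- as.numeric(m1) * m1

# the ^ operator returns a double even when the base is an integer
pow_int <- m1^2
```

If that assumption holds, using as.numeric(length(input)) inside yules.k1 should make it behave like yules.k2.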
Any help would be much appreciated!
STG
--
Stefan Th. Gries
----------------------------------
Univ. of California, Santa Barbara
http://tinyurl.com/stgries

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.