[Rd] scan(..., skip=1e11): infinite loop; cannot interrupt

Spencer Graves
Sat Feb 11 06:38:55 CET 2023


Hello, All:


	  I have a 4.54 GB file that I'm trying to read in chunks using 
"scan(..., skip=__)".  It works as expected for small values of "skip" 
but goes into an infinite loop for "skip=1e11" and similar large values 
of skip:  I cannot even interrupt it;  I must kill R.  Below please find 
sessionInfo() with a toy example.


	  My real problem is a large, corrupted Thunderbird email file.  Its 
file type is "Mork", which is mostly standard characters with "\n" 
between records of varying length.


	  Is there some other function in R that allows me to read chunks of a 
large file like this?


	  Thanks,
	  Spencer Graves


writeLines(as.character(1:11), 'tstNums.txt')
(Tst2 <- scan('tstNums.txt', n=12, skip=5))
# works: 6 7 8 9 10 11
(Tst13 <- scan('tstNums.txt', n=12, skip=13))
# works: numeric(0)
(tst1e11 <- scan('tstNums.txt', n=12, skip=1e11))
# Goes into an infinite loop that I cannot even interrupt.
# I must kill R and start over.
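
One possible way to read such a file in chunks without "skip" is to open 
a connection once and call readLines() on it repeatedly:  each call on 
an open connection continues where the previous one stopped, so nothing 
has to be skipped.  The following is only a minimal sketch, tried on the 
toy data and not on the 4.54 GB file;  the helper name "read_in_chunks" 
and its arguments "chunk_size" and "handle" are hypothetical and not 
part of base R.

```r
# Hypothetical helper (an assumption, not from the original post):
# read a text file chunk_size lines at a time through one open connection.
read_in_chunks <- function(path, chunk_size = 4L, handle = print) {
  con <- file(path, open = "r")
  on.exit(close(con))              # always release the connection
  n_read <- 0                      # double, so very long files do not overflow
  repeat {
    lines <- readLines(con, n = chunk_size, warn = FALSE)
    if (length(lines) == 0L) break # nothing left: end of file
    handle(lines)                  # process this chunk
    n_read <- n_read + length(lines)
  }
  invisible(n_read)
}

# Usage on the toy data from above, written to a temporary file
tmp <- tempfile(fileext = ".txt")
writeLines(as.character(1:11), tmp)
chunks <- list()
total <- read_in_chunks(
  tmp, chunk_size = 4L,
  handle = function(x) chunks[[length(chunks) + 1L]] <<- x
)
```

Here the 11 lines arrive as three chunks of 4, 4 and 3 lines.  scan() 
also accepts an open connection in place of a file name, so the same 
loop could call scan(con, n = ...) when numeric parsing is wanted.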


sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.7.3

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
  [1] compiler_4.2.2  fastmap_1.1.0   cli_3.6.0       htmltools_0.5.4
  [5] tools_4.2.2     rstudioapi_0.14 yaml_2.3.6      rmarkdown_2.20
  [9] knitr_1.41      xfun_0.36       digest_0.6.31   rlang_1.0.6
[13] evaluate_0.20


