[R] readLines without skipNul=TRUE causes crash

Duncan Murdoch murdoch.duncan at gmail.com
Sat Jul 15 22:14:58 CEST 2017


On 15/07/2017 11:33 AM, Anthony Damico wrote:
> hi, i realized that the segfault happens on the text file in a new R
> session.  so, creating the segfault-generating text file requires a
> contributed package, but prompting the actual segfault does not --
> pretty sure that means this is a base R bug?  submitted here:
> https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=17311  hopefully i
> am not doing something remarkably stupid.  the text file itself is 4GB
> so cannot upload it to bugzilla, and from the R_AllocStringBuffer error
> in the previous message, i think most or all of it needs to be there to
> trigger the segfault.  thanks!

I don't want to download the big file or install the archive package. 
Could you run the code below on the bad file?  If you're right and it's 
only nulls that matter, this might allow me to create a file that 
triggers the bug.

f <- "FILENAME"  # placeholder: put the filename of the bad file here

con <- file(f, open="rb")
zeros <- numeric()   # positions (1-based) of the null bytes found so far
count <- 0           # number of bytes read so far
repeat {
   bytes <- readBin(con, "int", 1000000, size=1)
   zeros <- c(zeros, count + which(bytes == 0))
   count <- count + length(bytes)
   if (length(bytes) < 1000000) break
}
close(con)
cat("File length=", count, "\n")
cat("Nulls:\n")
zeros
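
To try the scan without the 4GB file, here is a minimal sketch of the same
loop wrapped in a function and run on a tiny temporary file.  The name
find_nulls and the chunk argument are made up for illustration, and it
reads "raw" bytes rather than 1-byte integers, which should give the same
positions.

```r
# Hypothetical helper (not part of the original message): returns the file
# length in bytes and the 1-based positions of all null bytes.
find_nulls <- function(path, chunk = 1000000) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  zeros <- numeric()
  count <- 0
  repeat {
    bytes <- readBin(con, "raw", chunk)
    zeros <- c(zeros, count + which(bytes == as.raw(0)))
    count <- count + length(bytes)
    if (length(bytes) < chunk) break   # short read means end of file
  }
  list(size = count, zeros = zeros)
}

# Tiny test file: 10 bytes, with nulls at positions 3 and 8.
tmp <- tempfile()
writeBin(as.raw(c(32, 32, 0, 32, 32, 32, 32, 0, 32, 10)), tmp)

# A small chunk size forces several passes through the loop.
res <- find_nulls(tmp, chunk = 4)
res$size
res$zeros
```

A deliberately small chunk makes it easy to check that positions are still
right when the nulls fall in different chunks.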

Here's some code to recreate a file of the same length with nulls in the 
same places, and spaces everywhere else:

size <- count
f2 <- tempfile()
con <- file(f2, open="wb")
count <- 0
while (count < size) {
   nonzeros <- min(c(size - count, 1000000, zeros - 1))
   if (nonzeros) {
     writeBin(rep(32L, nonzeros), con, size = 1)
     count <- count + nonzeros
   }
   zeros <- zeros - nonzeros
   if (length(zeros) && min(zeros) == 1) {
     writeBin(0L, con, size = 1)
     count <- count + 1
     zeros <- zeros[-1] - 1
   }
}
close(con)
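
Before trusting the recreated file, it may be worth checking it on a small
case.  The sketch below wraps the loop above in a function (write_nulls and
its chunk argument are hypothetical names, not something from the original
message), writes a 10-byte file with nulls at positions 3 and 8, and reads
it back to confirm the length and the null positions.

```r
# Hypothetical wrapper around the recreation loop; `zeros` must be sorted
# 1-based positions of the null bytes, `size` the total file length.
write_nulls <- function(path, size, zeros, chunk = 1000000) {
  con <- file(path, open = "wb")
  on.exit(close(con))
  count <- 0
  while (count < size) {
    # Spaces to write before the next null, the chunk limit, or end of file.
    nonzeros <- min(c(size - count, chunk, zeros - 1))
    if (nonzeros) {
      writeBin(rep(32L, nonzeros), con, size = 1)
      count <- count + nonzeros
    }
    zeros <- zeros - nonzeros          # keep positions relative to count
    if (length(zeros) && min(zeros) == 1) {
      writeBin(0L, con, size = 1)
      count <- count + 1
      zeros <- zeros[-1] - 1
    }
  }
}

f2 <- tempfile()
write_nulls(f2, size = 10, zeros = c(3, 8), chunk = 4)

# Read the whole (small) file back and locate the nulls again.
bytes <- readBin(f2, "raw", n = 10)
file.size(f2)
which(bytes == as.raw(0))
```

If the length and positions match what went in, the same loop should be
safe to use for the full-size file.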

Duncan Murdoch


