[R] readLines without skipNul=TRUE causes crash
Duncan Murdoch
murdoch.duncan at gmail.com
Sat Jul 15 22:14:58 CEST 2017
On 15/07/2017 11:33 AM, Anthony Damico wrote:
> hi, i realized that the segfault happens on the text file in a new R
> session. so, creating the segfault-generating text file requires a
> contributed package, but prompting the actual segfault does not --
> pretty sure that means this is a base R bug? submitted here:
> https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=17311 hopefully i
> am not doing something remarkably stupid. the text file itself is 4GB
> so cannot upload it to bugzilla, and from the R_AllocStringBuffer error
> in the previous message, i think most or all of it needs to be there to
> trigger the segfault. thanks!
I don't want to download the big file or install the archive package.
Could you run the code below on the bad file? If you're right and it's
only nulls that matter, this might allow me to create a file that
triggers the bug.
f <- "path/to/bad/file"  # placeholder: put the filename of the bad file here
con <- file(f, open="rb")
count <- 0          # number of bytes read so far
zeros <- numeric()  # 1-based offsets of the null bytes
repeat {
  # read up to 1e6 bytes as one-byte integers
  bytes <- readBin(con, "int", 1000000, size=1)
  zeros <- c(zeros, count + which(bytes == 0))
  count <- count + length(bytes)
  if (length(bytes) < 1000000) break
}
close(con)
cat("File length=", count, "\n")
cat("Nulls:\n")
zeros
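As a sanity check on the scan above, here is a minimal sketch that runs the same chunked readBin loop on a small temporary file with nulls at known offsets. The helper name find_nulls, its chunk argument, and the test offsets are made up for illustration and are not part of the original code. The chunk size is kept tiny so that the loop crosses several chunk boundaries.

```r
# Hypothetical helper: the same loop as above, wrapped in a function so that
# it does not touch the count/zeros variables of the main script.
find_nulls <- function(path, chunk = 1000000) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  count <- 0
  zeros <- numeric()
  repeat {
    bytes <- readBin(con, "int", chunk, size = 1)
    zeros <- c(zeros, count + which(bytes == 0))
    count <- count + length(bytes)
    if (length(bytes) < chunk) break
  }
  list(size = count, zeros = zeros)
}

# Small test file: 25 spaces, with nulls at made-up 1-based offsets.
known <- c(3, 10, 11, 25)
payload <- rep(as.raw(32), 25)
payload[known] <- as.raw(0)
tmp <- tempfile()
writeBin(payload, tmp)

res <- find_nulls(tmp, chunk = 7)  # tiny chunk to exercise chunk boundaries
unlink(tmp)
res
```

If the offsets reported in res$zeros match known, the chunk bookkeeping (count plus the within-chunk index) is doing what it should.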
Here's some code to recreate a file of the same length with nulls in the
same places, and spaces everywhere else:
size <- count
f2 <- tempfile()
con <- file(f2, open="wb")
count <- 0
while (count < size) {
  nonzeros <- min(c(size - count, 1000000, zeros - 1))
  if (nonzeros) {
    writeBin(rep(32L, nonzeros), con, size = 1)
    count <- count + nonzeros
  }
  zeros <- zeros - nonzeros
  if (length(zeros) && min(zeros) == 1) {
    writeBin(0L, con, size = 1)
    count <- count + 1
    zeros <- zeros[-1] - 1
  }
}
close(con)
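The rebuild loop can be tried on a small made-up layout before running it at full size. The sketch below wraps the same loop in a hypothetical helper, write_null_layout (the name, the chunk argument, and the offsets are assumptions for illustration only), then reads the result back and checks where the nulls ended up. It assumes the offsets are sorted and 1-based, as the scanning code produces them.

```r
# Hypothetical helper wrapping the rebuild loop above; `zeros` must be sorted
# 1-based offsets of the null bytes.
write_null_layout <- function(path, size, zeros, chunk = 1000000) {
  con <- file(path, open = "wb")
  on.exit(close(con))
  count <- 0
  while (count < size) {
    # spaces to write before the next null, capped by chunk and file size
    nonzeros <- min(c(size - count, chunk, zeros - 1))
    if (nonzeros) {
      writeBin(rep(32L, nonzeros), con, size = 1)
      count <- count + nonzeros
    }
    zeros <- zeros - nonzeros
    if (length(zeros) && min(zeros) == 1) {
      writeBin(0L, con, size = 1)
      count <- count + 1
      zeros <- zeros[-1] - 1
    }
  }
  invisible(path)
}

# Made-up layout: 20 bytes, nulls at the first and last byte and two adjacent.
layout_size <- 20
layout_zeros <- c(1, 4, 5, 20)
out <- tempfile()
write_null_layout(out, layout_size, layout_zeros, chunk = 6)

rebuilt <- readBin(out, "raw", n = layout_size + 1)  # ask for one extra byte
unlink(out)
null_offsets <- which(rebuilt == as.raw(0))
null_offsets
```

Requesting one byte more than layout_size when reading back is a cheap way to notice if the loop wrote too much.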
Duncan Murdoch