[R] readLines without skipNul=TRUE causes crash
Anthony Damico
ajdamico at gmail.com
Sun Jul 16 12:40:38 CEST 2017
hi, the text file that prompts the segfault is 4gb but only 80,937 lines
> file.info( "S:/temp/crash.txt")
                        size isdir mode               mtime               ctime               atime exe
S:/temp/crash.txt 4078192743 FALSE  666 2017-07-15 17:24:35 2017-07-15 17:19:47 2017-07-15 17:19:47  no
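
For anyone following along without the 4 GB file, here is a minimal sketch of what skipNul = TRUE changes, on a tiny made-up file. The contents and the name `tmp` are assumptions for illustration only. This does not reproduce the segfault; it only shows nul bytes being dropped while the rest of each line is kept.

```r
# Hypothetical small file with an embedded nul byte (not the real crash.txt)
tmp <- tempfile(fileext = ".txt")
con <- file(tmp, open = "wb")
writeBin(as.raw(c(0x61, 0x00, 0x62, 0x0a,   # "a", nul, "b", newline
                  0x63, 0x64, 0x0a)), con)  # "c", "d", newline
close(con)

# With skipNul = TRUE the nul byte is skipped instead of ending the line
lines <- readLines(tmp, skipNul = TRUE)
print(lines)

# Line count and byte size, the same two figures quoted for the big file
n_lines <- length(lines)
n_bytes <- file.info(tmp)$size
cat("lines:", n_lines, " bytes:", n_bytes, "\n")
```

The byte size comes from file.info() as in the output above, and the line count from the length of what readLines() returns.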
On Sun, Jul 16, 2017 at 6:34 AM, Duncan Murdoch <murdoch.duncan at gmail.com>
wrote:
> On 16/07/2017 6:17 AM, Anthony Damico wrote:
>
>> thank you for taking the time to write this. i set it running last
>> night and it's still going -- if it doesn't finish by tomorrow, i will
>> try to find a site to host the problem file and add that link to the bug
>> report so the archive package can be avoided at least. i'm sorry for
>> the bother
>>
>>
> How big is that text file? I wouldn't expect my script to take more than
> a few minutes even on a huge file.
>
> My script might have a bug...
>
> Duncan Murdoch
>
>> On Sat, Jul 15, 2017 at 4:14 PM, Duncan Murdoch
>> <murdoch.duncan at gmail.com> wrote:
>>
>> On 15/07/2017 11:33 AM, Anthony Damico wrote:
>>
>> hi, i realized that the segfault happens on the text file in a new R
>> session. so, creating the segfault-generating text file requires a
>> contributed package, but prompting the actual segfault does not --
>> pretty sure that means this is a base R bug? submitted here:
>> https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=17311
>> hopefully i am not doing something remarkably stupid. the text file
>> itself is 4GB so cannot upload it to bugzilla, and from the
>> R_AllocStringBuffer error in the previous message, i think most or all
>> of it needs to be there to trigger the segfault. thanks!
>>
>>
>> I don't want to download the big file or install the archive
>> package. Could you run the code below on the bad file? If you're
>> right and it's only nulls that matter, this might allow me to create
>> a file that triggers the bug.
>>
>> f <- # put the filename of the bad file here
>>
>> con <- file(f, open="rb")
>> count <- 0  # bytes read so far; must be initialized before the loop
>> zeros <- numeric()
>> repeat {
>>   bytes <- readBin(con, "int", 1000000, size=1)
>>   zeros <- c(zeros, count + which(bytes == 0))
>>   count <- count + length(bytes)
>>   if (length(bytes) < 1000000) break
>> }
>> close(con)
>> cat("File length=", count, "\n")
>> cat("Nulls:\n")
>> zeros
>>
>> Here's some code to recreate a file of the same length with nulls in
>> the same places, and spaces everywhere else:
>>
>> size <- count
>> f2 <- tempfile()
>> con <- file(f2, open="wb")
>> count <- 0
>> while (count < size) {
>>   nonzeros <- min(c(size - count, 1000000, zeros - 1))
>>   if (nonzeros) {
>>     writeBin(rep(32L, nonzeros), con, size = 1)
>>     count <- count + nonzeros
>>   }
>>   zeros <- zeros - nonzeros
>>   if (length(zeros) && min(zeros) == 1) {
>>     writeBin(0L, con, size = 1)
>>     count <- count + 1
>>     zeros <- zeros[-1] - 1
>>   }
>> }
>> close(con)
>>
>> Duncan Murdoch
>>
>
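
The scanning loop quoted above can be tried on a small file first, to check that it reports the right positions before it is run on the 4 GB one. Below is a rough sketch of that idea. The helper name `find_nuls`, its `chunk` argument, and the 10-byte test file are all made up for illustration. It reads raw bytes rather than 1-byte integers, which is a swapped-in variation and not what the quoted script does.

```r
# Hypothetical helper: 1-based positions of nul bytes in `path`, read in
# chunks of `chunk` bytes so the whole file is never held in memory.
find_nuls <- function(path, chunk = 1e6) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  count <- 0          # bytes consumed so far
  zeros <- numeric()  # double, so positions past 2^31 - 1 can be stored
  repeat {
    bytes <- readBin(con, what = "raw", n = chunk)
    zeros <- c(zeros, count + which(bytes == as.raw(0)))
    count <- count + length(bytes)
    if (length(bytes) < chunk) break
  }
  list(size = count, zeros = zeros)
}

# Made-up test data: 10 bytes of spaces with nuls at positions 3 and 7
tmp <- tempfile()
payload <- as.raw(rep(32L, 10))
payload[c(3, 7)] <- as.raw(0)
writeBin(payload, tmp)

res <- find_nuls(tmp, chunk = 4)  # tiny chunk to exercise chunk boundaries
print(res)
```

The small chunk size forces several passes through the loop, so the running offset in `count` is exercised as well as the per-chunk search.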