[R] readLines without skipNul=TRUE causes crash

Anthony Damico ajdamico at gmail.com
Sun Jul 16 12:17:19 CEST 2017


thank you for taking the time to write this.  i set it running last night
and it's still going -- if it doesn't finish by tomorrow, i will try to
find a site to host the problem file and add that link to the bug report so
the archive package can be avoided at least.  i'm sorry for the bother

On Sat, Jul 15, 2017 at 4:14 PM, Duncan Murdoch <murdoch.duncan at gmail.com>
wrote:

> On 15/07/2017 11:33 AM, Anthony Damico wrote:
>
>> hi, i realized that the segfault happens on the text file in a new R
>> session.  so, creating the segfault-generating text file requires a
>> contributed package, but prompting the actual segfault does not --
>> pretty sure that means this is a base R bug?  submitted here:
>> https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=17311  hopefully i
>> am not doing something remarkably stupid.  the text file itself is 4GB
>> so cannot upload it to bugzilla, and from the R_AllocStringBuffer error
>> in the previous message, i think most or all of it needs to be there to
>> trigger the segfault.  thanks!
>>
>
> I don't want to download the big file or install the archive package.
> Could you run the code below on the bad file?  If you're right and it's
> only nulls that matter, this might allow me to create a file that triggers
> the bug.
>
> f <-  # put the filename of the bad file here
>
> con <- file(f, open="rb")
> zeros <- numeric()
> count <- 0  # bytes read so far; must be initialized before the loop
> repeat {
>   bytes <- readBin(con, "int", 1000000, size=1)
>   zeros <- c(zeros, count + which(bytes == 0))
>   count <- count + length(bytes)
>   if (length(bytes) < 1000000) break
> }
> close(con)
> cat("File length=", count, "\n")
> cat("Nulls:\n")
> zeros
>
> Here's some code to recreate a file of the same length with nulls in the
> same places, and spaces everywhere else:
>
> size <- count
> f2 <- tempfile()
> con <- file(f2, open="wb")
> count <- 0
> while (count < size) {
>   nonzeros <- min(c(size - count, 1000000, zeros - 1))
>   if (nonzeros) {
>     writeBin(rep(32L, nonzeros), con, size = 1)
>     count <- count + nonzeros
>   }
>   zeros <- zeros - nonzeros
>   if (length(zeros) && min(zeros) == 1) {
>     writeBin(0L, con, size = 1)
>     count <- count + 1
>     zeros <- zeros[-1] - 1
>   }
> }
> close(con)
>
> Duncan Murdoch
>
>
>
>
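
The scan quoted above can be tried on a small file first. The following is a minimal sketch, not a reproduction of the segfault: the 8-byte test file, its contents, and the tiny chunk size are all made up for illustration. It records the 1-based offsets of the nul bytes the same way the quoted loop does, then reads the file with readLines(skipNul = TRUE), which drops the nuls instead of truncating lines at them.

```r
# Hypothetical test file: "ab<nul>c\n<nul>d\n" (contents invented for this sketch).
f <- tempfile()
con <- file(f, open = "wb")
writeBin(as.raw(c(0x61, 0x62, 0x00, 0x63, 0x0a, 0x00, 0x64, 0x0a)), con)
close(con)

# Scan the file in chunks and record the 1-based offset of every nul byte.
chunk <- 4L          # deliberately tiny so the loop runs more than once
con <- file(f, open = "rb")
count <- 0           # bytes read so far
zeros <- numeric()   # offsets of nul bytes
repeat {
  bytes <- readBin(con, "int", chunk, size = 1)
  zeros <- c(zeros, count + which(bytes == 0))
  count <- count + length(bytes)
  if (length(bytes) < chunk) break
}
close(con)

# Read the same file as text, skipping embedded nuls.
lines <- readLines(f, skipNul = TRUE)

cat("File length=", count, "\n")
cat("Nulls:", zeros, "\n")
print(lines)
```

On the real 4GB file the chunk size would be much larger (the quoted code uses 1000000), and the vector of offsets could itself become large if the file contains many nuls.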
