[R] readLines without skipNul=TRUE causes crash
Duncan Murdoch
murdoch.duncan at gmail.com
Sun Jul 16 12:34:41 CEST 2017
On 16/07/2017 6:17 AM, Anthony Damico wrote:
> thank you for taking the time to write this. i set it running last
> night and it's still going -- if it doesn't finish by tomorrow, i will
> try to find a site to host the problem file and add that link to the bug
> report so the archive package can be avoided at least. i'm sorry for
> the bother
>
How big is that text file? I wouldn't expect my script to take more
than a few minutes even on a huge file.
My script might have a bug...
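One way to check the script for a bug is to run the same scanning logic on a small file with nulls at known positions. The following is only a sketch under stated assumptions: find_nulls() is a hypothetical helper name, not part of the original script, and it collects the per-chunk results in a list and combines them once at the end instead of growing zeros with c() on every pass. That change is a guess at what could make a run slow on a file with very many nulls, not a confirmed diagnosis.

```r
# Hypothetical helper (not from the original script): scan a file in
# fixed-size chunks and return its length and the 1-based positions
# of all zero bytes.
find_nulls <- function(path, chunk = 1000000L) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  count <- 0          # bytes read so far (double, so 4GB files are fine)
  found <- list()     # per-chunk positions, combined once at the end
  repeat {
    bytes <- readBin(con, "int", chunk, size = 1)
    if (length(bytes) == 0) break
    found[[length(found) + 1L]] <- count + which(bytes == 0)
    count <- count + length(bytes)
    if (length(bytes) < chunk) break
  }
  list(size = count, zeros = unlist(found))
}

# Small test file: 10 spaces, with nulls at positions 3 and 8
tf <- tempfile()
bytes <- rep(as.raw(32), 10)
bytes[c(3, 8)] <- as.raw(0)
writeBin(bytes, tf)

res <- find_nulls(tf)
res$size    # 10
res$zeros   # 3 8
```

If the reported positions match the ones written, the scanning loop itself is probably fine, and the run time on the large file is more likely down to how many nulls it contains.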
Duncan Murdoch
> On Sat, Jul 15, 2017 at 4:14 PM, Duncan Murdoch
> <murdoch.duncan at gmail.com> wrote:
>
> On 15/07/2017 11:33 AM, Anthony Damico wrote:
>
> hi, i realized that the segfault happens on the text file in a new R
> session. so, creating the segfault-generating text file requires a
> contributed package, but prompting the actual segfault does not --
> pretty sure that means this is a base R bug? submitted here:
> https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=17311
> hopefully i am not doing something remarkably stupid. the text file
> itself is 4GB so cannot upload it to bugzilla, and from the
> R_AllocStringBuffer error in the previous message, i think most or
> all of it needs to be there to trigger the segfault. thanks!
>
>
> I don't want to download the big file or install the archive
> package. Could you run the code below on the bad file? If you're
> right and it's only nulls that matter, this might allow me to create
> a file that triggers the bug.
>
> f <- # put the filename of the bad file here
>
> con <- file(f, open="rb")
> zeros <- numeric()
> count <- 0   # bytes read so far; must be initialized before the loop
> repeat {
>   bytes <- readBin(con, "int", 1000000, size=1)
>   zeros <- c(zeros, count + which(bytes == 0))
>   count <- count + length(bytes)
>   if (length(bytes) < 1000000) break
> }
> close(con)
> cat("File length=", count, "\n")
> cat("Nulls:\n")
> zeros
>
> Here's some code to recreate a file of the same length with nulls in
> the same places, and spaces everywhere else:
>
> size <- count
> f2 <- tempfile()
> con <- file(f2, open="wb")
> count <- 0
> while (count < size) {
>   nonzeros <- min(c(size - count, 1000000, zeros - 1))
>   if (nonzeros) {
>     writeBin(rep(32L, nonzeros), con, size = 1)
>     count <- count + nonzeros
>   }
>   zeros <- zeros - nonzeros
>   if (length(zeros) && min(zeros) == 1) {
>     writeBin(0L, con, size = 1)
>     count <- count + 1
>     zeros <- zeros[-1] - 1
>   }
> }
> close(con)
>
> Duncan Murdoch