[Rd] Very hard to reproduce bug (?) in R-devel

Winston Chang winstonchang1 at gmail.com
Wed Apr 5 23:30:34 CEST 2017


I just tried recompiling R with no -O flag, and I still get the same error.
Here are the CFLAGS (the RD program runs R-devel instead of R 3.3.3):

# RD CMD config CFLAGS
-g -fdebug-prefix-map=/build/r-base-3.3.3=. -fstack-protector-strong
-Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g

-Winston


On Wed, Apr 5, 2017 at 2:24 PM, Robert McGehee <rmcgehee at walleyetrading.net>
wrote:

> Winston,
> I had a similar experience to you tracking down an insanely difficult bug
> in my R code that "disappeared" whenever slight changes were made to the
> script (e.g. like adding cat() statements). In my case, it coincided with
> my over-eager compilation of R and its library stack, as I was also
> experimenting with a cutting edge version of the gcc compiler as well as
> what I thought were innocuous performance enhancing CFLAGS like -O3/-Ofast
> -march=native. After downgrading gcc and recompiling everything (R and
> BLAS) without the extra flags, the problem went away. Not saying this is
> your problem, just sharing my similar experience.
>
> <TANGENT> And for anyone interested, I did extensive benchmarking on the
> effects of the added CFLAGS and cutting edge gcc compilers, and I never saw
> any significant performance enhancement, and frequently saw a big
> performance penalty with extra flags, particularly as matrix algebra
> problems sometimes slowed down enormously when using a custom BLAS (ATLAS)
> compiled with anything fancy. Though nowadays, the out-of-the-box R BLAS
> seems much faster than it used to be, so I don't even bother fiddling with
> a custom BLAS. </TANGENT>
>
> --Robert
>
>
> -----Original Message-----
> From: R-devel [mailto:r-devel-bounces at r-project.org] On Behalf Of Winston
> Chang
> Sent: Wednesday, April 05, 2017 2:41 PM
> To: Martin Maechler <maechler at stat.math.ethz.ch>
> Cc: R Development <R-devel at r-project.org>
> Subject: Re: [Rd] Very hard to reproduce bug (?) in R-devel
>
> )
>
> On Wed, Apr 5, 2017 at 2:59 AM, Martin Maechler
> <maechler at stat.math.ethz.ch> wrote:
> >
> > >>>>> Winston Chang <winstonchang1 at gmail.com>
> > >>>>>     on Tue, 4 Apr 2017 15:29:40 -0500 writes:
> >
> >     > I've done some more investigation into the problem, and it is very
> >     > difficult to pin down. What it looks like is happening is roughly
> like this:
> >     > - `p` is an environment and `p$e` is also an environment.
> >     > - There is a loop. In each iteration, it looks for one item in
> `p$e`, saves
> >     > it in a variable `x`, then removes that item from `p$e`. Then it
> invokes
> >     > `x()`. The loop runs again, until there are no more items in `p$e`.
> >
> >     > The problem is that `ls(p$e)` sometimes returns the wrong values
> -- it
> >     > returns the values that it had in previous iterations of the loop.
> The
> >     > behavior is very touchy. Almost any change to the code will
> slightly change
> >     > the behavior; sometimes the `ls()` returns values from a different
> >     > iteration of the loop, and sometimes the problem doesn't happen at
> all.
> >
> >     > I've put a  Dockerfile and instructions for reproducing the
> problem here:
> >     > https://gist.github.com/wch/2596a1c9f1bcdee91bb210c782141c88
> >
> >     > I think that I've gotten about as far with this as I can, though
> I'd be
> >     > happy to provide more information if anyone wants to take look at
> the
> >     > problem.
> >
> > Dear Winston,
> >
> > While I agree this may very well be a bug in R(-devel), and hence
> > also R in 3.4.0 alpha and hence quite important to be dealt with,
> >
> > your code still involves 3 non-trivial  packages (DBI, R6,
> > testthat) some of which have their own C code and notably load
> > a couple of other package's namespaces.
> > We've always made a point
> >       https://www.r-project.org/bugs.html
> > that bugs in R should be reproducible without extra
> > packages... and I think it would definitely help to pinpoint the
> > issue to be seen outside of your extra packages' world.
> >
> > Or have you been aware of that and are just asking for help
> > finding a bug in one of the extra packages involved, a bug that might
> only be triggered by recent changes in R ?
> >
> > OTOH, what you describe above  (p ; p$e ; p$e$x ...)
> > should be reproducible in pure "base" R code, right?
> >
> > I'm sorry not to be of more help
> > Martin
>
>
> Of the four packages that are loaded when running the tests (pool,
> DBI, R6, testthat, magrittr, crayon), only testthat contains compiled
> code, and it is pretty minimal. The only compiled code in testthat
> that should be executed is a function that finds a label -- but that
> happens only after an error occurs.
>
> This is the sessionInfo():
> R Under development (unstable) (2017-03-23 r72389)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Debian GNU/Linux 9 (stretch)
>
> Matrix products: default
> BLAS: /usr/local/lib/R/lib/libRblas.so
> LAPACK: /usr/local/lib/R/lib/libRlapack.so
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] pool_0.1.0     DBI_0.6-1      testthat_1.0.2
>
> loaded via a namespace (and not attached):
> [1] compiler_3.5.0 magrittr_1.5   R6_2.2.0       crayon_1.3.2
>
>
> I have spent days trying to make a minimal example, and that is the
> best that I have been able to do so far. I would not involve all this
> complexity if I could avoid it. The problem is that the behavior is
> extremely sensitive to any changes. Seemingly-innocuous things like
> removing other tests, or adding a cat() statement can make the error
> disappear, or in some cases it makes ls() return values from a
> different (previous) iteration of the loop.
>
>
> In my testing, I have also seen a case where calls to `cat( ... ,
> file=stderr())` result in output that has the wrong order. I don't
> know if that's due to the cat() calls being executed in the wrong
> order, or if it's simply being printed or buffered in the wrong order.
>
> This is the code in question that cat()s to stderr:
> https://github.com/rstudio/pool/blob/0724ad9/R/scheduler.R#L74-L90
>       while (TRUE) {
>         tasks <- sort(ls(private$scheduledTasks))
>         if (length(tasks) == 0) break
>         t <- tasks[[1]]
>
>         s <- stderr()
>         cat(tasks, "--1--\n", file = s)
>         cat(ls(private$scheduledTasks), "--2--\n", file = s)
>         cat(t, "--3--\n", file = s)
>
>         task <- private$scheduledTasks[[t]]
>         rm(list = t, envir = private$scheduledTasks)
>
>         task()
>       }
>
> Without going into too much detail, it should print lines of text that
> end with --1--, --2--, --3--, and repeat. Here's what it prints
> instead when running the tests:
>
> 20170405-182549.466875-18 20170405-182559.456628-17
> 20170405-182729.456318-16 --1--
> 20170405-182549.466875-18 20170405-182559.456628-17
> 20170405-182729.456318-16 --1--
> 20170405-182549.466875-18 20170405-182559.456628-17
> 20170405-182729.456318-16 --2--
> 20170405-182549.466875-18 --3--
> 20170405-182559.456628-17 20170405-182729.456318-16 --1--
> 20170405-182559.456628-17 20170405-182729.456318-16 --2--
> 20170405-182559.456628-17 --3--
> 20170405-182729.456318-16 --1--
> 20170405-182729.456318-16 --2--
> 20170405-182729.456318-16 --3--
>  --2--
> 20170405-182549.466875-18 --3--
> 1. Error: pool scheduler: schedules things in the right order
> (@test-scheduling.R#13)
> could not find function "task"
> 1: naiveScheduler$protect({
>        scheduleTask(1e+05, function() {
>            results <<- c(results, 3L)
>        })
>        scheduleTask(10000, function() {
>            results <<- c(results, 2L)
>        })
>        scheduleTask(10, function() {
>            results <<- c(results, 1L)
>        })
>    }) at testthat/test-scheduling.R:13
> 2: private$refCount$release() at testthat/test-scheduling.R:13
> 3: private$callback()
>
> It's almost as though, in the middle of the first iteration of the
> while loop, R jumps to the next iteration of the loop, runs the loop a
> couple of times to completion, and then returns to the first iteration
> of the loop at the place that it left.
>
> This can be reproduced by following the instructions in this gist:
>   https://gist.github.com/wch/2596a1c9f1bcdee91bb210c782141c88
>
> Almost any change to the code will make the error disappear, or change
> to a different one.
>
>
> With regard to reproducing it in "base" R: I made a simple example
> using just R (no other packages) that does something similar to what
> happens in the tests, but even when I run it for 100,000 iterations,
> the error doesn't occur.
>
> I think there's a good chance that this is due to a bug in R. I have
> been trying to track down the cause of the problem but haven't been
> able find it.
>
> -Winston
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

	[[alternative HTML version deleted]]



More information about the R-devel mailing list