[Bioc-devel] Compilation flags, CHECK errors and BiocNeighbors
Aaron Lun
infinite@monkey@@with@keybo@rd@ @ending from gm@il@com
Fri Dec 21 13:45:27 CET 2018
Thanks Val. Looks like BiocNeighbors is all green again in the latest build, so that’s a relief.
One down, two to go - Windows CHECK failures seem to be tokay2’s idea of Christmas presents.
-A
> On 20 Dec 2018, at 19:52, Obenchain, Valerie <Valerie.Obenchain using roswellpark.org> wrote:
>
> The problem is that during the nightly builds, one of the Bioconductor
> packages writes out a .R/Makevars.win in biocbuild's HOME during R CMD
> build.
>
> Yesterday I removed the .R/ directory before the builds started and, as
> expected, today's NodeInfo on tokay2 and the packages using C++11 show
> the correct flags.
>
> If this .R/Makevars.win is not removed, it will (and did in the past)
> pollute the next build cycle such that the NodeInfo and all packages
> using C++11 would report/use the wrong flags.
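
A minimal sketch of why a stray user-level file wins, assuming (as in standard R builds) that the per-user Makevars file is read after the site Makeconf, so its assignments take precedence. The Makeconf lines are the ones quoted further down this thread; the Makevars.win contents and the helper name parse_make_variables are hypothetical, and the parser only understands plain "NAME = value" lines.

```python
import re

# Matches simple make-style assignments such as "CXX11FLAGS = -O2 -Wall".
ASSIGNMENT = re.compile(r"^\s*([A-Za-z0-9_]+)\s*=\s*(.*?)\s*$")


def parse_make_variables(text: str) -> dict:
    """Collect NAME = value pairs from make-style text (hypothetical helper)."""
    variables = {}
    for line in text.splitlines():
        match = ASSIGNMENT.match(line)
        if match:
            variables[match.group(1)] = match.group(2)
    return variables


# Site-wide settings, copied from the Makeconf excerpt quoted in this thread.
site_makeconf = """\
CXX11 = $(BINPREF)g++ $(M_ARCH)
CXX11FLAGS = -O2 -Wall $(DEBUGFLAG) -mtune=generic
CXX11PICFLAGS =
CXX11STD = -std=gnu++11
"""

# Illustrative contents of a stray ~/.R/Makevars.win (an assumption, not the real file).
user_makevars = "CXX11FLAGS = -O3 -march=native -mtune=native\n"

# Later assignments win, which mimics the user file being read after Makeconf.
effective = {**parse_make_variables(site_makeconf),
             **parse_make_variables(user_makevars)}

print(effective["CXX11FLAGS"])
print(effective["CXX11STD"])
```

Only the variables named in the user file change; everything else still comes from the Makeconf, which fits the observation in this thread that only the C++11 flags looked wrong.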
>
> I think I've narrowed down which package is doing this and will contact
> the maintainer. We'll also implement some sanitation code in the BBS to
> prevent this from happening again.
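
One possible shape for such a sanitation step, sketched under assumptions: the function name remove_user_makevars is hypothetical, this is not the actual BBS code, and the demonstration runs against a throwaway directory rather than the real biocbuild HOME.

```python
import shutil
import tempfile
from pathlib import Path


def remove_user_makevars(home: Path) -> bool:
    """Delete home/.R if it exists; return True when something was removed."""
    user_r_dir = home / ".R"
    if not user_r_dir.is_dir():
        return False
    shutil.rmtree(user_r_dir)
    return True


# Demonstration on a temporary directory standing in for the builder's HOME.
with tempfile.TemporaryDirectory() as tmp:
    fake_home = Path(tmp)
    (fake_home / ".R").mkdir()
    # Made-up file contents, only there so the directory is not empty.
    (fake_home / ".R" / "Makevars.win").write_text("CXX11FLAGS = -O3\n")

    removed_first = remove_user_makevars(fake_home)   # stray directory present
    removed_second = remove_user_makevars(fake_home)  # nothing left to remove
    still_there = (fake_home / ".R").exists()

print(removed_first, removed_second, still_there)
```

Running a step like this before each build cycle would match the manual fix described above; logging the removed file's contents first would also help identify the offending package.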
>
> The reason HOME is writable is that many applications need to create
> files (often hidden) such as lock files, cache, config files etc. If
> they can't, they'll break and they will sometimes break in a subtle way
> that is not immediately obvious.
>
> One last follow up is to explain why the previous iteration of the
> NodeInfo on the build report reported the incorrect C++11 flags. The
> problem there was that previously we were only picking up CXX1XFLAGS
> instead of the individual CXX11FLAGS, CXX14FLAGS etc.
>
> Thanks for being persistent on this issue and for bringing the
> conversation to bioc-devel.
>
> Val
>
>
>
> On 12/18/18 8:39 AM, Obenchain, Valerie wrote:
>> The devel build report hasn't posted yet but I took a look at the new
>> compiler flag output Herve implemented. The results show tokay2 is
>> indeed using
>>
>> CXX11FLAGS: -O3 -march=native -mtune=native
>>
>> This is inconsistent with what we have in the R/etc/<arch>/Makeconf for
>> both architectures on both tokay1 and tokay2. The Makeconf looks like this:
>>
>> CXX11 = $(BINPREF)g++ $(M_ARCH)
>> CXX11FLAGS = -O2 -Wall $(DEBUGFLAG) -mtune=generic
>> CXX11PICFLAGS =
>> CXX11STD = -std=gnu++11
>>
>> I don't know why the Makeconf is not being respected on tokay2. I can
>> confirm the inconsistency in an R session -
>>
>> tokay2:
>>
>> PS C:\Users\biocbuild\bbs-3.9-bioc\R> ./bin/R CMD config CXX11FLAGS
>> -O3 -march=native -mtune=native
>>
>> tokay1:
>>
>> PS C:\Users\biocbuild\bbs-3.8-bioc\R> ./bin/R CMD config CXX11FLAGS
>> -O2 -Wall -mtune=generic
>>
>> I'll work with Herve to resolve this.
>>
>> Val
>>
>>
>>
>> On 12/17/18 5:05 PM, Aaron Lun wrote:
>>> Thanks Val. I don’t think it’s a BiocNeighbors thing, as it doesn’t try
>>> to customize the compilation flags or have its own Makevars. Moreover,
>>> the “-O3 -mtune=native -mtune=generic” flags seem to show up on all of
>>> my packages containing C++11 code. Some cursory checks of other packages
>>> suggest that the correct flags (“-O2 -mtune=generic”) are used for C++98
>>> code.
>>>
>>> -A
>>>
>>>> On 17 Dec 2018, at 17:47, Obenchain, Valerie <Valerie.Obenchain using RoswellPark.org> wrote:
>>>>
>>>> Hi Aaron,
>>>>
>>>> The only compilation flags that are different for tokay1 (release) and
>>>> tokay2 (devel) are C++14 flags. BiocNeighbors is not using C++14 but
>>>> C++11 so I think the changes we discussed previously actually don't
>>>> apply to your case.
>>>>
>>>> All compilation flags we use are listed at the top of the build report,
>>>> e.g., for tokay2:
>>>>
>>>> https://www.bioconductor.org/checkResults/devel/bioc-LATEST/tokay2-NodeInfo.html
>>>>
>>>> I can look into this further but right now I'm not sure where the '-O3
>>>> -march=native -mtune=native' is coming from in the check output for
>>>> BiocNeighbors. We don't use 'native' on the builders for build/check or
>>>> for creating binaries.
>>>>
>>>> Herve might have more insight on this.
>>>>
>>>> Val
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 12/15/18 10:56 PM, Aaron Lun wrote:
>>>>> Sometime between 6-18 November, BiocNeighbors’ BioC-devel builds began failing on Windows 64-bit, and have continued to fail since:
>>>>>
>>>>> http://bioconductor.org/checkResults/devel/bioc-LATEST/BiocNeighbors/
>>>>>
>>>>> The most interesting part is the nature of the failures. They are not segmentation faults but rather “incorrect” output in the unit tests:
>>>>>
>>>>> - BiocNeighbors uses the Annoy algorithm for approximate nearest neighbor search, which is provided as a header-only C++ library in the RcppAnnoy package.
>>>>>
>>>>> - I have compiled the BiocNeighbors C++ code with an “#include” for these libraries to use the Annoy routines. For testing, I compared the output of my C++ code to the output of the code in the RcppAnnoy package.
>>>>>
>>>>> - It is these tests that are failing (i.e., the output does not match up) during CHECK on Windows 64-bit only, despite the fact that the same library is being “#include”d in both the BiocNeighbors and RcppAnnoy sources!
>>>>>
>>>>> What makes this particularly intriguing is that the differences between BiocNeighbors and RcppAnnoy are very minor. Less than 1% of the neighbor identities differ, and only for some of the scenarios, so it’s not an obvious bug that would be changing the output en masse. Now, the package also uses/tests Annoy in BioC-release but builds fine on tokay1:
>>>>>
>>>>> http://bioconductor.org/checkResults/release/bioc-LATEST/BiocNeighbors/
>>>>>
>>>>> The major difference between the Bioc-release/devel builds is the compilation flags, which have changed from “-O2 -mtune=generic” to “-O3 -march=native -mtune=native” in tokay2. I am told (thanks Val) that the timing of this change is consistent with the start of the BiocNeighbors build failures on tokay2. I would guess that RcppAnnoy is also compiled with “-O2 -mtune=generic” on the CRAN build systems, introducing differences in optimization levels between the BiocNeighbors and RcppAnnoy binaries. These could be responsible for the discrepancies in the search results.
>>>>>
>>>>> I was able to reproduce this on my Unix cluster (gcc 6.5.0) where setting “-march=native” with either “-O3” or “-O2” caused a difference in the calculations. After much trial and error, I eventually narrowed this down to the “-mfma” flag, which seems to change the precision of multiply-and-add operations and thus the search results. This occurs even when AVX support is turned off; I guess the compiler tries to be smart if it detects you are doing some kind of simultaneous multiply and addition, which is a pretty common thing to do when computing Euclidean distances.
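
The rounding effect behind this can be illustrated with a small sketch. It assumes IEEE 754 double precision and emulates a fused multiply-add with exact rational arithmetic rather than using the hardware instruction; the function names and input values are illustrative and are not taken from Annoy or BiocNeighbors.

```python
from fractions import Fraction


def multiply_add_two_roundings(a: float, b: float, c: float) -> float:
    """Ordinary a*b + c: the product is rounded, then the sum is rounded."""
    product = a * b
    return product + c


def multiply_add_one_rounding(a: float, b: float, c: float) -> float:
    """Emulated fused multiply-add: compute exactly, then round once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Values chosen so that the exact product, 1 - 2**-54, is not representable
# as a double and rounds to 1.0 when the multiplication is done on its own.
a = 1.0 + 2.0 ** -27
b = 1.0 - 2.0 ** -27
c = -1.0

unfused = multiply_add_two_roundings(a, b, c)  # 0.0
fused = multiply_add_one_rounding(a, b, c)     # -2**-54

print(unfused, fused)
```

Differences this small are harmless in isolation, but when two candidate neighbors are almost equidistant they can plausibly flip the ordering, which would be consistent with under 1% of neighbor identities changing.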
>>>>>
>>>>> In summary: can we not use “-march=native” on tokay2? (Val, I know we discussed this, but whatever changes you made to the compilation flags don’t seem to have propagated to the build machines.) As the case study with BiocNeighbors shows, this leads to inconsistencies between the CRAN and BioC-devel binaries for the same code, which unnecessarily complicates downstream usage and unit tests. I also wonder how binaries specialized for tokay2’s architecture would behave on other CPUs with different instruction sets, if they would run at all.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Aaron
>>>>> [[alternative HTML version deleted]]
>>>>>
>>>>> _______________________________________________
>>>>> Bioc-devel using r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>>>
>>>>
>>>>
>>>>
>>>> This email message may contain legally privileged and/or confidential information. If you are not the intended recipient(s), or the employee or agent responsible for the delivery of this message to the intended recipient(s), you are hereby notified that any disclosure, copying, distribution, or use of this email message is prohibited. If you have received this message in error, please notify the sender immediately by e-mail and delete this email message from your computer. Thank you.
>>>
>>>
>>
>>
>>
>>
>
>
>