[Bioc-devel] affxparser: Core dump with R 2.14.x on OSX [take #2]

Dan Tenenbaum dtenenba at fhcrc.org
Sat Jan 21 02:40:53 CET 2012


On Fri, Jan 20, 2012 at 4:43 PM, Dan Tenenbaum <dtenenba at fhcrc.org> wrote:
> On Fri, Jan 20, 2012 at 3:51 PM, Dan Tenenbaum <dtenenba at fhcrc.org> wrote:
>> On Fri, Jan 20, 2012 at 3:20 PM, Henrik Bengtsson <hb at biostat.ucsf.edu> wrote:
>>> [bringing back to the list, because we could need some help from other
>>> developers with access to various OSX versions]
>>>
>>> Hi Dan,
>>>
>>> thanks for looking into this.
>>>
>>> On Fri, Jan 20, 2012 at 2:17 PM, Dan Tenenbaum <dtenenba at fhcrc.org> wrote:
>>>> On Fri, Jan 20, 2012 at 1:57 PM, Dan Tenenbaum <dtenenba at fhcrc.org> wrote:
>>>>> On Fri, Jan 20, 2012 at 1:44 PM, Dan Tenenbaum <dtenenba at fhcrc.org> wrote:
>>>>>> Hi Henrik,
>>>>>>
>>>>>>
>>>>>> On Fri, Jan 20, 2012 at 10:33 AM, Henrik Bengtsson <hb at biostat.ucsf.edu> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> this is a kind request for the BioC team to have another look at
>>>>>>> fixing the binary affxparser builds.  Quite a few OSX users on R
>>>>>>> v2.14.0 have R crashing because of this problem.
>>>>>>
>>>>>> Thanks for the prompt and the detailed problem report.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Since thread 'Re: [Bioc-devel] affxparser: Core dump with R 2.14.0 on
>>>>>>> OSX' on Nov 7, 2011
>>>>>>> [https://stat.ethz.ch/pipermail/bioc-devel/2011-November/002969.html]
>>>>>>> became cluttered with mistakes, I'm starting a new thread on the same
>>>>>>> topic.
>>>>>>>
>>>>>>> PROBLEM:
>>>>>>> The binary build of affxparser v1.26.2 for OSX provided by
>>>>>>> Bioconductor is broken and causes R v2.14.0 to crash ("core dump",
>>>>>>> "abort trap", ...) on OSX 10.6 ("Snow Leopard") and (I assume; someone
>>>>>>> please confirm) OSX 10.7 ("Lion"),
>>>>>>
>>>>>> I can confirm that it happens on Lion too.
>>>>>>
>>>>>>> but not OSX 10.5 ("Leopard").  A
>>>>>>> reproducible example is:
>>>>>>>
>>>>>>> library("affxparser");
>>>>>>> readCdfHeader("Mapping10K_Xba142.cdf");
>>>>>>>
>>>>>>> which should return a named header. (Download CDF file:
>>>>>>> http://www.aroma-project.org/data/annotationData/chipTypes/Mapping10K_Xba142/Mapping10K_Xba142.CDF.gz
>>>>>>> ; 2.2Mb).  Another example is
>>>>>>> [http://www.aroma-project.org/data/annotationData/chipTypes/Mapping250K_Nsp/Mapping250K_Nsp.cdf.gz]:
>>>>>>>
>>>>>>> library("affxparser");
>>>>>>> readCdfHeader("Mapping250K_Nsp.cdf");
>>>>>>>
>>>>>>>
>>>>>>> CURRENT WORKAROUNDS:
>>>>>>> - Install affxparser from source
>>>>>>> [http://bioconductor.org/packages/2.9/bioc/src/contrib/affxparser_1.26.2.tar.gz].
>>>>>>> - Install Kasper Hansen's binary build (not universal?)
>>>>>>> [http://www.braju.com/R/repos/osx_10.6/affxparser_1.26.2.tgz] that
>>>>>>> works on (at least) OSX 10.6.8.
>>>>>>>
>>>>>>> See also aroma.affymetrix thread 'OSX 10.6 & 10.7 users: Workaround
>>>>>>> for faulty BioC build of affxparser v1.26.2' on Jan 14, 2012
>>>>>>> [https://groups.google.com/forum/#!topic/aroma-affymetrix/lEfDanThLEA/discussion]
>>>>>>>
>>>>>>>
>>>>>>> TROUBLESHOOTING:
>>>>>>> I can confirm that installing from source works on an OSX 10.6.8
>>>>>>> machine with R v2.14.1
>>>>>>> (http://cran.r-project.org/bin/macosx/R-2.14.1.pkg).  Installing
>>>>>>> Kasper's binary build also works.  I have a limited understanding of the
>>>>>>> different types of OSX package binaries and only have access to OSX 10.6.8,
>>>>>>> which makes it hard for me to do any more troubleshooting, but as far as I
>>>>>>> understand there is something wrong with the way affxparser is built
>>>>>>> on the Bioconductor servers.
>>>>>>>
>>>>>>
>>>>>>
>>>>>> An important fact to bear in mind is that the BioC Mac build servers
>>>>>> are running Leopard (OS X 10.5.8).
>>>>>>
>>>>>> It's a bit tricky to debug since it works fine on the platform it's
>>>>>> built on...but using primitive means (Rprintf() statements), I was
>>>>>> able to narrow down the problem to the
>>>>>> FileHeaderReader::ReadMagicNumber()
>>>>>> function in
>>>>>> affxparser/src/fusion_sdk/calvin_files/parsers/src/FileHeaderReader.cpp
>>>>>>
>>>>>> In that function, the expression
>>>>>> if (fileMagicNumber != DATA_FILE_MAGIC_NUMBER)
>>>>>> evaluates to true, and therefore an
>>>>>> affymetrix_calvin_exceptions::InvalidFileTypeException is thrown.
>>>>>>
>>>>>> I don't really know why the magic number is wrong, or would vary
>>>>>> between operating systems, but perhaps this gives you something to go
>>>>>> on?
>>>>>>
>>>>>> BTW, the trace is:
>>>>>> R: readCdfHeader()
>>>>>> C++:
>>>>>> R_affx_get_cdf_file_header()
>>>>>> FusionCDFData::ReadHeader()
>>>>>> FusionCDFData::CreateObject()
>>>>>> FusionCDFData::IsCalvinCompatibleFile()
>>>>>> GenericFileReader::ReadFileHeaderNoDataGroupHeader()
>>>>>> FileHeaderReader::Read()
>>>>>> FileHeaderReader::ReadMagicNumber()
>>>>>>
>>>>>> Hope this helps. If I can be of assistance in further debugging this,
>>>>>> please let me know.
>>>>>
>>>>> I should also have mentioned that in the ReadMagicNumber() function,
>>>>> fileMagicNumber == 67
>>>>> with the file Mapping10K_Xba142.cdf, and the expected magic number is 59.
>>>>
>>>> Sorry for all the emails, but here's one more piece of info:
>>>> If I run the package with the debug statements on pitt, our Leopard
>>>> build machine, it works fine, as expected, but it also reports that
>>>> fileMagicNumber is 67. So the exception is still thrown, but execution
>>>> continues.
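
(Inline note: a minimal sketch of why that exception may be harmless by
itself. It assumes, and this is not confirmed anywhere in this thread, that
67 is the magic number of the older binary "XDA" CDF layout, while 59 is the
Calvin value that ReadMagicNumber() expects. Under that assumption the throw
inside IsCalvinCompatibleFile() would only mean "not a Calvin file, try
another parser", which would fit execution continuing on Leopard. HeaderKind
and classifyFirstByte() are hypothetical names, not part of the Fusion SDK.)

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical classification of a CDF file by its first byte.
// 59 is the expected value reported in this thread; treating 67 as the
// XDA-style binary CDF marker is an assumption made for illustration.
enum class HeaderKind { Calvin, XdaCdf, Unknown };

HeaderKind classifyFirstByte(std::istream &in) {
    const int first = in.get();              // returns EOF (-1) on an empty stream
    if (first == 59) return HeaderKind::Calvin;
    if (first == 67) return HeaderKind::XdaCdf;
    return HeaderKind::Unknown;              // e.g. a text file or a truncated file
}

// Usage: a stream whose first byte is 67 is reported as XdaCdf instead of
// being signalled through an exception.
inline void usageExample() {
    std::istringstream xda(std::string(1, static_cast<char>(67)));
    assert(classifyFirstByte(xda) == HeaderKind::XdaCdf);
}
```

Returning a value from the probe, rather than throwing, avoids relying on
exception handling for a case that is expected to occur routinely.
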
>>>
>>> So when "execution continues" despite the incorrect magic number, do
>>> you still get a valid CDF header readout at the R prompt?
>>
>> Yes. Or at least, I assume it is valid. But no other errors are displayed.
>>
>> Here is what it displays:
>>
>> $ncols
>> [1] 658
>>
>> $nrows
>> [1] 658
>>
>> $nunits
>> [1] 10208
>>
>> $nqcunits
>> [1] 9
>>
>> $refseq
>> [1] ""
>>
>> $chiptype
>> [1] "Mapping10K_Xba142"
>>
>> $filename
>> [1] "./Mapping10K_Xba142.CDF"
>>
>> $rows
>> [1] 658
>>
>> $cols
>> [1] 658
>>
>> $probesets
>> [1] 10208
>>
>> $qcprobesets
>> [1] 9
>>
>> $reference
>> [1] ""
>>
>>
>>
>>>
>>>> Whereas on my Lion machine, execution ends (after a pause) with "Abort trap: 6".
>>>> So I am not sure whether this exception is really part of the problem,
>>>> or just a red herring.
>>>
>>> It could be a red herring; the incorrectly read magic header (first
>>> byte in the file) is just a side effect of something more complicated,
>>> but it is definitely a start.  It is also a hint that we could/should
>>> update affxparser to at least catch this and give an error instead of
>>> crashing (but I'm not sure if we should play with such updates while
>>> troubleshooting the real cause).
>>>
>>> There is one more important clue available. This problem started to
>>> occur with BioC 2.9 and R v2.14.x.  Previous BioC builds of affxparser
>>> did not cause this, and even after forcing an installation of the old
>>> affxparser v1.24.0 binaries on R v2.14.1 on OSX 10.6.8:
>>>
>>>  http://bioconductor.org/packages/2.8/bioc/bin/macosx/leopard/contrib/2.13/affxparser_1.24.0.tgz
>>>
>>> it works.  So, something "happened" between:
>>>
>>> affxparser_1.24.0.tgz:
>>> Packaged: 2011-04-15 09:35:06 UTC; biocbuild
>>> Built: R 2.13.0; universal-apple-darwin9.8.0; 2011-04-15 16:46:28 UTC; unix
>>> Archs: i386, ppc, x86_64
>>>
>>> and
>>>
>>> affxparser_1.26.2.tgz
>>> Packaged: 2011-11-17 06:38:13 UTC; biocbuild
>>> Built: R 2.14.0; universal-apple-darwin9.8.0; 2011-11-17 15:39:53 UTC; unix
>>> Archs: i386, ppc, x86_64
>>>
>>> (the first known report on this problem is from Nov 7, 2011
>>> [http://goo.gl/ZqBsW], which is before the date of the latter).  There
>>> is only one real update in affxparser v1.26.1, but that is in pure R
>>> code and more importantly not in code used in this bug report.  So,
>>> rebuilding affxparser v1.24.0 on the BioC server will most likely
>>> cause the same crash as affxparser v1.26.2 does.
>>>
>>>
>>> BTW, are you planning to update to R v2.14.1 on the BioC OSX servers?
>>> With some luck, maybe that will fix it.
>>
>> pitt is already running R 2.14.1, see:
>> http://bioconductor.org/checkResults/release/bioc-LATEST/pitt-NodeInfo.html
>>
>>>
>>> It would be great if someone else with OSX 10.5.8 ("Leopard") could
>>> build/install affxparser v1.26.2 from source and share it with us for
>>> testing on OSX 10.6 & 10.7; that would help narrow down the source of
>>> the problem.  If such a build works, then it is much more likely that
>>> there is something wrong with the BioC OSX 10.5.8 server setup, whereas if
>>> it also crashes, then we might have to search for the problem
>>> elsewhere.
>>
>> Sounds good, although we already tried this test, with another Leopard
>> machine we had, and the resulting package also crashed on newer OSes.
>> But it is probably still valuable to have someone else try this, and
>> if they create a package that works on newer OSes, then we can start
>> to look at differences in compilers, etc.
>>
>> BTW, the last known 'good' version was built on pelham, here is the
>> node info for that machine:
>> http://bioconductor.org/checkResults/2.8/bioc-20111021/pelham-NodeInfo.html
>>
>> It looks like the same versions of C and C++ compilers were used on both.
>> One difference is that R CMD config CXX on pelham is "g++-4.2 -arch
>> i386" whereas on pitt it is "g++ -arch i386", however, on pitt,
>> ls -l /usr/bin/g++
>> lrwxr-xr-x  1 root  wheel  7 Jun 29  2011 /usr/bin/g++ -> g++-4.2
>> so I guess they are really equivalent.
>> (The same is true of gfortran, if that matters.)
>>
>> This makes me wonder if the "something" that "happened" could have
>> happened between R 2.13 and 2.14. Of course, the problem seems to be
>> deep inside C++ code that is not even using SEXPs, but it's possible
>> something could have changed in .Call()...
>>
>> I'm in the process of doing further testing (trying to build my debug
>> version on Lion and seeing if the magic number mismatch occurs). Will
>> let you know what happens.
>
>
> A couple more data points:
> If I build affxparser on Lion, the magic number mismatch still occurs,
> but the R function call completes without error.
>
> If I comment out the "throw" in FileHeaderReader::ReadMagicNumber(),
> and build it on pitt, the same error occurs on Lion. So it doesn't
> really matter, apparently, whether that exception is thrown, but that
> function is as far as I am able to trace before the "abort trap".

Looks like I spoke too soon on this.
If I comment out that throw, and then the one in
FileHeaderReader::ReadVersion(),
then the code continues but dies a bit more verbosely:

R(90579) malloc: *** mmap(size=18446744071864250368) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug

 *** caught segfault ***
address 0x5, cause 'memory not mapped'

Traceback:
 1: .Call("R_affx_get_cdf_file_header", filename, PACKAGE = "affxparser")
 2: readCdfHeader("Mapping10K_Xba142.cdf")
aborting ...
Segmentation fault: 11
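
One thing the mmap size hints at: 18446744071864250368 equals
2^64 - 1845301248, so its upper 32 bits are all ones. That is what a
negative 32-bit int looks like once it is converted to a 64-bit size, so a
garbage (negative) length reaching an allocation would fit. Below is a small
sketch of the arithmetic only. The function names are made up for
illustration, and since the allocator presumably rounds the request up to
whole pages, the original length cannot be recovered exactly from the log.

```cpp
#include <cassert>
#include <cstdint>

// Convert a signed 32-bit length to an unsigned 64-bit size, as happens when
// a negative int is used where a 64-bit size_t is expected (sign extension).
constexpr std::uint64_t widenedAllocationSize(std::int32_t len) {
    return static_cast<std::uint64_t>(static_cast<std::int64_t>(len));
}

// True if the upper 33 bits are all ones, i.e. the value is what a negative
// 32-bit integer becomes after sign extension to 64 bits.
constexpr bool looksLikeNegativeInt32(std::uint64_t size) {
    return (size >> 31) == 0x1FFFFFFFFULL;
}

// The size from the malloc message corresponds to -1845301248.
static_assert(widenedAllocationSize(-1845301248) == 18446744071864250368ULL,
              "mmap size matches a sign-extended negative 32-bit value");
static_assert(looksLikeNegativeInt32(18446744071864250368ULL),
              "upper bits are all ones");

// Usage: any negative length shows the same pattern, a small positive one does not.
inline void usageExample() {
    assert(looksLikeNegativeInt32(widenedAllocationSize(-1)));
    assert(!looksLikeNegativeInt32(widenedAllocationSize(1024)));
}
```
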

The exact place it seems to die is one of the first two lines of
std::string FileInput::ReadString8(std::ifstream &instr, int32_t len)
in FileInput.cpp.
It's a bit tricky to tell what is going on because I am limited to
printf statements. I put one before each line in this function:

        Rprintf("rs8b_0\n");
        char *buf = new char [len+1];
        Rprintf("rs8b_1\n");
        instr.read(buf, len);
        // and so on...

but the output is in an unexpected order:
[...]
rs8b_1
rs8b_0

At any rate, that is the last line of debug output before the segfault.
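
Taking the last marker (rs8b_0) at face value, the failing statement would
be the allocation, which would fit a len that is negative or otherwise
implausible. Here is a minimal sketch of a guarded read, in case it is useful
for turning the crash into an ordinary error. It is not the SDK's code:
readString8Checked() and the maxLen cap are hypothetical, and it takes a
generic std::istream rather than the std::ifstream used in FileInput.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical guarded variant of a fixed-length string read: the length is
// validated before anything is allocated, and a short read is reported.
std::string readString8Checked(std::istream &in, std::int32_t len,
                               std::int32_t maxLen = 1 << 20) {
    if (len < 0 || len > maxLen) {
        throw std::length_error("implausible string length in file");
    }
    if (len == 0) {
        return std::string();
    }
    std::string buf(static_cast<std::size_t>(len), '\0');
    in.read(&buf[0], len);
    if (in.gcount() != static_cast<std::streamsize>(len)) {
        throw std::runtime_error("unexpected end of file while reading string");
    }
    return buf;
}

// Usage: a plausible length succeeds; a negative one is rejected up front.
inline void usageExample() {
    std::istringstream in("hello world");
    assert(readString8Checked(in, 5) == "hello");
    bool rejected = false;
    try {
        readString8Checked(in, -5);
    } catch (const std::length_error &) {
        rejected = true;
    }
    assert(rejected);
}
```

Using std::string as the buffer also avoids the manual new/delete pair, so
nothing leaks if the read fails.
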

HTH,
Dan


>
> FWIW,
> Dan
>
>> Dan
>>
>>
>>>
>>>
>>> Thanks,
>>>
>>> Henrik
>>>>
>>>> Hope this is helpful....
>>>> Dan
>>>>
>>>>
>>>>>
>>>>> Dan
>>>>>
>>>>>> Thanks,
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> Henrik
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Bioc-devel at r-project.org mailing list
>>>>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel


