[Bioc-devel] affxparser: Core dump with R 2.14.x on OSX [take #2]

Dan Tenenbaum dtenenba at fhcrc.org
Sat Jan 21 01:43:53 CET 2012


On Fri, Jan 20, 2012 at 3:51 PM, Dan Tenenbaum <dtenenba at fhcrc.org> wrote:
> On Fri, Jan 20, 2012 at 3:20 PM, Henrik Bengtsson <hb at biostat.ucsf.edu> wrote:
>> [bringing back to the list, because we could use some help from other
>> developers with access to various OSX versions]
>>
>> Hi Dan,
>>
>> thanks for looking into this.
>>
>> On Fri, Jan 20, 2012 at 2:17 PM, Dan Tenenbaum <dtenenba at fhcrc.org> wrote:
>>> On Fri, Jan 20, 2012 at 1:57 PM, Dan Tenenbaum <dtenenba at fhcrc.org> wrote:
>>>> On Fri, Jan 20, 2012 at 1:44 PM, Dan Tenenbaum <dtenenba at fhcrc.org> wrote:
>>>>> Hi Henrik,
>>>>>
>>>>>
>>>>> On Fri, Jan 20, 2012 at 10:33 AM, Henrik Bengtsson <hb at biostat.ucsf.edu> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> this is a kind request for the BioC team to have another look at
>>>>>> fixing the binary affxparser builds.  Quite a few OSX users on R
>>>>>> v2.14.0 have R crashing because of this problem.
>>>>>
>>>>> Thanks for the prompt and the detailed problem report.
>>>>>
>>>>>
>>>>>>
>>>>>> Since thread 'Re: [Bioc-devel] affxparser: Core dump with R 2.14.0 on
>>>>>> OSX' on Nov 7, 2011
>>>>>> [https://stat.ethz.ch/pipermail/bioc-devel/2011-November/002969.html]
>>>>>> became cluttered with mistakes, I'm starting a new thread on the same
>>>>>> topic.
>>>>>>
>>>>>> PROBLEM:
>>>>>> The binary build of affxparser v1.26.2 for OSX provided by
>>>>>> Bioconductor is broken and causes R v2.14.0 to crash ("core dump",
>>>>>> "abort trap", ...) on OSX 10.6 ("Snow Leopard") and (I assume; someone
>>>>>> please confirm) OSX 10.7 ("Lion"),
>>>>>
>>>>> I can confirm that it happens on Lion too.
>>>>>
>>>>>> but not OSX 10.5 ("Leopard").  A
>>>>>> reproducible example is:
>>>>>>
>>>>>> library("affxparser");
>>>>>> readCdfHeader("Mapping10K_Xba142.cdf");
>>>>>>
>>>>>> which should return a named header. (Download CDF file:
>>>>>> http://www.aroma-project.org/data/annotationData/chipTypes/Mapping10K_Xba142/Mapping10K_Xba142.CDF.gz
>>>>>> ; 2.2Mb).  Another example is
>>>>>> [http://www.aroma-project.org/data/annotationData/chipTypes/Mapping250K_Nsp/Mapping250K_Nsp.cdf.gz]:
>>>>>>
>>>>>> library("affxparser");
>>>>>> readCdfHeader("Mapping250K_Nsp.cdf");
>>>>>>
>>>>>>
>>>>>> CURRENT WORKAROUNDS:
>>>>>> - Install affxparser from source
>>>>>> [http://bioconductor.org/packages/2.9/bioc/src/contrib/affxparser_1.26.2.tar.gz].
>>>>>> - Install Kasper Hansen's binary build (not universal?)
>>>>>> [http://www.braju.com/R/repos/osx_10.6/affxparser_1.26.2.tgz] that
>>>>>> works on (at least) OSX 10.6.8.
>>>>>>
>>>>>> See also aroma.affymetrix thread 'OSX 10.6 & 10.7 users: Workaround
>>>>>> for faulty BioC build of affxparser v1.26.2' on Jan 14, 2012
>>>>>> [https://groups.google.com/forum/#!topic/aroma-affymetrix/lEfDanThLEA/discussion]
>>>>>>
>>>>>>
>>>>>> TROUBLESHOOTING:
>>>>>> I can confirm that installing from source works on an OSX 10.6.8
>>>>>> machine with R v2.14.1
>>>>>> (http://cran.r-project.org/bin/macosx/R-2.14.1.pkg).  Installing
>>>>>> Kasper's binary build also works.  I have a limited understanding of
>>>>>> the different types of OSX package binaries and only have access to
>>>>>> OSX 10.6.8, which makes it hard for me to do any more troubleshooting,
>>>>>> but as far as I understand there is something wrong with the way
>>>>>> affxparser is built on the Bioconductor servers.
>>>>>>
>>>>>
>>>>>
>>>>> An important fact to bear in mind is that the BioC Mac build servers
>>>>> are running Leopard (OS X 10.5.8).
>>>>>
>>>>> It's a bit tricky to debug since it works fine on the platform it's
>>>>> built on...but using primitive means (Rprintf() statements), I was
>>>>> able to narrow down the problem to the
>>>>> FileHeaderReader::ReadMagicNumber()
>>>>> function in
>>>>> affxparser/src/fusion_sdk/calvin_files/parsers/src/FileHeaderReader.cpp
>>>>>
>>>>> In that function, the expression
>>>>> if (fileMagicNumber != DATA_FILE_MAGIC_NUMBER)
>>>>> evaluates to true, and therefore an
>>>>> affymetrix_calvin_exceptions::InvalidFileTypeException is thrown.
>>>>>
>>>>> I don't really know why the magic number is wrong, or would vary
>>>>> between operating systems, but perhaps this gives you something to go
>>>>> on?
>>>>>
>>>>> BTW, the trace is:
>>>>> R: readCdfHeader()
>>>>> C++:
>>>>> R_affx_get_cdf_file_header()
>>>>> FusionCDFData::ReadHeader()
>>>>> FusionCDFData::CreateObject()
>>>>> FusionCDFData::IsCalvinCompatibleFile()
>>>>> GenericFileReader::ReadFileHeaderNoDataGroupHeader()
>>>>> FileHeaderReader::Read()
>>>>> FileHeaderReader::ReadMagicNumber()
>>>>>
>>>>> Hope this helps. If I can be of assistance in further debugging this,
>>>>> please let me know.
>>>>
>>>> I should also have mentioned that in the ReadMagicNumber() function,
>>>> fileMagicNumber == 67
>>>> with the file Mapping10K_Xba142.cdf, and the expected magic number is 59.
>>>
>>> Sorry for all the emails, but here's one more piece of info:
>>> If I run the package with the debug statements on pitt, our Leopard
>>> build machine, it works fine, as expected, but it also reports that
>>> fileMagicNumber is 67. So the exception is still thrown, but execution
>>> continues.
>>
>> So when "execution continues" despite the incorrect magic number, do
>> you still get a valid CDF header readout at the R prompt?
>
> Yes. Or at least, I assume it is valid. But no other errors are displayed.
>
> Here is what it displays:
>
> $ncols
> [1] 658
>
> $nrows
> [1] 658
>
> $nunits
> [1] 10208
>
> $nqcunits
> [1] 9
>
> $refseq
> [1] ""
>
> $chiptype
> [1] "Mapping10K_Xba142"
>
> $filename
> [1] "./Mapping10K_Xba142.CDF"
>
> $rows
> [1] 658
>
> $cols
> [1] 658
>
> $probesets
> [1] 10208
>
> $qcprobesets
> [1] 9
>
> $reference
> [1] ""
>
>
>
>>
>>> Whereas on my Lion machine, execution ends (after a pause) with "Abort trap: 6".
>>> So I am not sure whether this exception is really part of the problem,
>>> or just a red herring.
>>
>> It could be a red herring; the incorrectly read magic header (first
>> byte in the file) is just a side effect of something more complicated,
>> but it is definitely a start.  It is also a hint that we could/should
>> update affxparser to at least catch this and give an error instead of
>> crashing (but I'm not sure if we should play with such updates while
>> troubleshooting the real cause).
>>
>> There is one more important clue available. This problem started to
>> occur with BioC 2.9 and R v2.14.x.  Previous BioC builds of affxparser
>> did not cause this, and even when forcing an installation of the old
>> affxparser v1.24.0 binaries on R v2.14.1 on OSX 10.6.8:
>>
>>  http://bioconductor.org/packages/2.8/bioc/bin/macosx/leopard/contrib/2.13/affxparser_1.24.0.tgz
>>
>> it works.  So, something "happened" between:
>>
>> affxparser_1.24.0.tgz:
>> Packaged: 2011-04-15 09:35:06 UTC; biocbuild
>> Built: R 2.13.0; universal-apple-darwin9.8.0; 2011-04-15 16:46:28 UTC; unix
>> Archs: i386, ppc, x86_64
>>
>> and
>>
>> affxparser_1.26.2.tgz
>> Packaged: 2011-11-17 06:38:13 UTC; biocbuild
>> Built: R 2.14.0; universal-apple-darwin9.8.0; 2011-11-17 15:39:53 UTC; unix
>> Archs: i386, ppc, x86_64
>>
>> (the first known report on this problem is from Nov 7, 2011
>> [http://goo.gl/ZqBsW], which is before the date of the latter).  There
>> is only one real update in affxparser v1.26.1, but that is in pure R
>> code and more importantly not in code used in this bug report.  So,
>> rebuilding affxparser v1.24.0 on the BioC server will most likely
>> cause the same crash as affxparser v1.26.2 does.
>>
>>
>> BTW, are you planning to update to R v2.14.1 on the BioC OSX servers?
>> With some luck, maybe that will fix it.
>
> pitt is already running R 2.14.1, see:
> http://bioconductor.org/checkResults/release/bioc-LATEST/pitt-NodeInfo.html
>
>>
>> It would be great if someone else with OSX 10.5.8 ("Leopard") could
>> build/install affxparser v1.26.2 from source and share it with us for
>> testing on OSX 10.6 & 10.7; that would help narrow down the source of
>> the problem.  If such a build works, then it is much more likely that
>> there is something wrong with the BioC OSX 10.5.8 server setup, whereas if
>> it also crashes, then we might have to search for the problem
>> elsewhere.
>
> Sounds good, although we already tried this test, with another Leopard
> machine we had, and the resulting package also crashed on newer OSes.
> But it is probably still valuable to have someone else try this, and
> if they create a package that works on newer OSes, then we can start
> to look at differences in compilers, etc.
>
> BTW, the last known 'good' version was built on pelham, here is the
> node info for that machine:
> http://bioconductor.org/checkResults/2.8/bioc-20111021/pelham-NodeInfo.html
>
> It looks like the same versions of C and C++ compilers were used on both.
> One difference is that R CMD config CXX on pelham is "g++-4.2 -arch
> i386" whereas on pitt it is "g++ -arch i386", however, on pitt,
> ls -l /usr/bin/g++
> lrwxr-xr-x  1 root  wheel  7 Jun 29  2011 /usr/bin/g++ -> g++-4.2
> so I guess they are really equivalent.
> (The same is true of gfortran, if that matters.)
>
> This makes me wonder if the "something" that "happened" could have
> happened between R 2.13 and 2.14. Of course, the problem seems to be
> deep inside C++ code that is not even using SEXPs, but it's possible
> something could have changed in .Call()...
>
> I'm in the process of doing further testing (trying to build my debug
> version on Lion and seeing if the magic number mismatch occurs). Will
> let you know what happens.


A couple more data points:
If I build affxparser on Lion, the magic number mismatch still occurs,
but the R function call completes without error.
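
For anyone who wants to check the first byte of their own CDF without
rebuilding the package, here is a minimal standalone C++ sketch. It is
not code from affxparser or the Fusion SDK: the helper names
readMagicByte() and reportMagic() are made up for illustration, the
expected value 59 is simply the one reported earlier in this thread, and
it assumes the magic number is the first byte of the file, as Henrik
describes it.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Expected magic number for a Calvin-format file, as reported in this thread.
const int kExpectedMagic = 59;

// Hypothetical helper: returns the first byte of `path` (0..255),
// or -1 if the file cannot be opened or is empty.
int readMagicByte(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in) return -1;
    int c = in.get();      // sets the fail state if no byte could be read
    if (!in) return -1;
    return c;
}

// Hypothetical helper: prints observed vs. expected to stderr and
// returns true only when they match.
bool reportMagic(const std::string& path) {
    int magic = readMagicByte(path);
    std::fprintf(stderr, "%s: magic=%d expected=%d\n",
                 path.c_str(), magic, kExpectedMagic);
    return magic == kExpectedMagic;
}
```

Going by the numbers above, reportMagic("Mapping10K_Xba142.cdf") should
report a mismatch (67 vs. 59).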

If I comment out the "throw" in FileHeaderReader::ReadMagicNumber(),
and build it on pitt, the same error occurs on Lion. So it doesn't
really matter, apparently, whether that exception is thrown, but that
function is as far as I am able to trace before the "abort trap".
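
The "exception is thrown, but execution continues" behaviour seen on
pitt is what one would expect if some caller up the stack catches the
exception and treats it as "not a Calvin file". Whether the Fusion SDK
does exactly this is an assumption; the following is only a sketch of
that pattern, with hypothetical names (InvalidFileType, checkMagic(),
isCalvinCompatible()) standing in for the real classes and functions.

```cpp
#include <cassert>
#include <stdexcept>

// Expected magic number, as reported in this thread.
const int kCalvinMagic = 59;

// Hypothetical stand-in for
// affymetrix_calvin_exceptions::InvalidFileTypeException.
struct InvalidFileType : std::runtime_error {
    InvalidFileType() : std::runtime_error("invalid file type") {}
};

// Mirrors the check quoted from ReadMagicNumber(): throw on mismatch.
void checkMagic(int fileMagicNumber) {
    if (fileMagicNumber != kCalvinMagic) throw InvalidFileType();
}

// Hypothetical probe: a mismatch becomes `false` instead of propagating,
// so the caller can carry on and try another file format.
bool isCalvinCompatible(int fileMagicNumber) {
    try {
        checkMagic(fileMagicNumber);
        return true;
    } catch (const InvalidFileType&) {
        return false;
    }
}
```

With this shape, a value of 67 makes the probe return false and
execution continues normally, which would match what pitt shows.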

FWIW,
Dan

> Dan
>
>
>>
>>
>> Thanks,
>>
>> Henrik
>>>
>>> Hope this is helpful....
>>> Dan
>>>
>>>
>>>>
>>>> Dan
>>>>
>>>>> Thanks,
>>>>> Dan
>>>>>
>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Henrik
>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioc-devel at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel


