[Bioc-devel] Dimension (rows & columns) in the writeCDF function seems to be swapped

Kasper Daniel Hansen khansen at stat.berkeley.edu
Thu May 20 14:30:54 CEST 2010


Your zip file does not seem to be working.

Kasper

On Thu, May 20, 2010 at 6:12 AM, Groot, Philip de <philip.degroot at wur.nl> wrote:
> Hello Karst,
>
> You can download all required files via this link:
> https://sendit.wur.nl/Download.aspx?id=9eed2afd-40a8-4770-b44b-5c0b6e55084c
>
> Simply execute the commands in the Create_CDF.R script and everything you need will be created and installed.
>
> Note that also a fixed "PdInfo2Cdf.R" file is included. To create a CDF with the rows & columns NOT swapped (so that gives the error during RMA) please edit the script at the commented-out lines. Currently, the script creates a proper CDF that also does the RMA normalization properly.
>
> Thank you for your help!
>
> Regards,
>
> Dr. Philip de Groot Ph.D.
> Bioinformatics Researcher
>
> Wageningen University / TIFN
> Nutrigenomics Consortium
> Nutrition, Metabolism & Genomics Group
> Division of Human Nutrition
> PO Box 8129, 6700 EV Wageningen
> Visiting Address: Erfelijkheidsleer: De Valk, Building 304
> Dreijenweg 2, 6703 HA  Wageningen
> Room: 0052a
> T: +31-317-485786
> F: +31-317-483342
> E-mail:   Philip.deGroot at wur.nl
> Internet: http://www.nutrigenomicsconsortium.nl
>             http://humannutrition.wur.nl/
>             https://madmax.bioinformatics.nl/
> ________________________________________
> From: kasperdanielhansen at gmail.com [kasperdanielhansen at gmail.com] On Behalf Of Kasper Daniel Hansen [khansen at stat.berkeley.edu]
> Sent: 19 May 2010 18:34
> To: Groot, Philip de
> Cc: bioc-devel at stat.math.ethz.ch; Hooiveld, Guido
> Subject: Re: [Bioc-devel] Dimension (rows & columns) in the writeCDF function   seems to be swapped
>
> Hi Philip
>
> I am trying to understand what you report here.
>
> If I use the "wrong" cdf file and I read it using either readCdfHeader
> in affxparser or read.cdffile.list (full output below) in affyio I get
>  Cols = 990
>  Rows = 1190
> and I have looked at the binary file using hexdump and this is also
> what is in the file according to Affymetrix file format
> specifications.
>
> This means that I can understand your message as either
> 1) the file pdmogene11stv1_wrong.cdf does not reflect the dimensions
> in the PDinfo package.
> 2) Using this cdf file as input to makecdfenv yields wrong results.
>
> Now, when I install the pd.mogene.1.1.st.v1 package I get
>> pd.mogene.1.1.st.v1
> Class........: AffyGenePDInfo
> Manufacturer.: Affymetrix
> Genome Build.: MM09
> Chip Geometry: 1190 rows x  990 columns
>
> which looks correct.  Not that the pdinfo2cdf script posted on the
> aroma.affymetrix website has an error (the lines
>  celHead <- readCelHeader(celfile)
>  nrows <- celHead$rows
>  ncols <- celHead$rows
> ), but you say in your email you have modifed the script (which never
> made it through), and given that the CDF file corresponds to the
> pdinfo object you have probably fixed this).
>
> That leaves 2).  A casual inspection of the tarball
>  mogene11stv1cdf_2.1.0_notSwitched.tar.gz
> makes it look as if the dimensions here are correct as well.  However,
> I suspect I am missing something.
>
> In order to debug this further (and there could very well be an error
> somewhere, where cols and rows are switched) we will need your scripts
> for generating the normalized intensities and we probably also need
> the cell files.  Please modify your scripts to use the exact same
> filenames as the packages you have posted on sendit.
>
> And not that you cannot attach anything (or at least, your attachments
> does not make it through to me).
>
> Kasper
>
>
>> tmp = read.cdffile.list("pdmogene11stv1_wrong.cdf")
>> tmp$Header
> $Dimensions
>  MagicNumber VersionNumber          Cols          Rows     n.QCunits
>           67             1           990          1190             0
>      n.units     LenRefSeq
>        35556             0
>
> $ReseqRefSeq
> [1] ""
>
>> readCdfHeader("pdmogene11stv1_wrong.cdf")
> $ncols
> [1] 990
>
> $nrows
> [1] 1190
>
> $nunits
> [1] 35556
>
> $nqcunits
> [1] 0
>
> $refseq
> [1] ""
>
> $chiptype
> [1] "pdmogene11stv1_wrong"
>
> $filename
> [1] "./pdmogene11stv1_wrong.cdf"
>
> $rows
> [1] 1190
>
> $cols
> [1] 990
>
> $probesets
> [1] 35556
>
> $qcprobesets
> [1] 0
>
> $reference
> [1] ""
>
>>
>
>
> On Wed, May 19, 2010 at 4:00 AM, Groot, Philip de <philip.degroot at wur.nl> wrote:
>> Dear Kasper,
>>
>> I am contacting you because I am puzzled with an issue that is related to affxparser. I submitted this message to the bioc-devel list on april 29 (subject: "RE: [Bioc-devel] FW: problem in ReadAffy function (affy and affyio libraries)") but never received any reply. Interestingly, in the file writeCDF.R (affxparser source) the following comment is present:
>>
>> # 2006-09-11 /HB
>> # o BUG FIX: nrows & ncols were swapped in the CDF header.
>> The problem:
>> I succesfully created a "mogene11stv1cdf" library that can be used with the "affy" library. Benefit of this is that e.g. the affyPLM library can be used for creating informative plots. I applied rma utilizing affy and oligo and found exactly the same normalized intensities (see attached png-image).
>>
>> So far so good, BUT... in order to get rma working for affy, I needed to switch the number or rows and columns when creating the CDF-file. And this puzzles me a lot! I do not understand where this originates from.
>>
>> Let me explain. I use the oligo library and the "pd.mogene.1.1.st.v1" annotation file to create the CDF-file. I (together with Guido Hooiveld) adapted the original "PdInfo2Cdf.R" script for this purpose. Information on the original "PdInfo2Cdf.R" script is here: http://www.aroma-project.org/node/40. The adapted file is also in the attachment.
>>
>> The CDF-file is created as follows:
>> source("PdInfo2Cdf.R")
>> PdInfo2Cdf("pd.mogene.1.1.st.v1", <An appropriate .CEL-file>);
>>
>> library(makecdfenv)
>> make.cdf.package(file="pdmogene11stv1.cdf", packagename = "mogene11stv1cdf", author="Philip de Groot", maintainer="Philip de Groot <Philip.deGroot at wur.nl>", version="2.1.0", species="Mus_musculus")
>>
>> Both CDF-files are available in a single zip-file via sendit:
>> https://sendit.wur.nl/Download.aspx?id=9299e472-1d5a-4f83-af75-c10aa1faf94c
>>
>> Does somebody has an explanation for this? Is it a bug in writeCDF?
>>
>> Regards,
>>
>> Dr. Philip de Groot Ph.D.
>> Bioinformatics Researcher
>>
>> Wageningen University / TIFN
>> Nutrigenomics Consortium
>> Nutrition, Metabolism & Genomics Group
>> Division of Human Nutrition
>> PO Box 8129, 6700 EV Wageningen
>> Visiting Address: Erfelijkheidsleer: De Valk, Building 304
>> Dreijenweg 2, 6703 HA  Wageningen
>> Room: 0052a
>> T: +31-317-485786
>> F: +31-317-483342
>> E-mail:   Philip.deGroot at wur.nl<mailto:Philip.deGroot at wur.nl>
>> Internet: http://www.nutrigenomicsconsortium.nl<http://www.nutrigenomicsconsortium.nl/>
>>             http://humannutrition.wur.nl/
>>             https://madmax.bioinformatics.nl/
>>
>> _______________________________________________
>> Bioc-devel at stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>>
> _______________________________________________
> Bioc-devel at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>



More information about the Bioc-devel mailing list