[Bioc-devel] FW: problem in ReadAffy function (affy and affyio libraries)

Groot, Philip de philip.degroot at wur.nl
Thu Apr 29 16:06:31 CEST 2010


Hello Ben,

Thank you for your efforts! Clearly, I misinterpreted the "size of arrays" line. Could the output be made clearer (e.g. by labeling the dimensions)? However, the reason I ran into this problem is the following.

I successfully created a "mogene11stv1cdf" library that can be used with the "affy" library. The benefit is that, e.g., the affyPLM library can then be used to create informative plots. I applied rma using both affy and oligo and obtained exactly the same normalized intensities (see attached png-image).

So far so good, BUT... in order to get rma working with affy, I needed to switch the number of rows and columns when creating the CDF-file. And this puzzles me a lot! I do not understand where this comes from. This is why I suspected a problem in affy, but apparently that is not the case.
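
To make the row/column question concrete, here is a minimal base-R sketch of how a swapped dimension pair changes the position of a cell. It assumes that cell values are stored with x varying fastest, so that the one-based index of a zero-based (x, y) position is y * ncol + x + 1. The helper cell_index() is hypothetical and is not part of affy, affyio or oligo.

```r
# Hypothetical helper (not from any package): one-based cell index for a
# zero-based (x, y) position, assuming x varies fastest within each row.
cell_index <- function(x, y, ncol) {
  y * ncol + x + 1
}

x <- 5
y <- 2

# Index when the array is taken to have 990 columns.
idx_ok <- cell_index(x, y, ncol = 990)

# Index when rows and columns are swapped, i.e. 1190 columns.
idx_swapped <- cell_index(x, y, ncol = 1190)

print(c(idx_ok = idx_ok, idx_swapped = idx_swapped))
```

Under this assumption the two indices differ for any y > 0 on a rectangular array, while on a square array the swap changes nothing because ncol is the same either way.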

Let me explain. I use the oligo library and the "pd.mogene.1.1.st.v1" annotation file to create the CDF-file. I (together with Guido Hooiveld) adapted the original "PdInfo2Cdf.R" script for this purpose. Information on the original "PdInfo2Cdf.R" script is here: http://www.aroma-project.org/node/40. The adapted file is also in the attachment.

The CDF-file is created as follows: 
source("PdInfo2Cdf.R")
PdInfo2Cdf("pd.mogene.1.1.st.v1", <An appropriate .CEL-file>);

library(makecdfenv)
make.cdf.package(file="pdmogene11stv1.cdf", packagename = "mogene11stv1cdf", author="Philip de Groot", maintainer="Philip de Groot <Philip.deGroot at wur.nl>", version="2.1.0", species="Mus_musculus")

Both CDF-files are available in a single zip-file via sendit:
https://sendit.wur.nl/Download.aspx?id=dff061c6-ac50-4231-bc61-b48be7ecda2d

Does anybody have an explanation for this?

Regards,

Dr. Philip de Groot Ph.D.
Bioinformatics Researcher

Wageningen University / TIFN
Nutrigenomics Consortium
Nutrition, Metabolism & Genomics Group
Division of Human Nutrition
PO Box 8129, 6700 EV Wageningen
Visiting Address: Erfelijkheidsleer: De Valk, Building 304
Dreijenweg 2, 6703 HA  Wageningen
Room: 0052a
T: +31-317-485786
F: +31-317-483342
E-mail:   Philip.deGroot at wur.nl
Internet: http://www.nutrigenomicsconsortium.nl
             http://humannutrition.wur.nl/
             https://madmax.bioinformatics.nl/
________________________________________
From: bmb at bmbolstad.com [bmb at bmbolstad.com]
Sent: 28 April 2010 22:27
To: Groot, Philip de
Cc: bmb at bmbolstad.com; bioc-devel at stat.math.ethz.ch
Subject: RE: [Bioc-devel] FW: problem in ReadAffy function (affy and affyio   libraries)

Well, I can investigate further, but this is not the first rectangular
chip encountered by the affyio code. For instance the SNP6.0 array is in a
similar format and can be read successfully as can be seen by examining
images of the intensity data and looking for control regions and so forth.

Furthermore, ReadAffy has always returned columns before rows. Why?
Because that is the historic order in the original text CEL file format.
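
As an illustration of the ordering described above, the following base-R sketch reads the "size of arrays" pair from the quoted AffyBatch summary as columns first, then rows, and compares it with the rows and cols values quoted from readCelHeader(). The numbers are copied from this thread; the variable names are chosen only for this example.

```r
# Dimension pair as printed in the AffyBatch summary quoted in this thread.
affy_size <- c(990, 1190)

# Assumption taken from the statement above: columns come first, then rows.
dims <- c(cols = affy_size[1], rows = affy_size[2])

# Values quoted from readCelHeader() in the original report.
header <- c(rows = 1190, cols = 990)

# The two sources agree once the pair is read in columns-then-rows order.
agree <- all(dims[c("rows", "cols")] == header[c("rows", "cols")])
print(agree)

# The total number of cells does not depend on the order.
n_cells <- prod(affy_size)
print(n_cells)
```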

Ben


> Dear Ben,
>
> ReadAffy is reading the data in incorrectly because the wrong chip
> dimensions are reported. Consequently, the dimension of the intensity
> matrix is also wrong.
>
> Regards,
>
> Dr. Philip de Groot Ph.D.
> ________________________________________
> From: bmb at bmbolstad.com [bmb at bmbolstad.com]
> Sent: 28 April 2010 18:26
> To: Groot, Philip de
> Cc: bioc-devel at stat.math.ethz.ch
> Subject: Re: [Bioc-devel] FW: problem in ReadAffy function (affy and
> affyio  libraries)
>
> Just to be clear are you reporting that ReadAffy is reading in the data
> incorrectly or just that row and column information is reported in
> incorrect order?
>
>> I guess that this list is a more appropriate place... (see message
>> below)
>>
>> Regards,
>>
>> Dr. Philip de Groot Ph.D.
>>
>> ________________________________
>> From: Groot, Philip de
>> Sent: 28 April 2010 16:40
>> To: bmb at bmbolstad.com; rafa at jhu.edu
>> Cc: bioconductor at stat.math.ethz.ch
>> Subject: problem in ReadAffy function (affy and affyio libraries)
>>
>> Hello all,
>>
>> I am working with the Affymetrix GeneTitan Gene ST plates
>> ("mogene11stv1"
>> arrays). Information on these arrays via the Affymetrix website:
>> http://www.affymetrix.com/support/technical/byproduct.affx?product=MoGene-1_1-st-v1
>>
>> When using the "affy" library to load the .CEL-files, I obtain the
>> following AffyBatch object:
>> > x <- ReadAffy()
>> > x
>> AffyBatch object
>> size of arrays=990x1190 features (20 kb)
>> cdf=MoGene-1_1-st-v1 (35556 affyids)
>> number of samples=9
>> number of genes=35556
>> annotation=mogene11stv1
>> notes=
>> Please notice that the "size of arrays" is wrong: x and y have been
>> switched. When I use the "affxparser" library to obtain the "size of
>> arrays" things work out fine:
>> > library(affxparser)
>> > celHead <- readCelHeader(list.celfiles()[1])
>> > c(celHead$rows, celHead$cols)
>> [1] 1190  990
>> Now the "size of arrays" is correct.
>>
>> I located the problem in the ReadAffy() function in the following line:
>> function: read.celfile.header (called from within the ReadAffy function)
>>
>>     headdetails <- .Call("ReadHeader", filename, PACKAGE = "affyio")
>> As you can see, the actual problem is in the affyio library, where the
>> .Call() command returns an inappropriate "headdetails" object.
>> Consequently, the dimensions of "exprs(AffyBatch)", after loading the
>> .CEL-files, is also wrong.
>>
>> Note that for the Gene ST version 1.0 arrays the above is not a problem
>> because the arrays are square. The Gene ST arrays version 1.1 are
>> rectangularly defined. The problem is also present in R-2.10.1.
>>
>> Can this problem be fixed please?
>>
>> > sessionInfo()
>> R version 2.11.0 (2010-04-22)
>> x86_64-unknown-linux-gnu
>> locale:
>>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
>>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>> other attached packages:
>> [1] affxparser_1.20.0     mogene11stv1cdf_2.1.0 affy_1.26.0
>> [4] Biobase_2.8.0
>> loaded via a namespace (and not attached):
>> [1] affyio_1.16.0         preprocessCore_1.10.0 tools_2.11.0
>> Regards,
>>
>> Dr. Philip de Groot Ph.D.
>>
>>
>

