[BioC] affy : ReadAffy() fails on HuGene-1_1-st-v1 chips
James W. MacDonald
jmacdon at med.umich.edu
Thu Jan 26 15:32:29 CET 2012
Hi Philip,
On 1/26/2012 4:28 AM, Groot, Philip de wrote:
> Hello all,
>
> Just to be sure:
>
>> If you follow the discussion that Mike linked to, this has been corrected in the devel version of the affy package. I made this change because it didn't have an adverse effect on the intended target of the affy package, which is the 3' biased arrays. I have not made the change in the release version because it isn't a bug.
> I think it is not nice that the problem will recur every time a new release comes out. So I do hope that the patch is included in the next Bioconductor release. Please acknowledge!
For those who don't understand how the BioC release cycle works, here is
a short primer. At any one time there are two versions: the release
version, which is considered to be stable, and the devel version, upon
which developers are still working.
At each release the developers finish up all changes they have made to
their packages, and the devel version is then split into a new release
branch, which is then 'released'. This new release is then considered to
be stable, and only bug fixes of sufficient gravity can be made. Since
this patch doesn't fix a bug, it was not applied to the release version.
However, because each new release is branched from devel, all changes
made to code in the devel version will, by definition, make their way
into the next release.
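As a rough illustration of the primer above (a sketch only, not code from
affy): Bioconductor package versions have the form x.y.z, where an even y
marks a release branch and an odd y marks devel. So affy 1.32.x is the
current release and 1.33.x is devel. The two helper names below are made up
for this example, and the 1.33.1 threshold is simply the version reported
elsewhere in this thread as containing the ReadAffy() change.

```r
# Hypothetical helpers; neither is part of affy or Bioconductor.

# TRUE if a version string such as "1.33.1" belongs to a devel branch,
# i.e. its middle number is odd.
is_devel_version <- function(v) {
  minor <- as.integer(strsplit(v, ".", fixed = TRUE)[[1]][2])
  minor %% 2 == 1
}

# TRUE if the version is at least 1.33.1, the version reported in this
# thread as carrying the ReadAffy() change.
has_readaffy_patch <- function(v) {
  package_version(v) >= package_version("1.33.1")
}

is_devel_version("1.32.0")     # current release branch
has_readaffy_patch("1.32.0")   # release: change not applied
has_readaffy_patch("1.34.0")   # the release branched from devel 1.33.x

# To check an installed copy of affy instead:
# has_readaffy_patch(as.character(packageVersion("affy")))
```

Any release numbered after the devel series necessarily compares as newer
than 1.33.1, which is the sense in which the change reaches the next release.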
>
> In addition, I thoroughly tested affy and oligo RMA normalization using either the CDF (http://nmg-r.bioinformatics.nl/NuGO_R.html) or the pd.mapping (Bioconductor oligo) libraries. The RMA results are identical up to the last digit!
>
> In conclusion: it works both ways, so let's support it properly then! Note: I do agree that the oligo package is better suited for handling 3rd generation Affymetrix arrays, but intentionally sabotaging the affy library (sorry, but it just looks like this) is not the way to force people to move to oligo. Just my 2 cents.
That is a pretty harsh condemnation, and I will assume that you don't
really mean it like it sounds, so I will try to show restraint.
A little background: several years ago it became clear that Affymetrix
was going to have many more types of chips than just the original 3'
biased chips for which the affy/makecdfenv pipeline was developed. After
some discussion, Rafael Irizarry decided that rather than trying to
reverse engineer an already existing and popular package to support all
these new chips (in the six month span between releases), it would be
better to create an entirely new pipeline that is intended to support
ALL chips that Affy produces. The amount of time it took to get
oligo/pdInfoBuilder to the current matured state is testament to the
wisdom of that choice. Trying to 'fix' affy in six months would have
been a disaster.
So, three points:
1.) Characterizing this as sabotage is (arrogant, ignorant, foolish,
infuriating). I leave it to others to decide which.
2.) The affy and makecdfenv packages are open source. If you (or anybody
else, for that matter) want to fork the code into your own package that
supports all and sundry, please feel free to do so.
3.) The original plan was for the affy package to be deprecated, and
then removed from BioC. In deference to the vast user base who use this
package, and the existing personal code that is based on affy, it was
not deprecated. In addition, we have made changes where we can to make
affy accommodate these new chips, even when it isn't in anybody's
interest to do so. This, I believe, invalidates your accusation that
people are being 'forced to move to oligo'.
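For anyone who does want to try oligo on these chips, a minimal sketch
follows. It assumes the Bioconductor packages oligo and pd.hugene.1.1.st.v1
are installed; the wrapper name run_oligo_rma and its cel_dir argument are
made up for this example, and the file pattern is an assumption about how
the CEL files are named.

```r
# Sketch only: needs the Bioconductor package oligo and the matching
# pd.hugene.1.1.st.v1 annotation package to actually run RMA.
run_oligo_rma <- function(cel_dir = ".") {
  # Collect CEL files (case-insensitive extension match).
  cel_files <- list.files(cel_dir, pattern = "\\.CEL$",
                          full.names = TRUE, ignore.case = TRUE)
  if (length(cel_files) == 0) {
    stop("no CEL files found in ", cel_dir)
  }
  if (!requireNamespace("oligo", quietly = TRUE)) {
    stop("the oligo package is not installed")
  }
  raw <- oligo::read.celfiles(cel_files)  # read the raw intensities
  oligo::rma(raw)                         # RMA-summarized expression values
}

# Usage, from a directory holding the CEL files:
# eset <- run_oligo_rma(".")
```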
Best,
Jim
>
> Regards,
>
> Dr. Philip de Groot
> Bioinformatician / Microarray analysis expert
>
> Wageningen University / TIFN
> Netherlands Nutrigenomics Center (NNC)
> Nutrition, Metabolism & Genomics Group
> Division of Human Nutrition
> PO Box 8129, 6700 EV Wageningen
> Visiting Address:
> "De Valk" ("Erfelijkheidsleer"),
> Building 304,
> Verbindingsweg 4, 6703 HC Wageningen
> Room: 0052a
> T: 0317 485786
> F: 0317 483342
> E-mail: Philip.deGroot at wur.nl
> I: http://humannutrition.wur.nl
> https://madmax.bioinformatics.nl
> http://www.nutrigenomicsconsortium.nl
>
>
>
> -----Original Message-----
> From: Osselaer, Steven [JRDBE Extern] [mailto:SOSSELAE at ITS.JNJ.COM]
> Sent: dinsdag 24 januari 2012 15:22
> To: James W. MacDonald
> Cc: Goehlmann, Hinrich [JRDBE]; bioconductor at r-project.org
> Subject: Re: [BioC] affy : ReadAffy() fails on HuGene-1_1-st-v1 chips
>
> Thank you for this information, James.
> We will look into this and try to start using 'oligo' for these types of arrays.
>
> Kind regards,
> Steven
>
> -----Original Message-----
> From: James W. MacDonald [mailto:jmacdon at med.umich.edu]
> Sent: Tuesday, 24 January 2012 15:18
> To: Osselaer, Steven [JRDBE Extern]
> Cc: Mike Smith; bioconductor at r-project.org
> Subject: Re: [BioC] affy : ReadAffy() fails on HuGene-1_1-st-v1 chips
>
> Hi Steven,
>
> If you follow the discussion that Mike linked to, this has been corrected in the devel version of the affy package. I made this change because it didn't have an adverse effect on the intended target of the affy package, which is the 3' biased arrays. I have not made the change in the release version because it isn't a bug.
>
> I also made this change because people seem to want to use the affy package for analyzing the Gene ST chips even though it was never intended for this purpose, and doesn't really do a good job. The oligo package is intended to be used with these chips, and that is the package we recommend you use.
>
> I think some of the hesitation to use oligo stems from the fact that it had a long development cycle, and in earlier incarnations was not completely documented. This is no longer true, and I would recommend you at least take a look.
>
> Best,
>
> Jim
>
>
>
> On 1/24/2012 9:01 AM, Osselaer, Steven [JRDBE Extern] wrote:
>> Thanks a lot, Mike.
>>
>> Applying the patch makes the ReadAffy() call functional again for these
>> types of chips.
>>
>>
>>
>> Kind regards,
>>
>> Steven Osselaer
>>
>>
>>
>> From: Mike Smith [mailto:grimbough at gmail.com]
>> Sent: Tuesday, 24 January 2012 14:30
>> To: Osselaer, Steven [JRDBE Extern]
>> Cc: bioconductor at r-project.org
>> Subject: Re: [BioC] affy : ReadAffy() fails on HuGene-1_1-st-v1 chips
>>
>>
>>
>> Hi Steven,
>>
>>
>>
>> I think this may be related to a problem that was raised on the
>> Bioc-devel mailing list a couple of months ago:
>>
>>
>>
>> https://stat.ethz.ch/pipermail/bioc-devel/2011-November/002955.html
>>
>>
>>
>> If indeed it's the same issue then the discussion above indicates it was
>> patched in affy version 1.33.1
>>
>>
>>
>> Mike
>>
>>
>>
>> On Tue, Jan 24, 2012 at 1:07 PM, Osselaer, Steven [JRDBE Extern]
>> <SOSSELAE at its.jnj.com> wrote:
>>
>> Dear Wolfgang,
>>
>> I was under the impression that it was a problem with the software as I
>> can read the same CEL files with the R 2.13.1 software: see transcript
>> for the same code but run under R 2.13.1 below.
>>
>> Kind regards,
>> Steven
>>
>> R version 2.13.1 (2011-07-08)
>>
>> Copyright (C) 2011 The R Foundation for Statistical Computing ISBN
>> 3-900051-07-0
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>
>> R is free software and comes with ABSOLUTELY NO WARRANTY.
>> You are welcome to redistribute it under certain conditions.
>> Type 'license()' or 'licence()' for distribution details.
>>
>> Natural language support but running in an English locale
>>
>> R is a collaborative project with many contributors.
>> Type 'contributors()' for more information and 'citation()' on how to
>> cite R or R packages in publications.
>>
>> Type 'demo()' for some demos, 'help()' for on-line help, or
>> 'help.start()' for an HTML browser interface to help.
>> Type 'q()' to quit R.
>>
>>> library(affy)
>> Loading required package: Biobase
>>
>> Welcome to Bioconductor
>>
>> Vignettes contain introductory material. To view, type
>> 'browseVignettes()'. To cite Bioconductor, see
>> 'citation("Biobase")' and for packages 'citation("pkgname")'.
>>
>>> sessionInfo()
>> R version 2.13.1 (2011-07-08)
>>
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>
>> locale:
>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>>
>> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
>>
>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>>
>> [1] affy_1.30.0 Biobase_2.12.2
>>
>>
>> loaded via a namespace (and not attached):
>>
>> [1] affyio_1.20.0 preprocessCore_1.14.0
>>
>>> celFiles<- list.files(pattern="CEL$")
>>> celFiles
>> [1] "27002.CEL" "27003.CEL" "27004.CEL" "27005.CEL" "27006.CEL" "27007.CEL"
>> [7] "27008.CEL" "27009.CEL" "27010.CEL" "27011.CEL" "27012.CEL" "27013.CEL"
>> [13] "27014.CEL" "27015.CEL" "27016.CEL" "27017.CEL" "27018.CEL" "27019.CEL"
>> [19] "27020.CEL" "27021.CEL" "27022.CEL" "27023.CEL" "27024.CEL" "27025.CEL"
>> [25] "27026.CEL" "27027.CEL" "27028.CEL" "27029.CEL" "27030.CEL" "27031.CEL"
>> [31] "27032.CEL" "27033.CEL" "27034.CEL" "27035.CEL" "27036.CEL" "27037.CEL"
>> [37] "27038.CEL" "27039.CEL" "27040.CEL" "27041.CEL" "27042.CEL" "27043.CEL"
>> [43] "27044.CEL" "27045.CEL" "27046.CEL" "27047.CEL" "27048.CEL" "27049.CEL"
>> [49] "27050.CEL" "27051.CEL" "27052.CEL" "27053.CEL" "27054.CEL" "27055.CEL"
>> [55] "27056.CEL" "27057.CEL" "27058.CEL" "27059.CEL" "27060.CEL" "27061.CEL"
>> [61] "27062.CEL" "27063.CEL" "27064.CEL" "27065.CEL" "27066.CEL" "27067.CEL"
>> [67] "27068.CEL" "27069.CEL" "27070.CEL" "27071.CEL" "27072.CEL" "27073.CEL"
>> [73] "27074.CEL" "27075.CEL" "27076.CEL" "27077.CEL" "27078.CEL" "27079.CEL"
>> [79] "27080.CEL" "27081.CEL" "27082.CEL" "27083.CEL" "27084.CEL" "27085.CEL"
>> [85] "27086.CEL" "27087.CEL" "27088.CEL" "27089.CEL" "27090.CEL" "27091.CEL"
>> [91] "27092.CEL" "27093.CEL" "27094.CEL" "27095.CEL" "27096.CEL"
>>> rawData<- ReadAffy(filenames=celFiles)
>>>
>>> q()
>>
>> -----Original Message-----
>> From: bioconductor-bounces at r-project.org
>> [mailto:bioconductor-bounces at r-project.org] On Behalf Of Wolfgang Huber
>> Sent: Tuesday, 24 January 2012 13:59
>> To: bioconductor at r-project.org
>> Subject: Re: [BioC] affy : ReadAffy() fails on HuGene-1_1-st-v1 chips
>>
>>
>> Dear Steven
>>
>> thank you. What is your question, or why and how do you think someone
>> other than the party who gave you the apparently faulty CEL file can
>> help you?
>>
>> Best wishes
>> Wolfgang
>>
>>
>>
>>
>> Steven Osselaer [guest] scripsit 01/24/2012 10:42 AM:
>>> Reading HuGene-1_1-st-v1 CEL files results in an error message about
>>> incorrect dimensions of the first CEL file of the list
>>> TRANSCRIPT :
>>>
>>> R version 2.14.1 (2011-12-22)
>>> Copyright (C) 2011 The R Foundation for Statistical Computing ISBN
>>> 3-900051-07-0
>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>
>>> R is free software and comes with ABSOLUTELY NO WARRANTY.
>>> You are welcome to redistribute it under certain conditions.
>>> Type 'license()' or 'licence()' for distribution details.
>>>
>>> Natural language support but running in an English locale
>>>
>>> R is a collaborative project with many contributors.
>>> Type 'contributors()' for more information and 'citation()' on how to
>>> cite R or R packages in publications.
>>>
>>> Type 'demo()' for some demos, 'help()' for on-line help, or
>>> 'help.start()' for an HTML browser interface to help.
>>> Type 'q()' to quit R.
>>>
>>>> library(affy)
>>> Loading required package: Biobase
>>>
>>> Welcome to Bioconductor
>>>
>>> Vignettes contain introductory material. To view, type
>>> 'browseVignettes()'. To cite Bioconductor, see
>>> 'citation("Biobase")' and for packages 'citation("pkgname")'.
>>>
>>>> sessionInfo()
>>> R version 2.14.1 (2011-12-22)
>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>
>>> locale:
>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>>> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
>>> [7] LC_PAPER=C LC_NAME=C
>>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] stats graphics grDevices utils datasets methods base
>>>
>>> other attached packages:
>>> [1] affy_1.32.0 Biobase_2.14.0
>>>
>>> loaded via a namespace (and not attached):
>>> [1] affyio_1.22.0 BiocInstaller_1.2.1 preprocessCore_1.16.0
>>> [4] zlibbioc_1.0.0
>>>> celFiles<- list.files(pattern="CEL$")
>>>> celFiles
>>> [1] "27002.CEL" "27003.CEL" "27004.CEL" "27005.CEL" "27006.CEL" "27007.CEL"
>>> [7] "27008.CEL" "27009.CEL" "27010.CEL" "27011.CEL" "27012.CEL" "27013.CEL"
>>> [13] "27014.CEL" "27015.CEL" "27016.CEL" "27017.CEL" "27018.CEL" "27019.CEL"
>>> [19] "27020.CEL" "27021.CEL" "27022.CEL" "27023.CEL" "27024.CEL" "27025.CEL"
>>> [25] "27026.CEL" "27027.CEL" "27028.CEL" "27029.CEL" "27030.CEL" "27031.CEL"
>>> [31] "27032.CEL" "27033.CEL" "27034.CEL" "27035.CEL" "27036.CEL" "27037.CEL"
>>> [37] "27038.CEL" "27039.CEL" "27040.CEL" "27041.CEL" "27042.CEL" "27043.CEL"
>>> [43] "27044.CEL" "27045.CEL" "27046.CEL" "27047.CEL" "27048.CEL" "27049.CEL"
>>> [49] "27050.CEL" "27051.CEL" "27052.CEL" "27053.CEL" "27054.CEL" "27055.CEL"
>>> [55] "27056.CEL" "27057.CEL" "27058.CEL" "27059.CEL" "27060.CEL" "27061.CEL"
>>> [61] "27062.CEL" "27063.CEL" "27064.CEL" "27065.CEL" "27066.CEL" "27067.CEL"
>>> [67] "27068.CEL" "27069.CEL" "27070.CEL" "27071.CEL" "27072.CEL" "27073.CEL"
>>> [73] "27074.CEL" "27075.CEL" "27076.CEL" "27077.CEL" "27078.CEL" "27079.CEL"
>>> [79] "27080.CEL" "27081.CEL" "27082.CEL" "27083.CEL" "27084.CEL" "27085.CEL"
>>> [85] "27086.CEL" "27087.CEL" "27088.CEL" "27089.CEL" "27090.CEL" "27091.CEL"
>>> [91] "27092.CEL" "27093.CEL" "27094.CEL" "27095.CEL" "27096.CEL"
>>>> rawData<- ReadAffy(filenames=celFiles)
>>> Error in read.affybatch(filenames = l$filenames, phenoData = l$phenoData, :
>>> Cel file 27002.CEL does not seem to have the correct dimensions
>>>> traceback()
>>> 3: .Call("read_abatch", filenames, rm.mask, rm.outliers, rm.extra,
>>> ref.cdfName, dim.intensity[c("Rows", "Cols")], verbose, PACKAGE = "affyio")
>>> 2: read.affybatch(filenames = l$filenames, phenoData = l$phenoData,
>>> description = l$description, notes = notes, compress = compress,
>>> rm.mask = rm.mask, rm.outliers = rm.outliers, rm.extra = rm.extra,
>>> verbose = verbose, sd = sd, cdfname = cdfname)
>>> 1: ReadAffy(filenames = celFiles)
>>>
>>>
>>> -- output of sessionInfo():
>>>
>>> R version 2.14.1 (2011-12-22)
>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>
>>> locale:
>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>>> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
>>> [7] LC_PAPER=C LC_NAME=C
>>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] stats graphics grDevices utils datasets methods base
>>>
>>> other attached packages:
>>> [1] affy_1.32.0 Biobase_2.14.0
>>>
>>> loaded via a namespace (and not attached):
>>> [1] affyio_1.22.0 BiocInstaller_1.2.1 preprocessCore_1.16.0
>>> [4] zlibbioc_1.0.0
>>>
>>>
>>> --
>>> Sent via the guest posting facility at bioconductor.org.
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>>
>> --
>> Best wishes
>> Wolfgang
>>
>> Wolfgang Huber
>> EMBL
>> http://www.embl.de/research/units/genome_biology/huber
>>
> --
> James W. MacDonald, M.S.
> Biostatistician
> Douglas Lab
> University of Michigan
> Department of Human Genetics
> 5912 Buhl
> 1241 E. Catherine St.
> Ann Arbor MI 48109-5618
> 734-615-7826
>
> **********************************************************
> Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues
>
>
>
--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues