[BioC] affy : ReadAffy() fails on HuGene-1_1-st-v1 chips
James W. MacDonald
jmacdon at med.umich.edu
Mon Jan 30 16:21:04 CET 2012
Hi Ben,
Thanks for testing. I agree about the weird annotation, but it doesn't
come from name indexing.
The changes you and Kasper made are here (note this is the current code,
and the changes you made in Nov 2010 are extant in both release and devel):
return(new("AffyBatch",
           exprs = exprs,
           se.exprs = .Call("read_abatch_stddev",filenames, rm.mask,
                            rm.outliers, rm.extra, ref.cdfName,
                            dim.intensity,verbose, PACKAGE="affyio"),
           cdfName = cdfname, ##cel@cdfName,
           phenoData = phenoData,
           nrow = dim.intensity[1],##["Rows"],
           ncol = dim.intensity[2],##["Cols"],
           annotation = cleancdfname(cdfname, addcdf=FALSE),
           protocolData = protocol,
           description= description,
           notes = notes))
The change I made is
exprs <- .Call("read_abatch",filenames, rm.mask,
               rm.outliers, rm.extra, ref.cdfName,
               dim.intensity[c(1,2)],verbose, PACKAGE="affyio")
where originally dim.intensity was indexed by c("Rows","Cols").
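To make the difference between the two indexing styles concrete, here is a minimal sketch in plain R. It assumes the header vector is named and lists columns before rows, as reported for these chips in this thread; the vector below is a hypothetical stand-in, and nothing here calls affy or affyio.

```r
# Hypothetical stand-in for the named dimension vector from a CEL header;
# the Cols-before-Rows order is an assumption taken from this thread.
dim.intensity <- c(Cols = 990L, Rows = 1190L)

# Positional indexing keeps whatever order the header happens to use.
by_position <- dim.intensity[c(1, 2)]

# Name-based indexing reorders to Rows, Cols regardless of header order.
by_name <- dim.intensity[c("Rows", "Cols")]

# `[` matches names exactly, so singular names return NA rather than erroring.
misspelled <- dim.intensity[c("Row", "Col")]

print(by_position)
print(by_name)
print(misspelled)
```

Whether the C code behind `.Call("read_abatch", ...)` expects rows first or columns first is not something this sketch can settle; it only shows that the two indexing styles hand over the values in different orders.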
If we debug() read.affybatch(), we get this:
headdetails
$cdfName
[1] "HuGene-1_1-st-v1"
$`CEL dimensions`
Cols Rows
990 1190
This suggests to me that, when instantiating the AffyBatch, ncol should
perhaps be dim.intensity[1] and nrow should be dim.intensity[2], i.e. the
opposite of the current code.
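A toy illustration of why this only shows up on non-square arrays, in plain R. The six intensities and both header vectors are made up, and the sketch says nothing about how affy itself lays out intensities; it only contrasts taking rows and columns by name with taking them by position from a columns-first vector.

```r
# Six made-up intensities and a hypothetical columns-first header.
intensities <- 1:6
dims <- c(Cols = 3L, Rows = 2L)

# Rows and columns taken by name.
by_name <- matrix(intensities, nrow = dims[["Rows"]], ncol = dims[["Cols"]])

# Rows and columns taken by position, which swaps them for this header.
by_position <- matrix(intensities, nrow = dims[[1]], ncol = dims[[2]])

print(dim(by_name))
print(dim(by_position))

# On a square array the two readings cannot be told apart.
square <- c(Cols = 2L, Rows = 2L)
same_when_square <- identical(
  matrix(1:4, nrow = square[["Rows"]], ncol = square[["Cols"]]),
  matrix(1:4, nrow = square[[1]], ncol = square[[2]])
)
print(same_when_square)
```

With equal row and column counts the swap is invisible, which is consistent with the problem going unnoticed on the older square chips.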
An example with these same HuGene 1.1 ST arrays:
> dat
AffyBatch object
size of arrays=1190x990 features (17 kb)
cdf=HuGene-1_1-st-v1 (33297 affyids)
number of samples=2
number of genes=33297
annotation=hugene11stv1
notes=
> nrow(dat)
Rows
1190
> ncol(dat)
Cols
990
> dim(dat)
Rows Cols
1190 990
> sessionInfo()
R Under development (unstable) (2011-08-04 r56624)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.iso885915 LC_NUMERIC=C
[3] LC_TIME=en_US.iso885915 LC_COLLATE=en_US.iso885915
[5] LC_MONETARY=en_US.iso885915 LC_MESSAGES=en_US.iso885915
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.iso885915 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] hgfocuscdf_2.8.0 hugene11stv1cdf_2.4.0 AnnotationDbi_1.15.10
[4] affyPLM_1.29.2 preprocessCore_1.15.0 gcrma_2.25.2
[7] affy_1.32.1 Biobase_2.13.7
loaded via a namespace (and not attached):
[1] affyio_1.21.2 BiocInstaller_1.2.1 Biostrings_2.21.9
[4] DBI_0.2-5 IRanges_1.11.24 RSQLite_0.9-4
[7] splines_2.14.0 tools_2.14.0 zlibbioc_0.1.7
>
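The two dim() results quoted in this thread are labelled differently, and a small check makes that explicit. The values below are copied from the transcripts; `labels_match_accessors` is a hypothetical helper written for this sketch, not a function from affy or Biobase.

```r
# dim() results copied from the two transcripts in this thread.
dims_jim <- c(Rows = 1190L, Cols = 990L)
dims_ben <- c(Cols = 990L, Rows = 1190L)

# Hypothetical helper (not part of affy): TRUE when the first element is
# labelled "Rows" and the second "Cols".
labels_match_accessors <- function(d) {
  identical(names(d), c("Rows", "Cols"))
}

print(labels_match_accessors(dims_jim))
print(labels_match_accessors(dims_ben))

# Both orderings describe the same number of features.
print(prod(dims_jim) == prod(dims_ben))
```

A check like this only inspects the labels; it cannot tell which orientation the image() methods actually need.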
Best,
Jim
On 1/28/2012 9:21 PM, Ben Bolstad wrote:
> Hi Jim,
>
> Grabbing some data from GEO and testing, as far as I can tell the current affy 1.33.2 image() command is fine for HuGene 1.1 st, at least if you use the brainarray cdfenv.
>
> Similarly the affyPLM images look correctly oriented etc.
>
> One note, ncol() and nrow() on an AffyBatch object do look weirdly annotated in this context (one of the reasons I fought with the use of name indexing in my patch downthread).
>
> In any case, I don't see why the release branch could not be patched with this bug fix besides the strong coercive effects already discussed below. But then again, I'm not eating my own dog food much these days, so I'm not going to attempt it myself lest I make things worse.
>
> Best,
>
> Ben
>
>
>> Data<- ReadAffy("GSM801200_G032A_A09_3_JGRA2_P1.CEL.gz","GSM801203_G032A_B09_6_JNIX4_P2.CEL.gz",cdfname="hugene11stv1hsentrezgcdf")
>> Data
> AffyBatch object
> size of arrays=990x1190 features (17 kb)
> cdf=hugene11stv1hsentrezgcdf (19738 affyids)
> number of samples=2
> number of genes=19738
> annotation=hugene11stv1hsentrezgcdf
> notes=
>> nrow(Data)
> Cols
> 990
>> ncol(Data)
> Rows
> 1190
>> dim(data)
> NULL
>> dim(Data)
> Cols Rows
> 990 1190
>> sessionInfo()
> R version 2.14.1 (2011-12-22)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=C LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] hugene11stv1hsentrezgcdf_14.1.0 affyPLM_1.30.0
> [3] preprocessCore_1.16.0 gcrma_2.26.0
> [5] affy_1.33.2 Biobase_2.14.0
> [7] BiocGenerics_0.1.4
>
> loaded via a namespace (and not attached):
> [1] affyio_1.22.0 BiocInstaller_1.2.1 Biostrings_2.22.0
> [4] IRanges_1.12.5 splines_2.14.1 tools_2.14.1
> [7] zlibbioc_1.0.0
>
>
> On Fri, 2012-01-27 at 13:32 -0500, James W. MacDonald wrote:
>> Hi Ben,
>>
>> I can't find anything in the archives corresponding to images being
>> wrong. The only two threads I can find are these:
>>
>> https://stat.ethz.ch/pipermail/bioc-devel/2010-May/002209.html
>> https://stat.ethz.ch/pipermail/bioc-devel/2010-April/002177.html
>>
>> But I wonder if there is still a problem.
>>
>> Will someone who has GeneTitan data please use either the devel (or
>> self-patched) version of affy to read in some chips, use fitPLM() in
>> affyPLM to summarize, and then see if the image() function works correctly?
>>
>> Best,
>>
>> Jim
>>
>>
>>
>> On 1/27/2012 11:39 AM, Ben Bolstad wrote:
>>> I probably should not be responding, lest I say something stupid
>>> (though I agree that I am highly imperfect). Plus, I am away from both
>>> my email archives and any sort of machine that I can use to test
>>> things out which only exacerbates the chances of incorrectness.
>>>
>>> However, any changes that I made were most probably in response to an
>>> urgent user request to "make things work now". My vague recollections
>>> are something to do with making chip images, probably of the affyPLM
>>> variety (though perhaps image() on an AffyBatch as well) correct.
>>>
>>> In any case read.affybatch() calls the parser in affyio. The parser in
>>> affyio returns the chip dimensions in the order in which they are read
>>> from the CEL file. According to the documentation that Affymetrix
>>> places on their website (the documentation used to write the parser,
>>> at least past the point at which it was no longer reverse engineered
>>> from the old text CEL format) "Columns" are reported before "Rows" in
>>> the text and xda CEL formats (not too sure about generic "calvin"
>>> format right this minute). Based on that convention the function has
>>> always returned them in that order (though I agree this is
>>> non-conventional).
>>>
>>> It may be that CEL file "Rows" and "Columns" are sometimes transposed
>>> relative to CDF file "Rows" and "Columns" (and to make things worse
>>> Cols came before Rows in the text version of the CDF file, and Rows
>>> before Columns in the XDA CDF format). I don't recall off the top of
>>> my head whether the parser used by makecdfenv honors this properly.
>>>
>>>
>>> Ben
>>>
>>>
>>> On 27.01.2012 07:28, James W. MacDonald wrote:
>>>> Hi Philip,
>>>>
>>>> On 1/27/2012 3:18 AM, Groot, Philip de wrote:
>>>>> Dear James,
>>>>>
>>>>> I apologize for the email from yesterday. I totally agree with your
>>>>> points. In addition, I really appreciate the effort that is
>>>>> undertaken in establishing the oligo package. I am using the library
>>>>> myself regularly!
>>>> I am happy to hear that. Benilton has invested untold hours
>>>> developing oligo/pdInfoBuilder, and he deserves the appreciation.
>>>>
>>>>> However, there is a reason why I reacted this way. The affy problem
>>>>> has been reported by me previously. And it was fixed in Bioconductor
>>>>> 2.8! Now it is broken again. Can happen, but this line really
>>>>> annoyed me:
>>>>>
>>>>> (quote): " I have not made the change in the release version because
>>>>> it isn't a bug."
>>>> The definition of a bug is when software doesn't do something that
>>>> the author intends it to do. I and others have stated numerous times
>>>> that the affy package is not and never was intended for use with any
>>>> chip type but the 3'-biased chips. You could just as easily declare
>>>> that is a bug that affy won't process SNP chips, and it would make as
>>>> much sense.
>>>>
>>>>> Definitely, it IS a bug. However, it does not affect analysis of
>>>>> Affymetrix arrays because the 1st and 2nd generation arrays are
>>>>> square. So you don't notice the problem and this is fine. This is
>>>>> also the reason why I am in doubt whether the fix will really stay
>>>>> in with the next release. It has been removed before without good
>>>>> reason...
>>>> First you apologize, and then you insult...
>>>>
>>>> Here again your ignorance of the process has caused you to stray into
>>>> dangerous lands. It seems to me you want to assume treachery or
>>>> collusion when the most likely cause is well-intentioned error. To
>>>> disabuse you of this notion, let's look at the svn logs, shall we?
>>>>
>>>> In May 2010, Kasper Daniel Hansen made some changes:
>>>>
>>>> svn log -r 47059
>>>>
>>>> ------------------------------------------------------------------------
>>>> r47059 | khansen | 2010-05-20 20:21:16 -0400 (Thu, 20 May 2010) | 1 line
>>>>
>>>> Uses names from the output of read.celfile.header in read.affybatch;
>>>> recent addition to affyio. This fixes a bug in read.affybatch having
>>>> to do with non-square arrays
>>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>> The changes he made:
>>>>
>>>> svn diff -r 47060:47050 read.affybatch.R
>>>> Index: read.affybatch.R
>>>> ===================================================================
>>>> --- read.affybatch.R (revision 47060)
>>>> +++ read.affybatch.R (revision 47050)
>>>> @@ -111,8 +111,8 @@
>>>> ##se.exprs = array(NaN, dim=dim.sd),
>>>> cdfName = cdfname, ##cel@cdfName,
>>>> phenoData = phenoData,
>>>> - nrow = dim.intensity["Rows"],
>>>> - ncol = dim.intensity["Cols"],
>>>> + nrow = dim.intensity[1],
>>>> + ncol = dim.intensity[2],
>>>> annotation = cleancdfname(cdfname, addcdf=FALSE),
>>>> protocolData = protocol,
>>>> description= description,
>>>>
>>>> Changing the subsetting of dim.intensity to use "Rows" and "Cols"
>>>> rather than 1/2.
>>>>
>>>> In November 2010, Ben Bolstad made some changes:
>>>>
>>>> svn log -r 50736
>>>>
>>>> ------------------------------------------------------------------------
>>>> r50736 | bolstad | 2010-11-07 00:48:33 -0400 (Sun, 07 Nov 2010) | 2 lines
>>>>
>>>> fixes to handle non square arrays
>>>>
>>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>>
>>>> The changes he made were
>>>>
>>>> svn diff -r 50730:50740 read.affybatch.R
>>>> Index: read.affybatch.R
>>>> ===================================================================
>>>> --- read.affybatch.R (revision 50730)
>>>> +++ read.affybatch.R (revision 50740)
>>>> @@ -111,8 +111,8 @@
>>>> ##se.exprs = array(NaN, dim=dim.sd),
>>>> cdfName = cdfname, ##cel@cdfName,
>>>> phenoData = phenoData,
>>>> - nrow = dim.intensity["Rows"],
>>>> - ncol = dim.intensity["Cols"],
>>>> + nrow = dim.intensity[1],##["Rows"],
>>>> + ncol = dim.intensity[2],##["Cols"],
>>>> annotation = cleancdfname(cdfname, addcdf=FALSE),
>>>> protocolData = protocol,
>>>> description= description,
>>>>
>>>> Where he changed the code back to the original version.
>>>>
>>>> Then in November 2011, yours truly made some changes:
>>>>
>>>> svn log -r 60183
>>>>
>>>> ------------------------------------------------------------------------
>>>> r60183 | jmacdon | 2011-11-10 09:24:59 -0500 (Thu, 10 Nov 2011) | 1 line
>>>>
>>>> Modifications to read.affybatch() to allow reading of non-square arrays
>>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>> And those changes were
>>>>
>>>> svn diff -r 60185:60180 read.affybatch.R
>>>> Index: read.affybatch.R
>>>> ===================================================================
>>>> --- read.affybatch.R (revision 60185)
>>>> +++ read.affybatch.R (revision 60180)
>>>> @@ -100,7 +100,7 @@
>>>>
>>>> exprs<- .Call("read_abatch",filenames, rm.mask,
>>>> rm.outliers, rm.extra, ref.cdfName,
>>>> - dim.intensity[c(1,2)],verbose, PACKAGE="affyio")
>>>> + dim.intensity[c("Rows","Cols")],verbose, PACKAGE="affyio")
>>>> colnames(exprs)<- samplenames
>>>>
>>>> where we go back to subsetting dim.intensity by "Rows" and "Cols".
>>>>
>>>>
>>>> Here we have three instances of three different people trying to make
>>>> sure the affy package will work with the non-square arrays. One
>>>> attempt had the unintended effect of unfixing a previous fix. This is
>>>> unfortunate, but given the imperfection of the human species, and the
>>>> fact that multiple people have write access for this package, mistakes
>>>> will be made.
>>>>
>>>> There is, however, no evidence of capriciousness nor ill will towards
>>>> those unlucky souls trying to analyze non-square arrays with the affy
>>>> package.
>>>>
>>>>
>>>>> In addition, people have invested some time to properly create CDF's
>>>>> for the geneTitan plates that do properly work with the affy library
>>>>> and provide (at least) identical RMA results with oligo. To my
>>>>> opinion, the great success and support of Bioconductor is for a
>>>>> significant part based on the affy library and the solutions that it
>>>>> offered when microarray analysis was at its infancy: it contributed
>>>>> in evolving Bioconductor to its current state! I think that the
>>>>> Bioconductor project should allow a "transitional period" where both
>>>>> affy and oligo can be utilized for analysing the most recent
>>>>> Affymetrix arrays. In addition, a lot of publications and tutorials
>>>>> are available that point people to the affy library and hence
>>>>> stimulate people to try it in the first place! Eventually, we should
>>>>> use oligo. No doubt about it, but the process should be a smooth
>>>>> transition. Currently, this is not the case. In addition, I am
>>>>> trying to help and I have the feeling that this is not well
>>>>> appreciated.
>>>> I see, you are trying to help. Good for you. I wonder exactly what
>>>> you are doing other than complaining bitterly about honest mistakes
>>>> and casting aspersions on people whom you seem to think have done you
>>>> harm. What ever it is, keep up the good work.
>>>>
>>>> Best,
>>>>
>>>> Jim
>>>>> In summary: the oligo library has my full support, but I do hope
>>>>> that the affy-issue will be fixed because it is a good thing for the
>>>>> Bioconductor community.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Dr. Philip de Groot
>>>>> Bioinformatician / Microarray analysis expert
>>>>>
>>>>> Wageningen University / TIFN
>>>>> Netherlands Nutrigenomics Center (NNC)
>>>>> Nutrition, Metabolism & Genomics Group
>>>>> Division of Human Nutrition
>>>>> PO Box 8129, 6700 EV Wageningen
>>>>> Visiting Address:
>>>>> "De Valk" ("Erfelijkheidsleer"),
>>>>> Building 304,
>>>>> Verbindingsweg 4, 6703 HC Wageningen
>>>>> Room: 0052a
>>>>> T: 0317 485786
>>>>> F: 0317 483342
>>>>> E-mail: Philip.deGroot at wur.nl
>>>>> I: http://humannutrition.wur.nl
>>>>> https://madmax.bioinformatics.nl
>>>>> http://www.nutrigenomicsconsortium.nl
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: James W. MacDonald [mailto:jmacdon at med.umich.edu]
>>>>> Sent: donderdag 26 januari 2012 15:32
>>>>> To: Groot, Philip de
>>>>> Cc: 'Osselaer, Steven [JRDBE Extern]'; Goehlmann, Hinrich [JRDBE];
>>>>> bioconductor at r-project.org
>>>>> Subject: Re: [BioC] affy : ReadAffy() fails on HuGene-1_1-st-v1 chips
>>>>>
>>>>> Hi Philip,
>>>>>
>>>>> On 1/26/2012 4:28 AM, Groot, Philip de wrote:
>>>>>> Hello all,
>>>>>>
>>>>>> Just to be sure:
>>>>>>
>>>>>>> If you follow the discussion that Mike linked to, this has been
>>>>>>> corrected in the devel version of the affy package. I made this
>>>>>>> change because it didn't have an adverse effect on the intended
>>>>>>> target of the affy package, which is the 3' biased arrays. I have
>>>>>>> not made the change in the release version because it isn't a bug.
>>>>>> I think it is not nice that the problem will reoccur every time a
>>>>>> new release is present? So I do hope that the patch is included in
>>>>>> the next Bioconductor release? Please acknowledge!
>>>>> For those who don't understand how the BioC release cycle works,
>>>>> here is a short primer. At any one time there are two versions; the
>>>>> release version, which is considered to be stable, and the devel
>>>>> version, upon which developers are still working.
>>>>>
>>>>> At each release the developers finish up all changes they have made
>>>>> to their packages, and the devel version is then split into a new
>>>>> release branch, which is then 'released'. This new release is then
>>>>> considered to be stable, and only bug fixes of sufficient gravity
>>>>> can be made. Since this patch doesn't fix a bug, it was not applied
>>>>> to the release version.
>>>>>
>>>>> Therefore, by definition, all changes made to code in the devel
>>>>> version will make their way into the next release.
>>>>>
>>>>>> In addition, I severely tested affy and oligo RMA normalization
>>>>>> using either the CDF (http://nmg-r.bioinformatics.nl/NuGO_R.html)
>>>>>> or the pd.mapping (Bioconductor oligo) libraries. The RMA results
>>>>>> are identical up to the last digit!
>>>>>>
>>>>>> In conclusion: it works in both ways, so let's support it properly
>>>>>> then! Note: I do agree that the oligo package is better suited for
>>>>>> handling 3rd generation Affymetrix arrays, but intentionally
>>>>>> sabotaging the affy library (sorry, but it just looks like this)
>>>>>> is not the way to force people to move to oligo. Just my 2 cents.
>>>>> That is a pretty harsh condemnation, and I will assume that you
>>>>> don't really mean it like it sounds, so I will try to show restraint.
>>>>>
>>>>> A little background; several years ago it became clear that
>>>>> Affymetrix was going to have many more types of chips than just the
>>>>> original 3' biased chips for which the affy/makecdfenv pipeline was
>>>>> developed.
>>>>> After some discussion, Rafael Irizarry decided that rather than
>>>>> trying to reverse engineer an already existing and popular package
>>>>> to support all these new chips (in the six month span between
>>>>> releases), it would be better to create an entirely new pipeline
>>>>> that is intended to support ALL chips that Affy produces. The amount
>>>>> of time it took to get oligo/pdInfoBuilder to the current matured
>>>>> state is testament to the wisdom of that choice. Trying to 'fix'
>>>>> affy in six months would have been a disaster.
>>>>>
>>>>> So, three points;
>>>>>
>>>>> 1.) Characterizing this as sabotage is (arrogant, ignorant, foolish,
>>>>> infuriating). I leave it to others to decide which.
>>>>> 2.) The affy and makecdfenv packages are open source. If you (or
>>>>> anybody else, for that matter) wants to fork the code into your own
>>>>> package that supports all and sundry, please feel free to do so.
>>>>> 3.) The original plan was for the affy package to be deprecated, and
>>>>> then removed from BioC. In deference to the vast user base who use
>>>>> this package, and the existing personal code that is based on affy,
>>>>> it was not deprecated. In addition, we have made changes where we
>>>>> can to make affy accommodate these new chips, even when it isn't in
>>>>> anybody's interest to do so. This, I believe, invalidates your
>>>>> accusation that people are being 'forced to move to oligo'.
>>>>>
>>>>> Best,
>>>>>
>>>>> Jim
>>>>>
>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Dr. Philip de Groot
>>>>>> Bioinformatician / Microarray analysis expert
>>>>>>
>>>>>> Wageningen University / TIFN
>>>>>> Netherlands Nutrigenomics Center (NNC)
>>>>>> Nutrition, Metabolism & Genomics Group
>>>>>> Division of Human Nutrition
>>>>>> PO Box 8129, 6700 EV Wageningen
>>>>>> Visiting Address:
>>>>>> "De Valk" ("Erfelijkheidsleer"),
>>>>>> Building 304,
>>>>>> Verbindingsweg 4, 6703 HC Wageningen
>>>>>> Room: 0052a
>>>>>> T: 0317 485786
>>>>>> F: 0317 483342
>>>>>> E-mail: Philip.deGroot at wur.nl
>>>>>> I: http://humannutrition.wur.nl
>>>>>> https://madmax.bioinformatics.nl
>>>>>> http://www.nutrigenomicsconsortium.nl
>>>>>>
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Osselaer, Steven [JRDBE Extern] [mailto:SOSSELAE at ITS.JNJ.COM]
>>>>>> Sent: dinsdag 24 januari 2012 15:22
>>>>>> To: James W. MacDonald
>>>>>> Cc: Goehlmann, Hinrich [JRDBE]; bioconductor at r-project.org
>>>>>> Subject: Re: [BioC] affy : ReadAffy() fails on HuGene-1_1-st-v1 chips
>>>>>>
>>>>>> Thank you for this information, James.
>>>>>> We will look into this and try to start using 'oligo' for these types
>>>>>> of arrays.
>>>>>>
>>>>>> Kind regards,
>>>>>> Steven
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: James W. MacDonald [mailto:jmacdon at med.umich.edu]
>>>>>> Sent: Tuesday, 24 January 2012 15:18
>>>>>> To: Osselaer, Steven [JRDBE Extern]
>>>>>> Cc: Mike Smith; bioconductor at r-project.org
>>>>>> Subject: Re: [BioC] affy : ReadAffy() fails on HuGene-1_1-st-v1 chips
>>>>>>
>>>>>> Hi Steven,
>>>>>>
>>>>>> If you follow the discussion that Mike linked to, this has been
>>>>>> corrected in the devel version of the affy package. I made this
>>>>>> change because it didn't have an adverse effect on the intended
>>>>>> target of the affy package, which is the 3' biased arrays. I have
>>>>>> not made the change in the release version because it isn't a bug.
>>>>>>
>>>>>> I also made this change because people seem to want to use the affy
>>>>>> package for analyzing the Gene ST chips even though it was never
>>>>>> intended for this purpose, and doesn't really do a good job. The oligo
>>>>>> package is intended to be used with these chips, and that is the
>>>>>> package we recommend you use.
>>>>>>
>>>>>> I think some of the hesitation to use oligo stems from the fact that
>>>>>> it had a long development cycle, and in earlier incarnations was not
>>>>>> completely documented. This is no longer true, and I would recommend
>>>>>> you at least take a look.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Jim
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 1/24/2012 9:01 AM, Osselaer, Steven [JRDBE Extern] wrote:
>>>>>>> Thanks a lot, Mike.
>>>>>>>
>>>>>>> Applying the patch makes the ReadAffy() call functional again for
>>>>>>> these types of chips.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Kind regards,
>>>>>>>
>>>>>>> Steven Osselaer
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> From: Mike Smith [mailto:grimbough at gmail.com]
>>>>>>> Sent: Tuesday, 24 January 2012 14:30
>>>>>>> To: Osselaer, Steven [JRDBE Extern]
>>>>>>> Cc: bioconductor at r-project.org
>>>>>>> Subject: Re: [BioC] affy : ReadAffy() fails on HuGene-1_1-st-v1 chips
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hi Steven,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I think this may be related to a problem that was raised on the
>>>>>>> Bioc-devel mailing list a couple of months ago:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> https://stat.ethz.ch/pipermail/bioc-devel/2011-November/002955.html
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> If indeed it's the same issue then the discussion above indicates it
>>>>>>> was patched from affy version 1.33.1
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Mike
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jan 24, 2012 at 1:07 PM, Osselaer, Steven [JRDBE Extern]
>>>>>>> <SOSSELAE at its.jnj.com> wrote:
>>>>>>>
>>>>>>> Dear Wolfgang,
>>>>>>>
>>>>>>> I was under the impression that it was a problem with the software as
>>>>>>> I can read the same CEL files with the R 2.13.1 software: see the
>>>>>>> transcript for the same code but run under R 2.13.1 below.
>>>>>>>
>>>>>>> Kind regards,
>>>>>>> Steven
>>>>>>>
>>>>>>> R version 2.13.1 (2011-07-08)
>>>>>>>
>>>>>>> Copyright (C) 2011 The R Foundation for Statistical Computing ISBN
>>>>>>> 3-900051-07-0
>>>>>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>>>>>
>>>>>>> R is free software and comes with ABSOLUTELY NO WARRANTY.
>>>>>>> You are welcome to redistribute it under certain conditions.
>>>>>>> Type 'license()' or 'licence()' for distribution details.
>>>>>>>
>>>>>>> Natural language support but running in an English locale
>>>>>>>
>>>>>>> R is a collaborative project with many contributors.
>>>>>>> Type 'contributors()' for more information and 'citation()' on how to
>>>>>>> cite R or R packages in publications.
>>>>>>>
>>>>>>> Type 'demo()' for some demos, 'help()' for on-line help, or
>>>>>>> 'help.start()' for an HTML browser interface to help.
>>>>>>> Type 'q()' to quit R.
>>>>>>>
>>>>>>>> library(affy)
>>>>>>> Loading required package: Biobase
>>>>>>>
>>>>>>> Welcome to Bioconductor
>>>>>>>
>>>>>>> Vignettes contain introductory material. To view, type
>>>>>>> 'browseVignettes()'. To cite Bioconductor, see
>>>>>>> 'citation("Biobase")' and for packages 'citation("pkgname")'.
>>>>>>>
>>>>>>>> sessionInfo()
>>>>>>> R version 2.13.1 (2011-07-08)
>>>>>>>
>>>>>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>>>>>
>>>>>>> locale:
>>>>>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>>>>>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>>>>>>>
>>>>>>> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
>>>>>>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
>>>>>>>
>>>>>>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>>>>>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>>>>>
>>>>>>> attached base packages:
>>>>>>> [1] stats graphics grDevices utils datasets methods base
>>>>>>>
>>>>>>> other attached packages:
>>>>>>>
>>>>>>> [1] affy_1.30.0 Biobase_2.12.2
>>>>>>>
>>>>>>>
>>>>>>> loaded via a namespace (and not attached):
>>>>>>>
>>>>>>> [1] affyio_1.20.0 preprocessCore_1.14.0
>>>>>>>
>>>>>>>> celFiles<- list.files(pattern="CEL$")
>>>>>>>> celFiles
>>>>>>> [1] "27002.CEL" "27003.CEL" "27004.CEL" "27005.CEL" "27006.CEL" "27007.CEL"
>>>>>>> [7] "27008.CEL" "27009.CEL" "27010.CEL" "27011.CEL" "27012.CEL" "27013.CEL"
>>>>>>> [13] "27014.CEL" "27015.CEL" "27016.CEL" "27017.CEL" "27018.CEL" "27019.CEL"
>>>>>>> [19] "27020.CEL" "27021.CEL" "27022.CEL" "27023.CEL" "27024.CEL" "27025.CEL"
>>>>>>> [25] "27026.CEL" "27027.CEL" "27028.CEL" "27029.CEL" "27030.CEL" "27031.CEL"
>>>>>>> [31] "27032.CEL" "27033.CEL" "27034.CEL" "27035.CEL" "27036.CEL" "27037.CEL"
>>>>>>> [37] "27038.CEL" "27039.CEL" "27040.CEL" "27041.CEL" "27042.CEL" "27043.CEL"
>>>>>>> [43] "27044.CEL" "27045.CEL" "27046.CEL" "27047.CEL" "27048.CEL" "27049.CEL"
>>>>>>> [49] "27050.CEL" "27051.CEL" "27052.CEL" "27053.CEL" "27054.CEL" "27055.CEL"
>>>>>>> [55] "27056.CEL" "27057.CEL" "27058.CEL" "27059.CEL" "27060.CEL" "27061.CEL"
>>>>>>> [61] "27062.CEL" "27063.CEL" "27064.CEL" "27065.CEL" "27066.CEL" "27067.CEL"
>>>>>>> [67] "27068.CEL" "27069.CEL" "27070.CEL" "27071.CEL" "27072.CEL" "27073.CEL"
>>>>>>> [73] "27074.CEL" "27075.CEL" "27076.CEL" "27077.CEL" "27078.CEL" "27079.CEL"
>>>>>>> [79] "27080.CEL" "27081.CEL" "27082.CEL" "27083.CEL" "27084.CEL" "27085.CEL"
>>>>>>> [85] "27086.CEL" "27087.CEL" "27088.CEL" "27089.CEL" "27090.CEL" "27091.CEL"
>>>>>>> [91] "27092.CEL" "27093.CEL" "27094.CEL" "27095.CEL" "27096.CEL"
>>>>>>>> rawData<- ReadAffy(filenames=celFiles)
>>>>>>>>
>>>>>>>> q()
>>>>>>> -----Original Message-----
>>>>>>> From: bioconductor-bounces at r-project.org
>>>>>>> [mailto:bioconductor-bounces at r-project.org] On Behalf Of Wolfgang Huber
>>>>>>> Sent: Tuesday, 24 January 2012 13:59
>>>>>>> To: bioconductor at r-project.org
>>>>>>> Subject: Re: [BioC] affy : ReadAffy() fails on HuGene-1_1-st-v1 chips
>>>>>>>
>>>>>>>
>>>>>>> Dear Steven
>>>>>>>
>>>>>>> thank you. What is your question, or why and how do you think someone
>>>>>>> other than the party who gave you the apparently faulty CEL file can
>>>>>>> help you?
>>>>>>>
>>>>>>> Best wishes
>>>>>>> Wolfgang
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Steven Osselaer [guest] scripsit 01/24/2012 10:42 AM:
>>>>>>>> Reading HuGene-1_1-st-v1 CEL files results in an error message about
>>>>>>>> incorrect dimensions of the first CEL file of the list
>>>>>>>> TRANSCRIPT :
>>>>>>>>
>>>>>>>> R version 2.14.1 (2011-12-22)
>>>>>>>> Copyright (C) 2011 The R Foundation for Statistical Computing ISBN
>>>>>>>> 3-900051-07-0
>>>>>>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>>>>>>
>>>>>>>> R is free software and comes with ABSOLUTELY NO WARRANTY.
>>>>>>>> You are welcome to redistribute it under certain conditions.
>>>>>>>> Type 'license()' or 'licence()' for distribution details.
>>>>>>>>
>>>>>>>> Natural language support but running in an English locale
>>>>>>>>
>>>>>>>> R is a collaborative project with many contributors.
>>>>>>>> Type 'contributors()' for more information and 'citation()' on how
>>>>>>>> to cite R or R packages in publications.
>>>>>>>>
>>>>>>>> Type 'demo()' for some demos, 'help()' for on-line help, or
>>>>>>>> 'help.start()' for an HTML browser interface to help.
>>>>>>>> Type 'q()' to quit R.
>>>>>>>>
>>>>>>>>> library(affy)
>>>>>>>> Loading required package: Biobase
>>>>>>>>
>>>>>>>> Welcome to Bioconductor
>>>>>>>>
>>>>>>>> Vignettes contain introductory material. To view, type
>>>>>>>> 'browseVignettes()'. To cite Bioconductor, see
>>>>>>>> 'citation("Biobase")' and for packages 'citation("pkgname")'.
>>>>>>>>
>>>>>>>>> sessionInfo()
>>>>>>>> R version 2.14.1 (2011-12-22)
>>>>>>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>>>>>>
>>>>>>>> locale:
>>>>>>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>>>>>>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>>>>>>>> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
>>>>>>>> [7] LC_PAPER=C LC_NAME=C
>>>>>>>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>>>>>>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>>>>>>
>>>>>>>> attached base packages:
>>>>>>>> [1] stats graphics grDevices utils datasets methods base
>>>>>>>>
>>>>>>>> other attached packages:
>>>>>>>> [1] affy_1.32.0 Biobase_2.14.0
>>>>>>>>
>>>>>>>> loaded via a namespace (and not attached):
>>>>>>>> [1] affyio_1.22.0 BiocInstaller_1.2.1 preprocessCore_1.16.0
>>>>>>>> [4] zlibbioc_1.0.0
>>>>>>>>> celFiles<- list.files(pattern="CEL$")
>>>>>>>>> celFiles
>>>>>>>> [1] "27002.CEL" "27003.CEL" "27004.CEL" "27005.CEL" "27006.CEL" "27007.CEL"
>>>>>>>> [7] "27008.CEL" "27009.CEL" "27010.CEL" "27011.CEL" "27012.CEL" "27013.CEL"
>>>>>>>> [13] "27014.CEL" "27015.CEL" "27016.CEL" "27017.CEL" "27018.CEL" "27019.CEL"
>>>>>>>> [19] "27020.CEL" "27021.CEL" "27022.CEL" "27023.CEL" "27024.CEL" "27025.CEL"
>>>>>>>> [25] "27026.CEL" "27027.CEL" "27028.CEL" "27029.CEL" "27030.CEL" "27031.CEL"
>>>>>>>> [31] "27032.CEL" "27033.CEL" "27034.CEL" "27035.CEL" "27036.CEL" "27037.CEL"
>>>>>>>> [37] "27038.CEL" "27039.CEL" "27040.CEL" "27041.CEL" "27042.CEL" "27043.CEL"
>>>>>>>> [43] "27044.CEL" "27045.CEL" "27046.CEL" "27047.CEL" "27048.CEL" "27049.CEL"
>>>>>>>> [49] "27050.CEL" "27051.CEL" "27052.CEL" "27053.CEL" "27054.CEL" "27055.CEL"
>>>>>>>> [55] "27056.CEL" "27057.CEL" "27058.CEL" "27059.CEL" "27060.CEL" "27061.CEL"
>>>>>>>> [61] "27062.CEL" "27063.CEL" "27064.CEL" "27065.CEL" "27066.CEL" "27067.CEL"
>>>>>>>> [67] "27068.CEL" "27069.CEL" "27070.CEL" "27071.CEL" "27072.CEL" "27073.CEL"
>>>>>>>> [73] "27074.CEL" "27075.CEL" "27076.CEL" "27077.CEL" "27078.CEL" "27079.CEL"
>>>>>>>> [79] "27080.CEL" "27081.CEL" "27082.CEL" "27083.CEL" "27084.CEL" "27085.CEL"
>>>>>>>> [85] "27086.CEL" "27087.CEL" "27088.CEL" "27089.CEL" "27090.CEL" "27091.CEL"
>>>>>>>> [91] "27092.CEL" "27093.CEL" "27094.CEL" "27095.CEL" "27096.CEL"
>>>>>>>>> rawData<- ReadAffy(filenames=celFiles)
>>>>>>>> Error in read.affybatch(filenames = l$filenames, phenoData = l$phenoData, :
>>>>>>>> Cel file 27002.CEL does not seem to have the correct dimensions
>>>>>>>>> traceback()
>>>>>>>> 3: .Call("read_abatch", filenames, rm.mask, rm.outliers, rm.extra,
>>>>>>>> ref.cdfName, dim.intensity[c("Rows", "Cols")], verbose, PACKAGE = "affyio")
>>>>>>>> 2: read.affybatch(filenames = l$filenames, phenoData = l$phenoData,
>>>>>>>> description = l$description, notes = notes, compress = compress,
>>>>>>>> rm.mask = rm.mask, rm.outliers = rm.outliers, rm.extra = rm.extra,
>>>>>>>> verbose = verbose, sd = sd, cdfname = cdfname)
>>>>>>>> 1: ReadAffy(filenames = celFiles)
>>>>>>>>
>>>>>>>>
>>>>>>>> -- output of sessionInfo():
>>>>>>>>
>>>>>>>> R version 2.14.1 (2011-12-22)
>>>>>>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>>>>>>
>>>>>>>> locale:
>>>>>>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>>>>>>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>>>>>>>> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
>>>>>>>> [7] LC_PAPER=C LC_NAME=C
>>>>>>>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>>>>>>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>>>>>>
>>>>>>>> attached base packages:
>>>>>>>> [1] stats graphics grDevices utils datasets methods base
>>>>>>>>
>>>>>>>> other attached packages:
>>>>>>>> [1] affy_1.32.0 Biobase_2.14.0
>>>>>>>>
>>>>>>>> loaded via a namespace (and not attached):
>>>>>>>> [1] affyio_1.22.0 BiocInstaller_1.2.1 preprocessCore_1.16.0
>>>>>>>> [4] zlibbioc_1.0.0
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Sent via the guest posting facility at bioconductor.org.
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Bioconductor mailing list
>>>>>>>> Bioconductor at r-project.org
>>>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>>>> Search the archives:
>>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best wishes
>>>>>>> Wolfgang
>>>>>>>
>>>>>>> Wolfgang Huber
>>>>>>> EMBL
>>>>>>> http://www.embl.de/research/units/genome_biology/huber
>>>>>>>
>>>>>> --
>>>>>> James W. MacDonald, M.S.
>>>>>> Biostatistician
>>>>>> Douglas Lab
>>>>>> University of Michigan
>>>>>> Department of Human Genetics
>>>>>> 5912 Buhl
>>>>>> 1241 E. Catherine St.
>>>>>> Ann Arbor MI 48109-5618
>>>>>> 734-615-7826
>>>>>>
>>>>>> **********************************************************
>>>>>> Electronic Mail is not secure, may not be read every day, and should
>>>>>> not be used for urgent or sensitive issues
>>>>>>
>>>>>>
>>>>>>
>
--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues