[BioC] RMA and justRMA error

aedin aedin at jimmy.harvard.edu
Wed Aug 16 01:48:02 CEST 2006


Thanks, Ben.
Sorry, I thought the same parser would apply to each method. I found the 
culprit file using the approach you list below. 
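
For anyone who finds this thread later: the loop quoted below stops at the first array that errors. A variant that keeps going and reports every failing index might look like the sketch below. It is only a sketch: `find_failing` and `toy_step` are names made up for illustration, and the toy step merely imitates the density() error so that the example runs without any CEL files.

```r
# Run `step` on each index in turn and record which ones throw an error.
# With affy loaded, `step` could be function(i) bg.correct.rma(abatch[, i]),
# where `abatch` is your own AffyBatch (an assumption, not run here).
find_failing <- function(n, step) {
  failed <- logical(n)
  for (i in seq_len(n)) {
    failed[i] <- tryCatch({
      step(i)
      FALSE
    }, error = function(e) {
      message("array ", i, ": ", conditionMessage(e))
      TRUE
    })
  }
  which(failed)
}

# Hypothetical stand-in: index 3 hands density() a single point, which
# raises the same "need at least 2 points" error reported in this thread.
toy_step <- function(i) {
  if (i == 3) density(numeric(1)) else density(rnorm(50))
}

bad <- find_failing(4, toy_step)
print(bad)
```

Collecting the indices rather than stopping means that a batch with more than one bad file only needs a single pass.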

It was not obvious in any of the usual plots (hist, boxplot, etc.) because 
only one probeset had a ridiculous value (5.6 x 10^14). A single value like 
that completely skews a mean but barely moves a median. 
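
A minimal sketch of that effect, and of a quick check that would have flagged the file, is below. The matrix is synthetic (it is not my data), and the 65535 cutoff assumes that raw intensities come from a 16-bit scanner, so treat it as a rule of thumb rather than a fixed limit.

```r
# Synthetic stand-in for a probes-by-arrays intensity matrix (not real CEL data).
set.seed(1)
intensities <- matrix(2^rnorm(4000, mean = 8, sd = 1.5), ncol = 4,
                      dimnames = list(NULL, paste0("array", 1:4)))
intensities[10, 3] <- 5.6e14   # one absurd value, as in the culprit file

# Per-array summaries: the median barely moves, while the mean and the
# maximum of the affected array are wildly off.
summary_stats <- rbind(median = apply(intensities, 2, median),
                       mean   = colMeans(intensities),
                       max    = apply(intensities, 2, max))
print(signif(summary_stats, 3))

# Assumed cutoff: 16-bit scanner intensities should not exceed 65535.
suspect <- names(which(summary_stats["max", ] > 65535))
print(suspect)
```

On a real AffyBatch, the same summaries could be computed on the matrix returned by pm() or exprs().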

Should I be wary of this CEL file and dump it, or, if it looks OK in the 
hist and boxplot, should I try to keep it? Do you know what would cause 
this? How frequently does it occur?

Thanks for your help
Aedin


Ben Bolstad wrote:

>The parsing code does not necessarily detect all potential corruptions.
>And you will find that gcrma() will quite happily process the "corrupt"
>data I show below.
>
>The error itself is from the density() function. Could you isolate
>the array that is causing trouble using, say, something like this:
>
>for (i in 1:4){
>cat(i,"\n")
>blah <- bg.correct.rma(Dilution.Corrupted[,i])
>}
>
>Then perhaps we could look at it a little closer.
>
>best,
>
>Ben
>
>
>
>On Tue, 2006-08-15 at 18:13 -0400, aedin wrote:
>  
>
>>Dear Ben
>>Thanks for your reply. However, if the data were corrupted, surely they
>>would not be read by ReadAffy and gcrma?
>>Aedin
>>
>>Ben Bolstad wrote: 
>>    
>>
>>>Typically, when others have run into this error, it has been
>>>because their data were corrupted. For instance, this piece of
>>>demonstration code will generate the same error:
>>>
>>>
>>>library(affy);library(affydata)
>>>data(Dilution)
>>>Dilution.Corrupted <- Dilution
>>>pm(Dilution.Corrupted)[1,1] <- 30000000  
>>># that is an extreme value outside the
>>># range of normal raw probe intensities
>>>
>>>eset <- rma(Dilution.Corrupted)
>>>
>>>
>>>My suggestion would be to examine things along those lines.
>>>
>>>Best,
>>>
>>>Ben
>>>
>>>On Tue, 2006-08-15 at 15:01 -0400, aedin wrote:
>>>  
>>>      
>>>
>>>>Dear BioC
>>>>I know that this error has been reported a few times on the BioC mailing list; 
>>>>however, no resolution to it is available in the archives (or at least 
>>>>none that Google and I could find).  I get the same error whether I use 
>>>>R 2.3.1 or the devel version.  I enclose the devel version error.
>>>>
>>>>The CEL files are read in by ReadAffy and are processed OK by gcrma, 
>>>>but they fall over when I try to run rma or justRMA.
>>>>
>>>>Thanks for your help
>>>>Aedin
>>>>
>>>> > df = justRMA(filenames=filenam[125:130])
>>>>Background correcting
>>>>Error in density.default(x, kernel = "epanechnikov", n = 2^14) :
>>>>        need at least 2 points to select a bandwidth automatically
>>>>
>>>> > df = ReadAffy(filenames=filenam[125:130])
>>>> > df
>>>>AffyBatch object
>>>>size of arrays=1164x1164 features (63518 kb)
>>>>cdf=HG-U133_Plus_2 (54675 affyids)
>>>>number of samples=6
>>>>number of genes=54675
>>>>annotation=hgu133plus2
>>>>
>>>> > df.rma= rma(df)
>>>>Background correcting
>>>>Error in density.default(x, kernel = "epanechnikov", n = 2^14) :
>>>>        need at least 2 points to select a bandwidth automatically
>>>>
>>>> > library(gcrma)
>>>> > df.gcrma= gcrma(df)
>>>>Adjusting for optical effect......Done.
>>>>Computing affinities.Done.
>>>>Adjusting for non-specific binding......Done.
>>>>Normalizing
>>>>Calculating Expression
>>>>
>>>> > sessionInfo()
>>>>R version 2.4.0 Under development (unstable) (2006-08-06 r38809)
>>>>i686-pc-linux-gnu
>>>>
>>>>locale:
>>>>LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>>>>
>>>>attached base packages:
>>>>[1] "splines"   "tools"     "methods"   "stats"     "graphics"  "grDevices"
>>>>[7] "utils"     "datasets"  "base"
>>>>
>>>>other attached packages:
>>>>hgu133plus2probe   hgu133plus2cdf            gcrma      matchprobes
>>>>        "1.12.0"         "1.12.0"          "2.5.1"          "1.5.0"
>>>>            affy           affyio          Biobase            made4
>>>>        "1.11.6"          "1.1.5"        "1.11.24"          "1.7.1"
>>>>   scatterplot3d             ade4
>>>>        "0.3-24"          "1.4-1"
>>:-) 


-- 
Aedín Culhane
Research Associate in Prof. J Quackenbush Lab
Harvard School of Public Health, Dana-Farber Cancer Institute


44 Binney Street, Mayer 232
Department of Biostatistics
Dana-Farber Cancer Institute
Boston, MA 02115
USA

Phone: +1 (617) 632 2468
Fax:   +1 (617) 632 5444
Email: aedin at jimmy.harvard.edu
Web URL: http://www.hsph.harvard.edu/researchers/aculhane.html
