[BioC] RMA and justRMA error
aedin
aedin at jimmy.harvard.edu
Wed Aug 16 01:48:02 CEST 2006
Thanks Ben
Sorry, I thought the same parser would apply to each method. I found the
culprit file using the approach you describe below.
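
For anyone following the same route, here is a minimal base-R sketch of that per-array check. It is an illustration under stated assumptions, not the affy code: mode_below() is a hypothetical stand-in which assumes the background step finds a mode with density() and then re-estimates on the values below that mode, and the intensity matrix is simulated. Wrapping each call in tryCatch() lets the loop report every failing array instead of stopping at the first one.

```r
# Hypothetical stand-in for the mode search in a background-correction step.
# Assumption: a mode is estimated with density(), then re-estimated on the
# values below that first mode. This is NOT copied from the affy package.
mode_below <- function(x, n = 2^14) {
  d <- density(x, kernel = "epanechnikov", n = n)
  mode1 <- d$x[which.max(d$y)]
  below <- x[x < mode1]                  # can be empty if mode1 is misplaced
  d2 <- density(below, kernel = "epanechnikov", n = n)
  d2$x[which.max(d2$y)]
}

set.seed(1)
# Simulated intensities (not real CEL data): 5000 probes x 4 arrays
intens <- matrix(rlnorm(4 * 5000, meanlog = 7, sdlog = 1), ncol = 4)
intens[1, 3] <- 5.6e14                   # one absurd value in array 3

# Run every array and record "ok" or the error message for each
status <- vapply(seq_len(ncol(intens)), function(i) {
  tryCatch({
    mode_below(intens[, i])
    "ok"
  }, error = function(e) conditionMessage(e))
}, character(1))

print(status)
failed <- which(status != "ok")          # indices of arrays that raised an error
```

With a real AffyBatch the same pattern would wrap bg.correct.rma(df[, i]) in place of mode_below(); the extreme value stretches the density grid so far that the first mode estimate is no longer meaningful.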
It was not obvious in any of the usual plots (hist, boxplot, etc.) because
only one probeset had a ridiculous value (5.6 x 10^14). A single value
like that would completely skew a mean but not a median.
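
To make the mean versus median point concrete, here is a small base-R sketch on simulated data (the matrix, the array names and the 65535 cutoff are all assumptions for illustration; the cutoff assumes raw intensities from a 16-bit scanner). Checking the per-array maximum flags the bad array directly, which the plots did not.

```r
set.seed(42)
# Simulated raw intensities (not real CEL data): 5000 probes x 4 arrays
intens <- matrix(rlnorm(4 * 5000, meanlog = 6, sdlog = 0.8), ncol = 4,
                 dimnames = list(NULL, paste0("array", 1:4)))
intens[1, 3] <- 5.6e14                   # the single absurd value, in array3

# Per-array summaries: one value dominates the mean but barely moves the median
summ <- data.frame(mean   = colMeans(intens),
                   median = apply(intens, 2, median),
                   max    = apply(intens, 2, max),
                   row.names = colnames(intens))
print(summ)

# Assumed cutoff: anything above the 16-bit maximum is treated as suspect
suspect <- rownames(summ)[summ$max > 65535]
print(suspect)
```

On an AffyBatch, apply(intensity(df), 2, max) should give the same per-array maxima.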
Should I be wary of this CEL file and discard it, or, if it looks OK in the
hist and boxplot, should I try to keep it? Do you know what would cause
this? How frequently does this occur?
Thanks for your help
Aedin
Ben Bolstad wrote:
>The parsing code does not necessarily detect all potential corruptions.
>And you will find that gcrma() will quite happily process the "corrupt"
>data I show below.
>
>The error itself is from the density() function. If you could isolate
>the array that is causing trouble, using, say, something like this:
>
>for (i in 1:4){
>  # print the index first, so the last number shown is the array that fails
>  cat(i,"\n")
>  blah <- bg.correct.rma(Dilution.Corrupted[,i])
>}
>
>Then perhaps we could look at it a little closer.
>
>best,
>
>Ben
>
>
>
>On Tue, 2006-08-15 at 18:13 -0400, aedin wrote:
>
>
>>Dear Ben
>>Thanks for your reply. However, if the data were corrupted, surely they
>>would not be read by ReadAffy and gcrma?
>>Aedin
>>
>>Ben Bolstad wrote:
>>
>>
>>>Typically, when I have encountered others who have had this error occur,
>>>it is because they have corrupted data. For instance, this piece of
>>>demonstration code will generate the same error:
>>>
>>>
>>>library(affy);library(affydata)
>>>data(Dilution)
>>>Dilution.Corrupted <- Dilution
>>>pm(Dilution.Corrupted)[1,1] <- 30000000
>>># that is an extreme value outside the
>>># range of normal raw probe intensities
>>>
>>>eset <- rma(Dilution.Corrupted)
>>>
>>>
>>>My suggestion would be to examine things along those lines.
>>>
>>>Best,
>>>
>>>Ben
>>>
>>>On Tue, 2006-08-15 at 15:01 -0400, aedin wrote:
>>>
>>>
>>>
>>>>Dear BioC
>>>>I know that this error has been reported a few times on the BioC mailing
>>>>list, however no resolution to it is available in the archives (or at
>>>>least none that Google and I could find). I get the same error whether I
>>>>use R 2.3.1 or the devel version. I enclose the devel version error.
>>>>
>>>>The CEL files are read in by ReadAffy and are processed OK by gcrma,
>>>>but they fall over when I try to run rma or justRMA.
>>>>
>>>>Thanks for your help
>>>>Aedin
>>>>
>>>> > df = justRMA(filenames=filenam[125:130])
>>>>Background correcting
>>>>Error in density.default(x, kernel = "epanechnikov", n = 2^14) :
>>>> need at least 2 points to select a bandwidth automatically
>>>>
>>>> > df = ReadAffy(filenames=filenam[125:130])
>>>> > df
>>>>AffyBatch object
>>>>size of arrays=1164x1164 features (63518 kb)
>>>>cdf=HG-U133_Plus_2 (54675 affyids)
>>>>number of samples=6
>>>>number of genes=54675
>>>>annotation=hgu133plus2
>>>>
>>>> > df.rma= rma(df)
>>>>Background correcting
>>>>Error in density.default(x, kernel = "epanechnikov", n = 2^14) :
>>>> need at least 2 points to select a bandwidth automatically
>>>>
>>>> > library(gcrma)
>>>> > df.gcrma= gcrma(df)
>>>>Adjusting for optical effect......Done.
>>>>Computing affinities.Done.
>>>>Adjusting for non-specific binding......Done.
>>>>Normalizing
>>>>Calculating Expression
>>>>
>>>> > sessionInfo()
>>>>R version 2.4.0 Under development (unstable) (2006-08-06 r38809)
>>>>i686-pc-linux-gnu
>>>>
>>>>locale:
>>>>LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>>>>
>>>>attached base packages:
>>>>[1] "splines" "tools" "methods" "stats" "graphics" "grDevices"
>>>>[7] "utils" "datasets" "base"
>>>>
>>>>other attached packages:
>>>>hgu133plus2probe   hgu133plus2cdf            gcrma      matchprobes
>>>>        "1.12.0"         "1.12.0"          "2.5.1"          "1.5.0"
>>>>            affy           affyio          Biobase            made4
>>>>        "1.11.6"          "1.1.5"        "1.11.24"          "1.7.1"
>>>>   scatterplot3d             ade4
>>>>        "0.3-24"          "1.4-1"
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>:-)
>>
>>--
>>Aedín Culhane
>>Research Associate in Prof. J Quackenbush Lab
>>Harvard School of Public Health, Dana-Farber Cancer Institute
>>
>>
>>44 Binney Street, Mayer 232
>>Department of Biostatistics
>>Dana-Farber Cancer Institute
>>Boston, MA 02115
>>USA
>>
>>Phone: +1 (617) 632 2468
>>Fax: +1 (617) 632 5444
>>Email: aedin at jimmy.harvard.edu
>>Web URL: http://www.hsph.harvard.edu/researchers/aculhane.html
>>
>>
>>
>>