[BioC] removing outlier/masked probes and gcrma

Jenny Drnevich drnevich at uiuc.edu
Tue Jan 16 16:45:06 CET 2007


Hi Andrew,


>... but it surprises me somewhat that there isn't an alternate solution.
>First, what do people do with an AffyBatch object which was read in
>using the rm.mask option if it can't be used for further analyses?  (Or
>is this a failing in how gcrma specifically deals with NAs?)  And
>second, although custom CDFs would be great for dealing with
>ChipType-specific effects (e.g., SNPs), how do people deal with
>chip-specific effects (e.g., scratches and debris)?

The short answer is that, for Affymetrix expression arrays, we don't worry 
about scratches and debris. If there are only a few blemishes, the affected 
probes are likely to belong to completely different probesets, since a 
probeset's probes are spread across the chip rather than placed side by 
side. Most summarization methods calculate a probeset's value "robustly", 
meaning they down-weight or ignore an outlier probe, so most scratches and 
debris shouldn't have much effect on the resulting probeset values. At our 
facility, if blemishes cover more than 10% of the array, we rerun the 
sample on another chip. Affy has been reasonably good to us about replacing 
these 'defective' arrays free of charge.
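
As a rough illustration of what "robustly" means here, the following is a 
minimal sketch in base R on a made-up probeset (11 probes, 5 arrays, one 
artificially blemished probe). It uses stats::medpolish, the median-polish 
step that RMA-type summarization is built on, as a stand-in; it does not 
call affy or gcrma, and every number and variable name in it is invented 
for the example.

```r
# Toy probeset: 11 probes (rows) on 5 arrays (columns), log2 scale.
# All values are invented for illustration; nothing is read from CEL files.
probe_effect <- (-5:5) / 4              # hypothetical probe affinities
array_effect <- c(0, 0.5, 1, 1.5, 2)    # hypothetical true array differences

clean <- 8 + outer(probe_effect, array_effect, "+")
dimnames(clean) <- list(paste0("probe", 1:11), paste0("array", 1:5))

# Simulate a blemish: one probe on one array reads far too bright.
blemished <- clean
blemished["probe3", "array2"] <- blemished["probe3", "array2"] + 4

# Non-robust summary: plain mean of the probes on each array.
mean_summary <- colMeans(blemished)

# Robust summary: median polish; per-array value = overall + column effect.
fit <- medpolish(blemished, trace.iter = FALSE)
robust_summary <- fit$overall + fit$col

# How far each summary moved because of the blemish.
clean_fit <- medpolish(clean, trace.iter = FALSE)
shift_mean <- mean_summary - colMeans(clean)
shift_robust <- robust_summary - (clean_fit$overall + clean_fit$col)

print(round(shift_mean, 3))
print(round(shift_robust, 3))

# The blemish is left behind in the residual for that probe and array.
print(fit$residuals["probe3", "array2"])
```

In this toy case the mean-based value for array2 moves by 4/11 of a log2 
unit, while the median-polish value does not move at all and the blemish 
shows up only as a large residual. Real arrays are noisier, so the 
protection is not this clean in practice, but the mechanism is the same.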

If you just want to remove specific probes and/or probesets from all the 
arrays, the easiest way is likely the 'RemoveProbes' function that Amy 
mentioned in her response to you. However, if you want to consistently make 
this change and you'll be doing lots of arrays, then it might be better to 
take the time to make the custom CDF.

Cheers,
Jenny



>  Just a couple of
>thoughts...  Any additional ideas are welcome, but we'll be pushing
>ahead on custom CDFs in the meantime...
>
>Cheers,
>-andrew
>
>
>
>-----Original Message-----
>From: James W. MacDonald [mailto:jmacdon at med.umich.edu]
>Sent: Saturday, January 13, 2007 6:41 AM
>To: Andrew Su
>Cc: bioconductor at stat.math.ethz.ch
>Subject: Re: [BioC] removing outlier/masked probes and gcrma
>
>Hi Andrew,
>
>Andrew Su wrote:
> > I am attempting to use gcrma on AffyBatch objects which were read in
> > using the "rm.outliers=TRUE" or "rm.mask=TRUE" options (to the ReadAffy
> > function).  For example, I put two MOE430 CEL files in the working
> > directory, and here is what I tried:
> >
> >>ab<-ReadAffy(filenames=list.celfiles(),rm.outliers=TRUE)
> >>ai<-compute.affinities(cdfName(ab))
> >>data<-gcrma(ab,ai)
> > Adjusting for optical effect..Done.
> > Adjusting for non-specific binding.Error in
> > gcrma.bg.transformation.fast(pms, bhat, var.y, k = k) :
> >         NAs are not allowed in subscripted assignments
>
>As you can see, you cannot have any NAs in your data to use gcrma. An
>alternative to this is to use the MBNI cdf/probe packages that have the
>probes with SNPs in the central 15 base pairs removed. Anything in this
>listing with SNP in the name has these probes removed.
>
>http://brainarray.mbni.med.umich.edu/Brainarray/Database/CustomCDF/CDF_download_v6.asp
>
>Note that there are some downsides to using these cdfs, mainly that the
>standard errors of your estimates will be highly variable, since the
>probesets for these cdfs are quite variable in size (unlike the stock
>affy chip, where the vast majority have 11 probes).
>
>Best,
>
>Jim
>
>
> >
> >>sessionInfo()
> > Version 2.3.1 (2006-06-01)
> > i386-pc-mingw32
> >
> > attached base packages:
> > [1] "splines"   "tools"     "methods"   "stats"     "graphics"  "grDevices"
> > [7] "utils"     "datasets"  "base"
> >
> > other attached packages:
> > mouse4302probe   mouse4302cdf          gcrma    matchprobes           affy
> >       "1.10.0"       "1.10.0"        "2.6.0"        "1.4.0"       "1.12.2"
> >         affyio        Biobase
> >        "1.0.0"       "1.10.1"
> >
> > I have tried using both R versions 2.3.1 and 2.1.0, and gcrma versions
> > 1.1.4 and 2.6.0, and affy versions 1.12.2 and 1.10.0.  I get a similar
> > error when using the rm.mask=TRUE option.
> >
> > My overall goal is to remove select probes from the analysis (in this
> > case, probes that overlap known polymorphisms).  Any thoughts on how
> > best to do this are most appreciated...
> >
> > Cheers,
> > -andrew
> >
> > --
> > Andrew Su, Ph.D.
> > Genomics Institute of the
> >   Novartis Research Foundation
> > asu at gnf.org
> > Tel: 858-812-1656
> > Fax: 858-812-1630
> > http://web.gnf.org
> >
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
>--
>James W. MacDonald
>University of Michigan
>Affymetrix and cDNA Microarray Core
>1500 E Medical Center Drive
>Ann Arbor MI 48109
>734-647-5623
>
>
>
>

Jenny Drnevich, Ph.D.

Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign

330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801
USA

ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at uiuc.edu


