[BioC] Agilent spike-in probes

Sun Mar 30 22:33:49 CEST 2008

In my experience with Agilent Arabidopsis arrays, some of the Agilent 
spike-ins bind only to one of the dyes (or bind much more strongly to 
one).  I always remove the controls before doing differential 
expression analysis.
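
A minimal sketch of that filtering step in limma, assuming the data were read with read.maimages(source="agilent") so that the probe annotation carries a ControlType column (0 for regular features, non-zero for controls). The toy MAList, the probe names "probe1" to "probe20", and the ControlType values assigned to the spike-ins here are made up for illustration only.

```r
library(limma)

# Toy stand-in for a normalized MAList: 20 regular probes plus 3 spike-ins.
# All values and the "probeN" names are hypothetical.
set.seed(1)
arrays <- paste0("patient", 1:4)
genes <- data.frame(
  ProbeName   = c(paste0("probe", 1:20),
                  "(+)E1A_r60_a22", "DCP_22_6", "DCP_22_7"),
  ControlType = c(rep(0, 20), 1, 1, 1),
  stringsAsFactors = FALSE
)
M <- matrix(rnorm(23 * 4), nrow = 23, dimnames = list(NULL, arrays))
A <- matrix(runif(23 * 4, 6, 14), nrow = 23, dimnames = list(NULL, arrays))
MA <- new("MAList", list(M = M, A = A, genes = genes))

# Flag control features and drop them; row subsetting of an MAList
# keeps M, A and the genes annotation in step with each other.
is.ctrl   <- MA$genes$ControlType != 0
MA.noctrl <- MA[!is.ctrl, ]

dim(MA.noctrl$M)
```

The filtered object can then be passed to lmFit() and eBayes() in place of the unfiltered one, so the spike-ins never enter the ranked list.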

Naomi

At 08:29 AM 3/30/2008, Sean Davis wrote:
>On Sat, Mar 29, 2008 at 11:39 PM, Srinivas Iyyer
><srini_iyyer_bio at yahoo.com> wrote:
> > dear sean,
> >  i apologize for sending this email and attached
> >  figures to you. I am not sure if I can send figures as
> >  attachment to mailing list.  I wanted to see expert
> >  opinion on this particular topic because this is first
> >  time i am analyzing agilent chip data.
> >  Would you please look into my design, code and figures
> >  and let me know if this method is okay.
> >
> >  Spike-in probes are for QC purposes; if so, why am I
> >  getting spike-in probes as top candidates? Is there a
> >  way to suppress them?
> >  Thank you and I appreciate your help.
> >
> >
> >
> >  dear group,
> >
> >  I have agilent 4x44 (G4112F) chips.  the hybs are done
> >  as a paired design. samples were obtained from each patient
> >  before and after treatment.  40 patients are in the
> >  study. each chip was hybridized with before-treatment (cy3)
> >  and after-treatment (cy5) rna.
> >
> >  I used LIMMA for normalization and to calculate
> >  differential expression.
> >
> >  in the first step, I did not go for background
> >  subtraction and observed a blown-out ma plot.
>
>I'm not sure what "blown-out" means, but Agilent typically does
>background subtraction automatically (you'll need to look at the
>specific image extraction protocol to check).  If you use the
>gProcessedSignal and rProcessedSignal (these are not the defaults in
>limma), you will probably get the benefit of their spatially-detrended
>loess background subtraction.
>
> >  when i did background subtraction, i observed a more
> >  compact ma. For q-q plot points at intersection are
> >  not many suggesting that many genes are differentially
> >  expressed. (figures are attached)
> >
> >  my main concern is, of top100 (from toptable
> >  number=100), most of the probesets are spikein
> >  probesets. (+)E1A_r60_a22 , DCP_22_6,DCP_22_7 and so
> >  on.
>
>This could be dye bias, but I'm not sure.  You didn't do dye swaps, so
>you cannot separate signal from dye bias.  In any case, you will need
>to do some QC.  Agilent provides a huge amount of QC and plots on the
>scanner machine.  You can always look there to see what they do.
>Also, their technical manuals are pretty good at giving direction
>about the technology and the array data processing.
>
> >  These spike-in probes are highly differentially
> >  expressed.
> >
> >
> >  my targets file
> >
> >  filename   cy3  cy5
> >  patient1  before after
> >  patient2  before after
> >  ......
> >  patient40 before after
> >
> >  my design matrix:
> >  design <- modelMatrix(targets,ref='before')
> >  > design
> >       after
> >   [1,]      1
> >   [2,]      1
> >   [3,]      1
> >   [4,]      1
> >   [5,]      1
> >   [6,]      1
> >   [7,]      1
> >   [8,]      1
> >
> >  RG2 <- backgroundCorrect(RG,method='subtract')
> >  MA2 <- normalizeWithinArrays(RG2,method='loess')
> >  plotDensities(MA2)
> >  boxplot(MA2$M~col(MA2$M),names=colnames(MA2$M))
> >  MA2a <- normalizeBetweenArrays(MA2,method='scale')
>
>These are two-color arrays.  Do you really need to do the
>between-array normalization?  You might, but I think you might spend
>some time proving to yourself that is the case.
>
> >  fit.b <- lmFit(MA2a,design)
> >  fit.b <- eBayes(fit.b)
> >  topTable(fit.b,number=50,adjust.method='BH')[,c(5,9,10,11,12,13)]
> >
> >  my questions are:
> >
> >  1. for this paired sample (cy3,cy5) design, is my
> >  limma model matrix okay.
> >  2. how to avoid getting spike-in . I never saw
> >  spike-in getting into top-table. is there some mistake
> >  going on at some place. is it normal for spike-in
> >  probes to come as top differentially expressed probes.
>
>It happens, yes.  I would definitely do some QC, though.  It doesn't
>look like you have done any in your code here.
>
> >  3. do the attached figures (MA plot and q-q plot)
> >  reflect well-normalized data.
>
>The qq plot does not really tell you about normalization.  The single
>MA plot looks OK.  You will want to look at all of the MA plots and
>some more extensive QC.
>
> >  4. my chip is hgug4112F. I do not see annotation file
> >  on bioconductor.
>
>I think the hgug4112a annotation package is what you want.  You'll
>want to double-check that with a few lookups to be sure.
>
>Sean
>

Naomi S. Altman                                814-865-3791 (voice)
Associate Professor
Dept. of Statistics                              814-863-7114 (fax)
Penn State University                         814-865-1348 (Statistics)
University Park, PA 16802-2111