[BioC] Agilent spike-in probes
Naomi Altman
naomi at stat.psu.edu
Sun Mar 30 22:33:49 CEST 2008
In my experience with Agilent Arabidopsis arrays, some of the Agilent
spike-ins bind only to one of the dyes (or bind much more strongly to
one). I always remove the controls before doing differential
expression analysis.
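
A minimal sketch of that filtering step, in base R on made-up data. It
assumes the probe annotation carries Agilent's ControlType column
(0 for ordinary probes, nonzero for controls). The probe names, the
ControlType values assigned to them, and the log-ratios below are
illustrative only, not taken from the arrays discussed in this thread.

```r
# Hypothetical probe annotation. In Agilent Feature Extraction output,
# ControlType is 0 for ordinary probes, 1 for positive controls and
# -1 for negative controls. All values below are invented.
genes <- data.frame(
  ProbeName   = c("A_23_P0001", "(+)E1A_r60_a22", "A_23_P0002",
                  "pos_ctrl_example", "A_23_P0003", "neg_ctrl_example"),
  ControlType = c(0, 1, 0, 1, 0, -1),
  stringsAsFactors = FALSE
)

# Hypothetical log-ratios (M values): 6 probes x 2 arrays.
M <- matrix(c(0.1,  2.5, -0.2,  3.1, 0.0, -1.8,
              0.2,  2.7, -0.1,  2.9, 0.1, -2.0),
            nrow = 6, ncol = 2,
            dimnames = list(genes$ProbeName, c("patient1", "patient2")))

# Keep only ordinary (non-control) probes.
keep <- genes$ControlType == 0
M_filtered     <- M[keep, , drop = FALSE]
genes_filtered <- genes[keep, , drop = FALSE]

# With a limma MAList the same logical index can be applied to the whole
# object, e.g. MA[keep, ], before calling lmFit().
```

Filtering after normalization but before model fitting keeps the
controls available for QC plots while excluding them from the ranked
gene list.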
Naomi
At 08:29 AM 3/30/2008, Sean Davis wrote:
>On Sat, Mar 29, 2008 at 11:39 PM, Srinivas Iyyer
><srini_iyyer_bio at yahoo.com> wrote:
> > dear sean,
> > i apologize for sending this email and attached
> > figures to you. I am not sure if I can send figures as
> > attachment to mailing list. I wanted to see expert
> > opinion on this particular topic because this is first
> > time i am analyzing agilent chip data.
> > Would you please look into my design, code and figures
> > and let me know if this method okay.
> >
> > Spike-in probes are for QC purposes. If so, why am I
> > getting spike-in probes as top candidates? Is there a
> > way to suppress them?
> > Thank you and I appreciate your help.
> >
> >
> >
> > dear group,
> >
> > I have agilent 4x44 (G4112F) chips. the hybs are done
> > as a paired design: samples were obtained from each patient
> > before and after treatment. 40 patients are in the
> > study. each chip was hybridized with before-treatment (cy3)
> > and after-treatment (cy5) rna.
> >
> > I used limma for normalization and to calculate
> > differential expression.
> >
> > in the first step, I did not go for background
> > subtraction and observed a blown-out ma plot.
>
>I'm not sure what "blown-out" means, but Agilent typically does
>background subtraction automatically (you'll need to look at the
>specific image extraction protocol to check). If you use the
>gProcessedSignal and rProcessedSignal (these are not the defaults in
>limma), you will probably get the benefit of their spatially-detrended
>loess background subtraction.
>
> > when i did background subtraction, i observed a more
> > compact ma. For q-q plot points at intersection are
> > not many suggesting that many genes are differentially
> > expressed. (figures are attached)
> >
> > my main concern is that, of the top 100 (from topTable with
> > number=100), most of the probes are spike-in
> > probes: (+)E1A_r60_a22, DCP_22_6, DCP_22_7 and so
> > on.
>
>This could be dye bias, but I'm not sure. You didn't do dye swaps, so
>you cannot separate signal from dye bias. In any case, you will need
>to do some QC. Agilent provides a huge amount of QC and plots on the
>scanner machine. You can always look there to see what they do.
>Also, their technical manuals are pretty good at giving direction
>about the technology and the array data processing.
>
> > These spike-in probes are highly differentially
> > expressed.
> >
> >
> > my targets file
> >
> > filename cy3 cy5
> > patient1 before after
> > patient2 before after
> > ......
> > patient40 before after
> >
> > my design matrix:
> > design <- modelMatrix(targets,ref='before')
> > > design
> > after
> > [1,] 1
> > [2,] 1
> > [3,] 1
> > [4,] 1
> > [5,] 1
> > [6,] 1
> > [7,] 1
> > [8,] 1
> >
> > RG2 <- backgroundCorrect(RG,method='subtract')
> > MA2 <- normalizeWithinArrays(RG2,method='loess')
> > plotDensities(MA2)
> > boxplot(MA2$M~col(MA2$M),names=colnames(MA2$M))
> > MA2a <- normalizeBetweenArrays(MA2,method='scale')
>
>These are two-color arrays. Do you really need to do the
>between-array normalization? You might, but I think you might spend
>some time proving to yourself that is the case.
>
> > fit.b <- lmFit(MA2a,design)
> > fit.b <- eBayes(fit.b)
> > topTable(fit.b,number=50,adjust.method='BH')[,c(5,9,10,11,12,13)]
> >
> > my questions are:
> >
> > 1. for this paired sample (cy3,cy5) design, is my
> > limma model matrix okay?
> > 2. how can I avoid getting spike-ins? I never saw
> > spike-ins getting into the top table before. is there some mistake
> > going on at some place? is it normal for spike-in
> > probes to come up as top differentially expressed probes?
>
>It happens, yes. I would definitely do some QC, though. It doesn't
>look like you have done any in your code here.
>
> > 3. do the attached figures (MA plot and q-q plot)
> > reflect well-normalized data?
>
>The qq plot does not really tell you about normalization. The single
>MA plot looks OK. You will want to look at all of the MA plots and
>some more extensive QC.
>
> > 4. my chip is hgug4112F. I do not see annotation file
> > on bioconductor.
>
>I think the hgug4112a annotation package is what you want. You'll
>want to double-check that with a few lookups to be sure.
>
>Sean
>
>_______________________________________________
>Bioconductor mailing list
>Bioconductor at stat.math.ethz.ch
>https://stat.ethz.ch/mailman/listinfo/bioconductor
>Search the archives:
>http://news.gmane.org/gmane.science.biology.informatics.conductor
Naomi S. Altman 814-865-3791 (voice)
Associate Professor
Dept. of Statistics 814-863-7114 (fax)
Penn State University 814-865-1348 (Statistics)
University Park, PA 16802-2111