[BioC] limma - 2-color agilent microarray, GO annotation
Gordon K Smyth
smyth at wehi.EDU.AU
Mon Jan 2 06:43:11 CET 2012
Dear Stephen,
As I understand it, from a Rosetta seminar I attended nearly a decade ago,
Rosetta Resolver fits a statistical model to all the genes simultaneously
that allows it to assign p-values for a single array, i.e., without
replicates. I don't know whether the Rosetta method has been published,
or whether it's purely proprietary, but however it works it seems to me
that it can only be estimating technical variation, given the fact that it
is not using biological replicates in your case. Certainly limma would
refuse to provide p-values for you for single arrays.
Suppose you want to test for genes that tend to be differentially
expressed between mutants and controls in general. This could be analysed
in limma as a simple replicated comparison (Section 8.1.1 of the User's
Guide), but using duplicateCorrelation with block=Cy3 to estimate a
correlation between the pairs of arrays sharing the same control.
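As a rough sketch of the analysis described above, the following runs the intercept-only comparison with duplicateCorrelation blocking on the shared Cy3 control. The log-ratios here are simulated only so that the example is self-contained; with real data the matrix M would be replaced by the MAList from read.maimages(..., source="agilent") followed by normalizeWithinArrays(). The object names, the simulated values and the design column name are all assumptions for illustration.

```r
library(limma)

set.seed(1)
n_genes <- 200

# Array layout taken from the original question
targets <- data.frame(
  Cy3 = c("C1", "C1", "C2", "C2"),
  Cy5 = c("M1", "M2", "M3", "M4")
)

# Simulated Cy5/Cy3 log-ratios (stand-in for real normalized data).
# Arrays sharing a control share that control's noise, which is what
# induces the within-block correlation.
ctrl <- matrix(rnorm(n_genes * 2, sd = 0.5), n_genes, 2)
mut  <- matrix(rnorm(n_genes * 4, sd = 0.5), n_genes, 4)
M <- mut - ctrl[, c(1, 1, 2, 2)]
rownames(M) <- paste0("gene", seq_len(n_genes))
colnames(M) <- paste0("Array", 1:4)

# Simple replicated comparison: one coefficient, the average
# mutant-vs-control log-ratio
design <- matrix(1, nrow = 4, ncol = 1,
                 dimnames = list(NULL, "MutantVsControl"))

# Estimate the correlation between arrays sharing the same control
corfit <- duplicateCorrelation(M, design, block = targets$Cy3)

fit <- lmFit(M, design, block = targets$Cy3,
             correlation = corfit$consensus.correlation)
fit <- eBayes(fit)
top <- topTable(fit, coef = 1, number = 10)
```

Any gene-specific dye effect remains part of the MutantVsControl coefficient in this fit, since the design has no dye swap to separate the two.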
You can't do anything about gene-specific dye effects. They will be
confounded with differential expression results. Given the experimental
design, you just have to live with it.
You don't need a special annotation library for the array. All you need
is an annotation column that you can map to gene symbols, then use the
Bioconductor organism package for your species (e.g. org.Hs.eg.db) to map
from symbols to GO terms. limma will probably read in annotation columns
for you automatically. The GOstats package can help with mapping to GO
terms. Or else just write gene identifiers to a file and input them
offline into the DAVID tool.
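A minimal sketch of the symbol-to-GO mapping, assuming a human array (org.Hs.eg.db is used purely as the example organism package; substitute the one for your species) and assuming the gene symbols have already been pulled out of the annotation column. The three symbols and the output file are placeholders, not anything from the actual array.

```r
library(AnnotationDbi)
library(org.Hs.eg.db)

# Placeholder symbols; in practice these would come from the annotation
# column of the topTable output
symbols <- c("TP53", "BRCA1", "EGFR")

# One row per symbol/GO term pair; asking for "GO" also returns the
# EVIDENCE and ONTOLOGY columns
go_map <- AnnotationDbi::select(org.Hs.eg.db,
                                keys = symbols,
                                keytype = "SYMBOL",
                                columns = "GO")
head(go_map)

# Alternatively, write the identifiers to a file for upload to DAVID
out_file <- tempfile(fileext = ".txt")
writeLines(unique(symbols), out_file)
```

The select() call is written with the AnnotationDbi:: prefix because other commonly loaded packages (dplyr, for instance) export a function of the same name.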
Best wishes
Gordon
> Date: Tue, 20 Dec 2011 15:04:03 -0500
> From: Stephen Turner <vustephen at gmail.com>
> To: bioconductor at r-project.org
> Subject: [BioC] limma - 2-color agilent microarray, GO annotation
>
> I'm working with a group that had an outside vendor run four Agilent
> 2-color oligo arrays for two control samples labeled with Cy3 (C1, C2) and
> four mutants labeled with Cy5 (M1, M2, M3, M4).
>
> Array 1: C1 vs M1
> Array 2: C1 vs M2
> Array 3: C2 vs M3
> Array 4: C2 vs M4
>
> The outside vendor sent them four spreadsheets, one for each array,
> containing normalized Cy5/Cy3 log-ratios, Cy5/Cy3 fold changes, sequence
> description and p-values. I don't know what they did to calculate a p-value
> for each n=1 comparison on each array like this. The only information that
> the company that did her analysis gave her about what they did was that
> they "used Rosetta Resolver to export gene list with log ratios, fold
> changes, and p-values." But this was done separately for each of the four
> arrays.
>
> My first question: using this arrangement of samples on the arrays, is it
> possible to directly compare all the controls versus all the mutants? What
> about confounding with the dye (there was no dye swap)? Are C1vsM1 and
> C1vsM2 true bio replicates because C1 is used in both comparisons (same
> with the C2 vsM3/M4 comparison)?
>
> The vendor sent "raw" data from Agilent Feature Extraction software, and it
> looks like I may be able to read this into R using read.maimages, but I
> couldn't find an annotation package. The platform is a two color Agilent
> array, design #028005. I checked the BioC annotation pages (
> http://www.bioconductor.org/packages/release/data/annotation/) and there
> were a few other agilent chips there, but not this one.
>
> The second question: After I run the analysis, I want to annotate the
> topTable genes with GO terms and do a GO enrichment analysis. How might I
> go about doing this without an annotation package?
>
> One last unrelated question: is it appropriate to send a relevant job
> opening announcement over the mailing list?
>
> Thanks in advance for any help.
>
> Stephen
>
> -----------------------------------------
> Stephen D. Turner, Ph.D.
> Bioinformatics Core Director
> Department of Public Health Sciences
> University of Virginia School of Medicine