[BioC] affy - Annotation of Affymetrix PrimeView chip
James W. MacDonald
jmacdon at uw.edu
Wed Oct 23 17:38:42 CEST 2013
Hi Adam,
On Wednesday, October 23, 2013 10:44:34 AM, Adam Olejnik wrote:
> Dear all
> I am new to analyzing microarrays with R.
> I am working on dataset GSE41960.
> As far as I know, the RMA method summarises probes into genes; however, when I
> run limma and use topTable I get probe IDs instead of gene names.
> Where in the pipeline should I merge the probes?
As you note, RMA summarizes probes, although into probesets rather than
genes. Probesets are intended to interrogate transcripts, which are
certainly not genes. However, most people end up collapsing transcripts
back to gene IDs, so maybe the distinction is not relevant here.
But to answer your question, you summarize when you run rma(). I am
assuming you are doing something like
library(affy)
dat <- ReadAffy()
eset <- rma(dat)
In which case the ExpressionSet object called 'eset' now contains the
summarized data, and limma knows what to do with it.
So if you fit a model and end up with an MArrayLM object
library(limma)
fit <- lmFit(eset, design)
fit2 <- contrasts.fit(fit, contrast)
fit2 <- eBayes(fit2)
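The 'design' and 'contrast' objects above are not defined in this message, so here is a minimal sketch of how they might be built. Everything in it is an assumption for illustration: the two-group layout (three control and three treated arrays), the group names, and the simulated matrix standing in for the real ExpressionSet. The actual sample layout of GSE41960 will differ.

```r
library(limma)

set.seed(1)

# Hypothetical layout: 3 control and 3 treated arrays (not the real GSE41960 design)
group <- factor(rep(c("control", "treated"), each = 3))

# Simulated log2 expression values standing in for the rma() output
expr <- matrix(rnorm(100 * 6), nrow = 100,
               dimnames = list(paste0("probeset_", 1:100),
                               paste0("array_", 1:6)))

# One column per group, then a single treated-vs-control contrast
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)
contrast <- makeContrasts(treated - control, levels = design)

# Same three steps as above; lmFit() also accepts a plain matrix
fit <- lmFit(expr, design)
fit2 <- contrasts.fit(fit, contrast)
fit2 <- eBayes(fit2)

top <- topTable(fit2, coef = 1, number = 5)
print(top)
```

With a real ExpressionSet you would pass 'eset' to lmFit() instead of the simulated matrix, and build 'group' from the sample information.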
You can then annotate these data using a primeview.db package. But note
that there isn't such a thing on the BioC website. It doesn't really
matter, as it is simple to create using the AnnotationForge package.
All you need to do is go to the Affy website, and get the primeview
annotation csv file
http://www.affymetrix.com/Auth/analysis/downloads/na32/ivt/PrimeView.na32.annot.csv.zip
Install the AnnotationForge and human.db0 packages, and then do what I
recommend here:
https://stat.ethz.ch/pipermail/bioconductor/attachments/20130711/7e4d77fb/attachment.pl
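The linked message is not reproduced here, so the following is only a sketch of what a package-building call could look like, not necessarily the exact call given there. It assumes the zip file has been extracted to a csv, that AnnotationForge and human.db0 are installed, and that "gbNRef" is a suitable baseMapType for the Affymetrix annotation file. The wrapper name build_primeview_db and the version string are made up for illustration.

```r
# Hypothetical wrapper around AnnotationForge::makeDBPackage().
# annot_csv: path to the unzipped PrimeView.na32.annot.csv file
# out_dir:   directory in which the primeview.db source package is written
build_primeview_db <- function(annot_csv, out_dir = ".") {
  AnnotationForge::makeDBPackage(
    "HUMANCHIP_DB",               # schema for a human expression chip
    affy = TRUE,                  # the input is an Affymetrix annotation csv
    prefix = "primeview",         # package will be named primeview.db
    fileName = annot_csv,
    baseMapType = "gbNRef",       # assumption: IDs are GenBank or RefSeq accessions
    outputDir = out_dir,
    version = "0.0.1",            # arbitrary version for a locally built package
    manufacturer = "Affymetrix",
    chipName = "PrimeView",
    manufacturerUrl = "http://www.affymetrix.com"
  )
}

# Example call (not run here, since it needs the csv file on disk):
# build_primeview_db("PrimeView.na32.annot.csv")
```

If the call succeeds, a primeview.db directory should appear in out_dir, ready to be installed from source.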
You can then do
install.packages("primeview.db", type="source", repos=NULL)
and then
library(primeview.db)
gns <- select(primeview.db, featureNames(eset), c("SYMBOL","GENENAME"))
and if you don't get a warning about 1:many mappings you can do
fit2$genes <- gns
otherwise you can be super naive and just take the first mapping
available
fit2$genes <- gns[!duplicated(gns[,1]),]
and then topTable(fit2, coef=1) will have annotated genes in it.
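To make the 1:many case concrete, here is a small base R sketch using a made-up annotation table. The probeset IDs and gene symbols are hypothetical, and the data frame only mimics the shape of what select() returns. It shows the "first mapping" approach from above next to one possible alternative that keeps every symbol.

```r
# Hypothetical probeset IDs, in the order they appear in the ExpressionSet
probe_ids <- c("ps_3", "ps_1", "ps_2")

# Hypothetical select() output: ps_2 maps to two genes, so it has two rows
gns <- data.frame(
  PROBEID = c("ps_3", "ps_1", "ps_2", "ps_2"),
  SYMBOL  = c("GENE_D", "GENE_A", "GENE_B", "GENE_C"),
  stringsAsFactors = FALSE
)

# Option 1 (the naive approach): keep only the first row per probeset
first_hit <- gns[!duplicated(gns$PROBEID), ]

# Option 2: keep all symbols, pasted into one string per probeset
all_symbols <- vapply(split(gns$SYMBOL, gns$PROBEID),
                      paste, character(1), collapse = ";")
collapsed <- data.frame(PROBEID = probe_ids,
                        SYMBOL = unname(all_symbols[probe_ids]),
                        stringsAsFactors = FALSE)

# Either way, make sure the rows line up with the probeset order
# before assigning the table to fit2$genes
first_hit <- first_hit[match(probe_ids, first_hit$PROBEID), ]
stopifnot(identical(first_hit$PROBEID, probe_ids))
stopifnot(identical(collapsed$PROBEID, probe_ids))
```

Which option is appropriate depends on how you plan to interpret probesets that hit more than one gene; dropping all but the first mapping is simple but discards information.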
>
> The second question is about the CDF file. I know these files are used for
> description and mapping of the probes. But I have not figured out how to use
> them with affy and limma.
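This happens automatically. The affy package loads the CDF package that
matches the chip type recorded in your CEL files, so you don't need to
handle the CDF yourself.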
Best,
Jim
>
> I am familiar with the user guides for affy and limma.
>
> Many thanks in advance
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099