[BioC] How to do Affy ST array analysis

Wed May 7 18:22:58 CEST 2014

Hi Ninni,

No, you need to switch: there is an annotation database for every platform (thanks to Jim MacDonald!). Search here:

http://bioconductor.org/packages/release/BiocViews.html#___AnnotationData

The one for the 1.0 array is here:

http://bioconductor.org/packages/release/data/annotation/html/hugene10sttranscriptcluster.db.html
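
As a rough sketch of what switching looks like: the transcript-cluster annotation packages for the 1.0 and 2.0 arrays follow the same naming pattern, so the code from my earlier mail only needs the package name and the prefix of the maps changed. The helper annotationPkg below is made up for illustration (it is not part of any package), and the last part assumes the annotation package is installed; it is skipped otherwise.

```r
## Hypothetical helper (not from any package): build the name of the
## transcript-cluster annotation package for a HuGene ST array version.
annotationPkg = function(version) {
  paste0("hugene", gsub(".", "", version, fixed = TRUE),
         "sttranscriptcluster.db")
}

pkg10 = annotationPkg("1.0")   # "hugene10sttranscriptcluster.db"
pkg20 = annotationPkg("2.0")   # "hugene20sttranscriptcluster.db"

## Only attempt to load the package if it is installed.
if (requireNamespace(pkg10, quietly = TRUE)) {
  library(pkg10, character.only = TRUE)
  ## The maps carry the package prefix,
  ## e.g. hugene10sttranscriptclusterSYMBOL
  symbolMap = get(sub("\\.db$", "SYMBOL", pkg10))
}
```

In practice you would simply write library("hugene10sttranscriptcluster.db") and replace hugene20 by hugene10 in the map names.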

Best wishes,

Bernd

On Wed, 7 May 2014 18:17:19 +0200
Ninni Nahm <ninninahm at gmail.com> wrote:

> Thank you! That was very very helpful!
> I wanted to ask if I can use the hugene20sttranscriptcluster.db package for
> all hugene arrays? I have one more to analyze, which is a st 1 array.
> Best
> Ninni
> 
> 
> 
> On Wed, May 7, 2014 at 2:57 PM, Bernd Klaus <bernd.klaus at embl.de> wrote:
> 
> > Hi Ninni,
> >
> > I guess a very simple workflow would be:
> >
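> > 1. read the CEL files
> > library(oligo)
> > ## assumption: the CEL files are in the current working directory
> > celFiles = list.celfiles(full.names = TRUE)
> > rawData = read.celfiles(celFiles)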
> >
> > 2. perform RMA and get "transcript cluster" summarized data back,
> >  using only "core" transcript clusters ("safely" annotated genes
> >  according to Affy); target = "core" is the default in oligo.
> >
> > Eset = rma(rawData, target = "core")
> >
> > 3. Load the annotation package and annotate the "transcript clusters" with
> > gene symbols, gene names and Ensembl IDs contained in that package.
> >
> > ## load Annotation package
> > library("hugene20sttranscriptcluster.db")
> >
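> > ## look up column 'what' of the annotation map 'db' for every feature
> > ## in Eset; features without a mapping get the value 'missing'
> > annotateGene = function(db, what, missing) {
> >   tab = toTable(db[intersect(featureNames(Eset), mappedkeys(db))])
> >   mt = match(featureNames(Eset), tab$probe_id)
> >   ifelse(is.na(mt), missing, tab[[what]][mt])
> > }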
> >
> >
> > fData(Eset)$symbol = annotateGene(hugene20sttranscriptclusterSYMBOL,
> >                                   "symbol", missing = NA)
> > fData(Eset)$genename = annotateGene(hugene20sttranscriptclusterGENENAME,
> >                                     "gene_name", missing = NA)
> > fData(Eset)$ensembl = annotateGene(hugene20sttranscriptclusterENSEMBL,
> >                                    "ensembl_id", missing = NA)
> >
> >
> > 4. After that, keep only the "transcript clusters" that have an Ensembl
> > gene ID
> > (for example).
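
A minimal sketch of step 4 above, using a small made-up table in place of fData(Eset) so that it runs without any array data. All IDs and symbols are invented placeholders, not real annotations.

```r
## Toy stand-in for fData(Eset): one row per transcript cluster.
## All IDs below are invented placeholders.
fd = data.frame(
  symbol  = c("geneA", NA, "geneC", NA),
  ensembl = c("ENSG_A", NA, "ENSG_C", "ENSG_D"),
  row.names = c("tc1", "tc2", "tc3", "tc4"),
  stringsAsFactors = FALSE
)

## Keep only transcript clusters that have an Ensembl gene ID.
keep = !is.na(fd$ensembl)
fdAnnotated = fd[keep, ]

## With the real data the same index subsets the ExpressionSet:
##   keep = !is.na(fData(Eset)$ensembl)
##   Eset = Eset[keep, ]
rownames(fdAnnotated)
```

Here tc2 is dropped because it has no Ensembl ID, while tc4 is kept even though its symbol is missing.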
> >
> >
> > Hope that helps,
> >
> > Bernd
> >
> > On Wed,  7 May 2014 05:06:00 -0700 (PDT)
> > "Ninni Nahm [guest]" <guest at bioconductor.org> wrote:
> >
> > >
> > > Hi all!
> > >
> > > I am feeling a little bit stupid, but I have been searching for two days
> > > now (maybe I am searching wrong?!) and could not figure it out.
> > > I want to analyze a Human Gene ST array.
> > > I know that there is the oligo package, and I found the annotation package
> > > pd.hugene.2.0.st, but I do not know how to do the steps. I am used
> > > to the affy package and affy pipelines.
> > > All I find when searching for solutions are ways to make your own
> > > annotation package. That is not necessary, I think, because I found
> > > pd.hugene.2.0.st. Or am I wrong? Somehow I can't use it in the same way
> > > as I do with, for example, the hgu133a.db package that provides me the
> > > annotations.
> > >
> > > I'm really lost...
> > >
> > > I want to do:
> > >
> > > - probe level analysis (similar to affyPLM)
> > > - RMA normalization (somehow oligo does this, I think)
> > > - filter probes that are controls (as one does with affy: AFFX, for
> > >   hgu133a)
> > > - annotation of probesets (normally, I would use the IQR filter to get
> > >   unique Entrez IDs, but how do I do this with the ST array?)
> > >
> > >
> > > I know that there is something about probe and transcript to be aware of,
> > > and core? But I cannot connect the workflow.
> > >
> > > I would be so happy if someone helped me or pointed me to the right docs
> > > (the oligo user guide is not so helpful for me because I still don't
> > > understand what to do with what and when...). Sorry!
> > >
> > > Thanks!
> > >
> > > Ninni
> > >
> > >  -- output of sessionInfo():
> > >
> > > -
> > >
> > > --
> > > Sent via the guest posting facility at bioconductor.org.
> > >