[BioC] Annotation.db: how automatically call a mapping?
Hooiveld, Guido
Guido.Hooiveld at wur.nl
Tue Jun 30 21:54:03 CEST 2009
Hi Jim,
Since your suggestion looks indeed easy and seems to provide everything
I would like to have, a gave it a try (I'll have a further look at
Martin's comments/suggestions later).
However, it seems that 'probes2table' cannot properly handle multiple
contrast defined in limma:
> library(affycoretools)
Loading required package: GO.db
Loading required package: DBI
Loading required package: KEGG.db
> probes2table(eset, featureNames(eset), annotation(eset),
list("p-value"= fit2$p.value, "mean" = fit2$Amean), html = FALSE,
filename = "output")
Loading required package: moe430a.db
Error in aafTable(items = otherdata) :
All columns must be of equal length
> length(fit2$p.value)
[1] 68070
>
> length(fit2$Amean)
[1] 22690
>
I analyzed three contrasts in limma, and 68070/3 is 22690, which exactly
equals the number of probesets on the Affy MOE430A array. This thus
explains the error.
Question: can this easily be solved? Can limma2annaffy handle multiple
contrasts? (at the moment I am not familiar with affycoretools at all).
Thanks,
Guido
> -----Original Message-----
> From: bioconductor-bounces at stat.math.ethz.ch
> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of
> James W. MacDonald
> Sent: 30 June 2009 19:44
> To: Hooiveld, Guido
> Cc: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] Annotation.db: how automatically call a mapping?
>
> Hi Guido,
>
> Hooiveld, Guido wrote:
> > Hi Martin,
> >
> > Indeed, another useful, straigh-forward possibility for mapping.
> > However, I am now facing the problem of properly combining the
> > annotation info with the expression data. This is what I
> would like to
> > do:
> >
> >> Tab_data <- exprs(eset[probeids])
> >> Tab_data <- cbind(Tab_data, fit2$Amean) # to add average
> expression
> >> of
> > LIMMA output
> >> Tab_data <- cbind(Tab_data, fit2$p.value) # to add p-value of LIMMA
> > output
> > etc.
> >
> > This al goes fine, however adding the annotation info
> 'mixes-up' the
> > content of Tab_data; the annotation data replaces the first
> column of
> > Tab_data, and the content of all cells is replaced by 'null'. I
> > suspect it has something to do with the type of object I
> would like to
> > merge, but I am not sure.
> >
> >> map.entrez <- getAnnMap("ENTREZID", annotation(eset))
> map.entrez <-
> >> as.list(map.entrez[probeids])
>
> This sort of thing is going to get really difficult to do by
> hand when you get to things that have a one-to-many
> relationship. And you are already duplicating existing
> efforts with what you have done so far.
>
> If you want to combine annotation data with results data, you
> really want to be using the annaffy package which does lots
> of these things seamlessly. And if you want things to be a
> bit easier, you could consider using the affycoretools
> package as well, which for the most part uses annaffy to
> create output.
>
> You can do what you appear to want in one line:
>
> probes2table(eset, featureNames(eset), annotation(eset),
> list("p-value"= fit2$p.value, "mean" = fit2$Amean), html =
> FALSE, filename = "output")
>
> You might need to play around with the 'anncols' argument to
> get what annotation data you might want.
>
> If you want output specific to the contrasts you have fit,
> see ?limma2annaffy.
>
> Best,
>
> Jim
>
>
>
> >
> >
> >> Tab_data <- cbind(Tab_data, map.entrez)
> > ^ in R this seems to work, but when saved as .txt the content of
> > Tab_data is completely mixed up. Before 'adding' map.entrez
> Tab_dat is
> > OK.
> >
> >
> >> write.table(cbind(rownames(Tab_data2), Tab_data2),
> > file="test_1234.txt", sep="\t", col.names=TRUE, row.names=FALSE)
> >
> >> class(Tab_data)
> > [1] "matrix"
> >> class(map.entrez)
> > [1] "list"
> >
> >
> > Do you, or someone elsr, have a suggestion how to properly
> link these
> > two types of data?
> > Thanks again,
> > Guido
> >
> >
> >
> >
> >
> >> -----Original Message-----
> >> From: bioconductor-bounces at stat.math.ethz.ch
> >> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf
> Of Martin
> >> Morgan
> >> Sent: 30 June 2009 00:00
> >> To: Hooiveld, Guido
> >> Cc: bioconductor at stat.math.ethz.ch
> >> Subject: Re: [BioC] Annotation.db: how automatically call
> a mapping?
> >>
> >> Hooiveld, Guido wrote:
> >>> Hi,
> >>>
> >>> I am facing a problem i cannot solve myselves, despite
> everything i
> >>> read/know. But i assume the solution is easy for the more
> >> knowledgable
> >>> folks in BioC/R...
> >>>
> >>> This does work:
> >>>> library(moe430a.db)
> >>>> xxyy <- moe430aSYMBOL
> >>>> xxyy
> >>> SYMBOL map for chip moe430a (object of class "AnnDbBimap")
> >>>
> >>> However, for this to work you need to know the array type
> >> of the data
> >>> that is analyzed.
> >>>
> >>>
> >>> Now i would like to automatically extract the (e.g.)
> SYMBOL mapping
> >>> from an annotation.db, thus by retrieving the array type
> >> from the eset.
> >>>
> >>>> library(affy)
> >>>> eset <- rma(data)
> >>>> probeids <- featureNames(eset)
> >>>> annotation(eset)
> >>> [1] "moe430a"
> >>>
> >>> But how can i use this info to properly call the SYMBOL mapping?
> >> Hi Guido --
> >>
> >> to get the appropriate map
> >>
> >> library(annotate)
> >> map = getAnnMap("SYMBOL", annotation(eset))
> >>
> >> to select just the relevant probes
> >>
> >> map[probeids]
> >>
> >> toTable(map[probeids]) or as.list(map[probeids]) might be the next
> >> step in the work flow.
> >>
> >> Martin
> >>
> >>>
> >>> I tried this:
> >>>> arraytype <- annotation(eset)
> >>>> arraytype <- paste(arraytype, "db", sep = ".") arraytype
> >>> [1] "moe430a.db"
> >>>> arraytype <- paste("package", arraytype, sep = ":") gh <-
> >>>> ls(arraytype) gh
> >>> [1] "moe430a" "moe430a_dbconn"
> "moe430a_dbfile"
> >>> "moe430a_dbInfo" "moe430a_dbschema" "moe430aACCNUM"
> >>> "moe430aALIAS2PROBE" "moe430aCHR" "moe430aCHRLENGTHS"
> >>> "moe430aCHRLOC"
> >>> [11] "moe430aCHRLOCEND" "moe430aENSEMBL"
> >>> "moe430aENSEMBL2PROBE" "moe430aENTREZID" "moe430aENZYME"
> >>> "moe430aENZYME2PROBE" "moe430aGENENAME" "moe430aGO"
> >>> "moe430aGO2ALLPROBES" "moe430aGO2PROBE"
> >>> [21] "moe430aMAP" "moe430aMAPCOUNTS" "moe430aMGI"
> >>> "moe430aMGI2PROBE" "moe430aORGANISM" "moe430aPATH"
> >>> "moe430aPATH2PROBE" "moe430aPFAM" "moe430aPMID"
> >>> "moe430aPMID2PROBE"
> >>> [31] "moe430aPROSITE" "moe430aREFSEQ" "moe430aSYMBOL"
> >>> "moe430aUNIGENE" "moe430aUNIPROT"
> >>>
> >>>> gh[33]
> >>> [1] "moe430aSYMBOL"
> >>>> symbols <- mget(probeids, gh[33])
> >>> Error in mget(probeids, gh[33]) : second argument must be an
> >>> environment
> >>>
> >>> This also doesn't work:
> >>>> symbols <- mget(probeids, envir=gh[33])
> >>> Error in mget(probeids, envir = gh[33]) :
> >>> second argument must be an environment
> >>>
> >>> My approach thus is the wrong approach to automatically extract
> >>> mappings from a annotation.db.
> >>> Since i don't know about any other possibility, i would
> >> appreciate if
> >>> someone could point me to a working solution.
> >>>
> >>> Thanks,
> >>> Guido
> >>>
> >>>
> >>> ------------------------------------------------
> >>> Guido Hooiveld, PhD
> >>> Nutrition, Metabolism & Genomics Group Division of Human
> Nutrition
> >>> Wageningen University Biotechnion, Bomenweg 2
> >>> NL-6703 HD Wageningen
> >>> the Netherlands
> >>> tel: (+)31 317 485788
> >>> fax: (+)31 317 483342
> >>> internet: http://nutrigene.4t.com <http://nutrigene.4t.com/>
> >>> email: guido.hooiveld at wur.nl
> >>>
> >>>
> >>>
> >>> [[alternative HTML version deleted]]
> >>>
> >>> _______________________________________________
> >>> Bioconductor mailing list
> >>> Bioconductor at stat.math.ethz.ch
> >>> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >>> Search the archives:
> >>> http://news.gmane.org/gmane.science.biology.informatics.conductor
> >> _______________________________________________
> >> Bioconductor mailing list
> >> Bioconductor at stat.math.ethz.ch
> >> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >> Search the archives:
> >> http://news.gmane.org/gmane.science.biology.informatics.conductor
> >>
> >>
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives:
> > http://news.gmane.org/gmane.science.biology.informatics.conductor
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> Douglas Lab
> University of Michigan
> Department of Human Genetics
> 5912 Buhl
> 1241 E. Catherine St.
> Ann Arbor MI 48109-5618
> 734-615-7826
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
More information about the Bioconductor
mailing list