[Bioc-devel] [BioC] lookUp for other terminologies

Martin Morgan mtmorgan at fhcrc.org
Mon Mar 7 17:57:13 CET 2011


On 03/07/2011 07:50 AM, Matthew Pocock wrote:
> I guess the ideal situation for me would be if I have a mapping
> entrez_id <-> do_id that I can publish this somewhere well-known and
> then every annotation that has entrez IDs would get annotations to the
> DO ids grandfathered in. I don't know how well that would fit into the
> general way bioconductor does things though. More generally, if there is
> a mapping from probe_id <-> foo_id provided as a foo.do package, and I
> have a mapping foo_id <-> bar_id, it would be nice if there was a method
> where I could register my mapping, and from then on, probe_id <-> bar_id
> would be available. Perhaps this is asking too much.

I wonder if you could be a bit more specific about the tasks you're
trying to accomplish? It sounds like you have an AnnDb object of some
kind, and can use the AnnDb API to manipulate it, e.g., mget("sym",
mymap). The problem seems to be in use of functions, like lookUp in this
particular case but probably in many other places, that try to construct
the map from character strings "DO" and "CHILDREN".

One could modify getAnnMap to also look, as a last resort, on the
search() path for the named symbol of appropriate type. Maybe this
addresses the symptom rather than the problem.
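
A rough base-R sketch of that last-resort idea, not a patch to getAnnMap:
getMapWithFallback, its 'primary' and 'isMap' arguments, and the "DOID:1"
style keys are all hypothetical names made up for illustration. A plain
environment stands in for a Bimap, and 'primary' stands in for the existing
package-based lookup.

```r
## Hypothetical helper; not part of annotate or AnnotationDbi.
## 'primary' stands in for the existing package-based lookup and
## 'isMap' for a type check such as is(obj, "Bimap").
getMapWithFallback <- function(name, primary, isMap = is.environment,
                               envir = parent.frame()) {
    force(envir)  # pin down the caller's environment before any nested calls
    tryCatch(primary(name), error = function(err) {
        ## Last resort: the caller's environment, its enclosures, and
        ## then everything on search().
        if (exists(name, envir = envir)) {
            obj <- get(name, envir = envir)
            if (isMap(obj)) return(obj)
        }
        stop("no map named '", name, "' found", call. = FALSE)
    })
}

## Stand-in for a user-built map (made-up identifiers).
DOCHILDREN <- new.env()
assign("DOID:1", c("DOID:2", "DOID:3"), envir = DOCHILDREN)

## Simulate the package lookup failing, then fall back.
noPackage <- function(name) stop("package not found")
m <- getMapWithFallback("DOCHILDREN", primary = noPackage)
mget("DOID:1", envir = m, ifnotfound = NA)
```

The type check matters: without it, any object that happens to share the
name would be handed back as if it were a map.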

Also, fwiw, the 'envir' referenced below is just a variable name; the
mget method defined in AnnotationDbi dispatches on that variable to
handle AnnDb objects.
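
To illustrate dispatch on the 'envir' argument in general terms, here is a
minimal S4 sketch. ToyMap is a made-up class, not an AnnotationDbi class,
and the method body is only an assumption about how such a lookup might
behave; it is not the AnnotationDbi implementation.

```r
library(methods)

## Made-up class holding a named list; a stand-in for a map object.
setClass("ToyMap", representation(data = "list"))

## Turn base::mget into an S4 generic so methods can dispatch on 'envir'.
setGeneric("mget")

setMethod("mget", signature(envir = "ToyMap"),
    function(x, envir, mode = "any", ifnotfound, inherits = FALSE) {
        found <- envir@data[x]
        names(found) <- x
        absent <- vapply(found, is.null, logical(1))
        if (any(absent)) {
            if (missing(ifnotfound))
                stop("value for '", x[absent][1], "' not found")
            found[absent] <- list(ifnotfound)
        }
        found
    })

toy <- new("ToyMap", data = list(a = 1, b = 2))
mget(c("a", "zz"), toy, ifnotfound = NA)  # dispatches on class of 'envir'
```

Ordinary environments still go through the default method, so existing
calls such as mget("k", envir = someEnv) are unaffected.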

Martin

> 
> Matthew
> 
> On 7 March 2011 14:01, Vincent Carey <stvjc at channing.harvard.edu> wrote:
> 
>     On Mon, Mar 7, 2011 at 8:05 AM, Vincent Carey
>     <stvjc at channing.harvard.edu> wrote:
>     > I am going to move this thread to bioc-devel and comment further there.
>     >
>     > On Mon, Mar 7, 2011 at 3:55 AM, Matthew Pocock
>     > <turingatemyhamster at gmail.com> wrote:
>     >> Thanks Martin. I do have DO.db installed, and from the R shell I can type:
>     >>
>     >>> getAnnMap("CHILDREN", "DO")
>     >> CHILDREN map for DO (object of class "AnnDbBimap")
>     >>
>     >> However, it was complaining about an environment.
> 
>     A reference was made to lookUp() which is legacy code defined in the
>     annotate package; this package has been superseded in various respects
>     by AnnotationDbi.  I would imagine that new code should avoid using
>     lookUp.
> 
>     The sort of substitution that is sought is to have something like
>     hu6800DO that would function just like the existing hu6800GO.  As
>     noted by Martin, the actual functionality of hu6800GO is dependent
>     both on its class structure and the fact that it is defined in the
>     hu6800.db package.  The documentation of getAnnMap also makes clear
>     the relationship to the package repertory.
> 
>     The most natural way for hu6800DO to be produced would be to extend
>     SQLforge tasks so that it is produced as a part of hu6800.db next time
>     around -- this would depend on the existence of a DO->Entrez mapping;
>     anyone with such a mapping could follow the SQLforge instructions to
>     build this for a given chip/organism.
> 
>     I can understand the desire to take software that uses a *GO map, and
>     "drop in" a *DO map.  It does not seem possible to construct a *DO map
>     (that one could drop in) outside the SQLforge packaging discipline and
>     it may be worth your while to just bite the bullet and do this.
>     Alternatively it might be cost-effective to tweak the legacy code to
>     accept a list (perhaps formalized with S4) that defines the mapping.
> 
>     It seems relatively easy to build a FlatBimap on the fly, but no get
>     or mget method seems to be defined.  I don't know if this is
>     intentional, but if we want to allow more lightweight (less
>     package-dependent) use of Bimaps, perhaps this is a way to go.
> 
>     >>
>     >>> showMethods("mget")
>     >> Function: mget (package base)
>     >> x="ANY", envir="AnnDbBimap"
>     >> x="ANY", envir="ANY"
>     >> x="character", envir="DOTermsAnnDbBimap"
>     >>    (inherited from: x="ANY", envir="AnnDbBimap")
>     >> x="character", envir="GOTermsAnnDbBimap"
>     >>    (inherited from: x="ANY", envir="AnnDbBimap")
>     >> x="character", envir="ProbeAnnDbBimap"
>     >>    (inherited from: x="ANY", envir="AnnDbBimap")
>     >> x="character", envir="ProbeGo3AnnDbBimap"
>     >>    (inherited from: x="ANY", envir="AnnDbBimap")
>     >>
>     >> So, there's a version of mget that uses the environment
>     >> "DOTermsAnnDbBimap". I am guessing that this is the environment
>     >> that it can't find.
>     >>
>     >> Thanks for the hint - I'll track down and read that vignette. In the
>     >> meantime, I'm writing code to do the lookup I need as a special case.
>     >>
>     >> Matthew
>     >>
>     >> On 7 March 2011 01:47, Martin Morgan <mtmorgan at fhcrc.org> wrote:
>     >>
>     >>> On 03/06/2011 09:34 AM, Matthew Pocock wrote:
>     >>>
>     >>> > Error in mget(x, envir = getAnnMap(what, chip = data, load = load),
>     >>> > ifnotfound = NA) :
>     >>> >   error in evaluating the argument 'envir' in selecting a method for
>     >>> > function 'mget'
>     >>> >> traceback()
>     >>> > 3: mget(x, envir = getAnnMap(what, chip = data, load = load), ifnotfound = NA)
>     >>> > 2: lookUp(id, annotation, extension) at gtDO.R#144
>     >>> >
>     >>> > So it looks like getAnnMap is not happy. I've looked at the source
>     >>> > for getAnnMap and have to own up to being mystified by it.
>     >>>
>     >>> Hi Matthew -- if I
>     >>>
>     >>>  getAnnMap("CHILDREN", "DO")
>     >>>
>     >>> then getAnnMap is going to look in search() for an entry that
>     >>> might look like "package:DO.db" and then use that to find a map
>     >>> DOCHILDREN. If it can't find the package in search(), it'll try
>     >>> to load it.
>     >>>
>     >>> What this means, I think, is that you'd want to create not just a
>     >>> bimap but a package that contains the bimap. I think this is covered
>     >>> in the AnnotationDbi package vignette "SQLForge: ...easy...". Not
>     >>> sure this is the most straight-forward path to your objective...
>     >>>
>     >>> Martin
>     >>>
>     >>>
>     >>> >
>     >>> > I think the closest 'drop-in' replacement I can find in DO for
>     >>> > lookUp is the Term() function, but I may be confused about what
>     >>> > these two different functions do.
>     >>> >
>     >>> > Thanks,
>     >>> >
>     >>> > Matthew
>     >>> >
>     >>>
>     >>>
>     >>> --
>     >>> Computational Biology
>     >>> Fred Hutchinson Cancer Research Center
>     >>> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
>     >>>
>     >>> Location: M1-B861
>     >>> Telephone: 206 667-2793
>     >>>
>     >>
>     >>
>     >>
>     >> --
>     >> Matthew Pocock
>     >> mailto: turingatemyhamster at gmail.com
>     >> gchat: turingatemyhamster at gmail.com
>     >> msn: matthew_pocock at yahoo.co.uk
>     >> irc.freenode.net: drdozer
>     >> (0191) 2566550
>     >>
>     >>
>     >
> 
> 
> 
> 
> 


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793


