[BioC] Gene Ontology: Shortest path from root to node
Marc Carlson
mcarlson at fhcrc.org
Mon Jan 14 20:09:09 CET 2013
Hi Nicos,
You could use the GO.db package to get at this. In there you will find
an object called GOBPANCESTOR, which acts like a classic R environment
object and can be used with the get() function to pull out all the
ancestor terms of a given term, all the way back to the root.
So for your example you could have done this:
library(GO.db)
get("GO:0008150", GOBPANCESTOR)
And you can see that the only ancestor to this term is in fact the root
node: "all"
What about terms further down? Well the same trick works for all the
terms to get their ancestor terms:
get("GO:0006955", GOBPANCESTOR)
So you probably want to do something a bit like this:
length(get("GO:0006955", GOBPANCESTOR))
And (for example) compare that to:
length(get("GO:0008150", GOBPANCESTOR))
etc.
Of course it's all a little bit more complicated than that because the
gene ontologies are actually DAGs (so a term can have more than one route
back to the root node), and so your ancestor list may be longer than
just the simple path back to the "all" node. In fact, this is true in the
example I gave above for the term further down, "GO:0006955", which
has two routes back to the root node, and hence its "distance" (as
hinted at by length) has been inflated by one in this case.
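If you want the actual shortest route rather than an ancestor count, one
option is to walk upward through the direct parents one level at a time
(a breadth-first search) and stop at the first level that contains the
root. Below is a minimal sketch of that idea, not a tested GO.db recipe.
The function name shortest_path_to_root and the toy term IDs ("T:A",
"T:B", "T:C") are made up for illustration, and the sketch assumes the
parent links come as a named list of character vectors, which is what
as.list(GOBPPARENTS) should give you for the BP ontology.

```r
# Breadth-first search upward from `term` through direct-parent links.
# `parents`: named list; names are term IDs, values are character
# vectors holding the direct parents of that term.
# Returns the number of edges on the shortest route to `root`,
# or NA if the root cannot be reached.
shortest_path_to_root <- function(term, parents, root = "GO:0008150") {
  if (identical(term, root)) return(0L)
  visited  <- term
  frontier <- term
  depth    <- 0L
  while (length(frontier) > 0L) {
    depth <- depth + 1L
    # Direct parents of every term on the current level.
    nxt <- as.character(unlist(parents[frontier], use.names = FALSE))
    nxt <- unique(nxt[!is.na(nxt)])
    if (root %in% nxt) return(depth)
    # Only move on to terms that have not been seen yet.
    frontier <- setdiff(nxt, visited)
    visited  <- c(visited, frontier)
  }
  NA_integer_
}

# Toy DAG with hypothetical IDs: T:C reaches the root either via
# T:B -> T:A (3 steps) or directly via T:A (2 steps).
toy_parents <- list(
  "GO:0008150" = "all",
  "T:A" = "GO:0008150",
  "T:B" = "T:A",
  "T:C" = c("T:B", "T:A")
)
shortest_path_to_root("T:C", toy_parents)  # 2

# With GO.db (assumption: GOBPPARENTS maps each term to its direct parents):
# library(GO.db)
# bp_parents <- as.list(GOBPPARENTS)
# shortest_path_to_root("GO:0006955", bp_parents)
```

Note that this counts edges up to GO:0008150, not up to the artificial
"all" node, so the root itself has distance 0.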
Anyhow, I hope this helps,
Marc
On 01/14/2013 07:47 AM, WoA [guest] wrote:
> Given some GO BP terms for a gene I wish to find out, which of the terms has more specific meaning. I wish to find out the length of the shortest path between the BP Root term(GO:0008150) and the given term. Is there any suitable way to do that using any R package?
>
> Like something equivalent to:
> my $length = $node->lengthOfShortestPathToRoot;
>
> in Perl's "GO-TermFinder" package.
>
> Thanks in advance
>
> -- output of sessionInfo():
>
>> sessionInfo()
> R version 2.13.1 (2011-07-08)
> Platform: i386-pc-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> --
> Sent via the guest posting facility at bioconductor.org.
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor