[BioC] GO's to gene's
Martin Morgan
mtmorgan at fhcrc.org
Mon Mar 1 04:30:34 CET 2010
On 02/28/2010 07:17 PM, Loren Engrav wrote:
> Thank you both
> Given my skills, it might be easier/quicker to do it "manually" with Amigo
> But I am trying both methods
>
> For the second method I get
>
>> library(GO.db)
> Loading required package: AnnotationDbi
> Loading required package: Biobase
>
> Welcome to Bioconductor
>
> Vignettes contain introductory material. To view, type
> 'openVignette()'. To cite Bioconductor, see
> 'citation("Biobase")' and for packages 'citation(pkgname)'.
>
> Loading required package: DBI
>> terms <- Term(GOTERM)
> Error in function (classes, fdef, mtable) :
> unable to find an inherited method for function "Term", for signature
> "GOTermsAnnDbBimap"
>
>> sessionInfo()
> R version 2.9.2 Patched (2009-09-05 r49613)
> i386-apple-darwin9.8.0
>
> locale:
> en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base

Update to R version 2.10 and the associated Bioc packages. Or, for a (much)
slower solution (you'll want to check that Term and Ontology return ids
in identical order), use

  terms = eapply(GOTERM, Term)

etc. I have

> sessionInfo()
R version 2.10.1 Patched (2010-02-23 r51168)
x86_64-unknown-linux-gnu
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] GO.db_2.3.5 RSQLite_0.7-3 DBI_0.2-4
[4] AnnotationDbi_1.8.1 Biobase_2.6.1
loaded via a namespace (and not attached):
[1] tools_2.10.1
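
A rough sketch of the slower route above, in case it helps. filter_terms is a made-up helper name, not something in GO.db. It flattens the lists that eapply returns and lines the ontologies up by GO id, rather than assuming the two calls return ids in the same order. The toy ids below only stand in for real eapply output.

```r
# Sketch only: filter_terms() is a hypothetical helper, not part of GO.db.
# 'terms' and 'ontologies' are lists (or character vectors) named by GO id.
filter_terms <- function(terms, ontologies, pattern, ont = "MF") {
    # eapply() returns a list; flatten each to a named character vector
    terms <- unlist(terms)
    ontologies <- unlist(ontologies)
    # align by GO id instead of trusting both to arrive in the same order
    ontologies <- ontologies[names(terms)]
    stopifnot(!anyNA(ontologies))
    terms[grepl(pattern, terms, fixed = TRUE) & ontologies == ont]
}

# Toy stand-ins for eapply(GOTERM, Term) and eapply(GOTERM, Ontology),
# deliberately given in different orders
toy_terms <- list("GO:0005518" = "collagen binding",
                  "GO:0005581" = "collagen trimer",
                  "GO:0003677" = "DNA binding")
toy_onts <- list("GO:0003677" = "MF",
                 "GO:0005581" = "CC",
                 "GO:0005518" = "MF")
collagen <- filter_terms(toy_terms, toy_onts, "collagen")

# With GO.db loaded, the real call would be along the lines of
#   collagen <- filter_terms(eapply(GOTERM, Term),
#                            eapply(GOTERM, Ontology), "collagen")
```

names(collagen) can then go to mget() against org.Hs.egGO2EG as in the earlier message.
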
Martin
>
>> From: Martin Morgan <mtmorgan at fhcrc.org>
>> Date: Sun, 28 Feb 2010 18:42:33 -0800
>> To: Vincent Carey <stvjc at channing.harvard.edu>
>> Cc: Loren Engrav <engrav at u.washington.edu>, "bioconductor at stat.math.ethz.ch"
>> <bioconductor at stat.math.ethz.ch>
>> Subject: Re: [BioC] GO's to gene's
>>
>> On 02/28/2010 06:14 PM, Vincent Carey wrote:
>>> Perhaps there is a package with such functionality. However, with the
>>> GO.db package in place, you need to do a little
>>> programming, perhaps along the lines of
>>>
>>> querGO = function(str, attr = "definition", ont = "MF") {
>>>     require(GO.db, quietly = TRUE)
>>>     gc = GO_dbconn()
>>>     quer.1 = paste("select go_id, term from go_term where",
>>>                    attr, "like('%")
>>>     quer.2 = "%') and ontology = '"
>>>     quer.3 = "'"
>>>     quer = paste(quer.1, str, quer.2, ont, quer.3, collapse = "",
>>>                  sep = "")
>>>     dbGetQuery(gc, quer)
>>> }
>>>
>>> whereby
>>>
>>>> querGO("collagen", "term")
>>> go_id term
>>> 1 GO:0004656 procollagen-proline 4-dioxygenase activity
>>> 2 GO:0005518 collagen binding
>>> 3 GO:0008475 procollagen-lysine 5-dioxygenase activity
>>> 4 GO:0019797 procollagen-proline 3-dioxygenase activity
>>> 5 GO:0019798 procollagen-proline dioxygenase activity
>>> 6 GO:0033823 procollagen glucosyltransferase activity
>>> 7 GO:0042329 structural constituent of collagen and cuticulin-based cuticle
>>> 8 GO:0050211 procollagen galactosyltransferase activity
>>> 9 GO:0070052 collagen V binding
>>>>
>>
>> Also
>>
>> library(GO.db)
>> terms <- Term(GOTERM) # or maybe Definition(GOTERM) ?
>> ontologies <- Ontology(GOTERM)
>> collagen <- terms[grepl("collagen", terms) & ("MF" == ontologies)]
>>
>> and the next step,
>>
>> library(org.Hs.eg.db)
>> egids <- mget(names(collagen), org.Hs.egGO2EG, ifnotfound=NA)
>> egids <- egids[!is.na(egids)]
>>
>>
>>>
>>> On Sun, Feb 28, 2010 at 8:56 PM, Loren Engrav <engrav at u.washington.edu>
>>> wrote:
>>>> Is there a BioC package that will find all the GO terms containing some
>>>> word, like perhaps "collagen"
>>>> And then find all the genes contained within those found terms
>>>>
>>>> I scanned
>>>> GoProfiles
>>>> GOSemSim
>>>> GOstats
>>>> GoTools and
>>>> TopGO
>>>>
>>>> And could not determine that any would do that.
>>>>
>>>> Thank you.
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at stat.math.ethz.ch
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>
>>>
>>
>>
>> --
>> Martin Morgan
>> Computational Biology / Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N.
>> PO Box 19024 Seattle, WA 98109
>>
>> Location: Arnold Building M1 B861
>> Phone: (206) 667-2793
>