[BioC] Quick start to linking GO terms and microarray data
Giovanni Coppola
gcoppola at ucla.edu
Wed Mar 1 18:11:26 CET 2006
Hello Steffen,
How do I connect to Wormbase?
Thanks,
Giovanni
> listMarts()
[1] "ensembl_mart_37" "vega_mart_37" "snp_mart_37" "msd_mart_4"
[5] "uniprot_mart_17"
> sessionInfo()
R version 2.2.1, 2005-12-20, powerpc-apple-darwin7.9.0
other attached packages:
biomaRt XML RMySQL DBI
"1.4.0" "0.99-6" "0.5-7" "0.1-10"
On Mar 1, 2006, at 5:42 AM, Steffen Durinck wrote:
> Hi,
>
> Next to Ensembl, biomaRt currently includes Wormbase, VEGA, Uniprot
> and msd.
> Soon I expect plants to be represented as well via the Gramene database
> (http://www.gramene.org).
>
> Best,
> Steffen
>
>
> michael watson (IAH-C) wrote:
>
>> Hi Steffen, Wolfgang
>>
>> Thanks a lot, the biomaRt package looks wonderful for the species that
>> are in Ensembl... Are there any functions within it to annotate other
>> species? (e.g. bacteria, plants etc.)
>>
>> Many thanks
>> Mick
>>
>> -----Original Message-----
>> From: Steffen Durinck [mailto:sdurinck at ebi.ac.uk]
>> Sent: 01 March 2006 13:24
>> To: michael watson (IAH-C)
>> Cc: Sean Davis; Bioconductor
>> Subject: Re: [BioC] Quick start to linking GO terms and microarray data
>>
>> Hi Mike,
>>
>> As Wolfgang already suggested, you can do this with the biomaRt package.
>> Here is how you should do this:
>>
>> > library(biomaRt)
>> Loading required package: XML
>> Loading required package: RCurl
>> > mart = useMart("ensembl",dataset="hsapiens_gene_ensembl")
>> Checking attributes and filters ... ok
>> > getGO(id=c(100,620),type="entrezgene",mart=mart)
>>
>>    go_id       go_description                                     evidence_code
>> 1  GO:0004000  adenosine deaminase activity                       TAS
>> 2  GO:0016787  hydrolase activity                                 IEA
>> 3  GO:0009117  nucleotide metabolism                              IEA
>> 4  GO:0009168  purine ribonucleoside monophosphate biosynthesis   IEA
>> 5  GO:0019735  antimicrobial humoral response (sensu Vertebrata)  TAS
>> 6  GO:0006955  immune response                                    IMP
>> 7  GO:0006955  immune response                                    IEA
>> 8  GO:0006163  purine nucleotide metabolism                       IMP
>> 9  GO:0006163  purine nucleotide metabolism                       IEA
>> 10 GO:0005737  cytoplasm                                          IDA
>> 11 GO:0005737  cytoplasm                                          IEA
>> ensembl_gene_id ensembl_transcript_id
>> 1 ENSG00000196839 ENST00000359372
>> 2 ENSG00000196839 ENST00000359372
>> 3 ENSG00000196839 ENST00000359372
>> 4 ENSG00000196839 ENST00000359372
>> 5 ENSG00000196839 ENST00000359372
>> 6 ENSG00000196839 ENST00000359372
>> 7 ENSG00000196839 ENST00000359372
>> 8 ENSG00000196839 ENST00000359372
>> 9 ENSG00000196839 ENST00000359372
>> 10 ENSG00000196839 ENST00000359372
>> 11 ENSG00000196839 ENST00000359372
>>
>>
>> best,
>> Steffen
>>
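The getGO() output above lists some GO terms more than once, with one row per evidence code. A minimal base R sketch, assuming the eleven rows are typed in by hand as the data frame `go` (a name chosen here for illustration), shows one way to collapse them to one row per term and to set aside the electronically inferred (IEA) annotations. It is not part of biomaRt.

```r
# Rows copied from the getGO() output quoted above.
# The data frame name `go` is an illustrative choice, not a biomaRt object.
go <- data.frame(
  go_id = c("GO:0004000", "GO:0016787", "GO:0009117", "GO:0009168",
            "GO:0019735", "GO:0006955", "GO:0006955", "GO:0006163",
            "GO:0006163", "GO:0005737", "GO:0005737"),
  go_description = c("adenosine deaminase activity",
                     "hydrolase activity",
                     "nucleotide metabolism",
                     "purine ribonucleoside monophosphate biosynthesis",
                     "antimicrobial humoral response (sensu Vertebrata)",
                     "immune response", "immune response",
                     "purine nucleotide metabolism",
                     "purine nucleotide metabolism",
                     "cytoplasm", "cytoplasm"),
  evidence_code = c("TAS", "IEA", "IEA", "IEA", "TAS", "IMP", "IEA",
                    "IMP", "IEA", "IDA", "IEA"),
  stringsAsFactors = FALSE
)

# One row per GO term, with its evidence codes pasted into a single string.
go_terms <- aggregate(
  evidence_code ~ go_id + go_description, data = go,
  FUN = function(x) paste(sort(unique(x)), collapse = ",")
)

# Keep only annotations that are not inferred from electronic annotation.
curated <- go[go$evidence_code != "IEA", ]

print(go_terms)
print(curated)
```

Whether to drop IEA rows depends on the analysis; they are kept in `go_terms` so that nothing is lost by default.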
>> michael watson (IAH-C) wrote:
>>
>>> Thanks Sean, but I really wanted to demonstrate this in Bioconductor
>>> :-S
>>>
>>> I tried running the vignettes in goTools. The first time it froze up my
>>> PC for about 30 minutes and then gave out a cryptic message about
>>> coercing x to a list; the second time it froze up my PC and then R
>>> crashed with no warning :-S
>>>
>>> As far as I can tell, GOstats doesn't have any clear examples of simple
>>> mapping of microarray data to GO terms.
>>>
>>> Given that one of the major, fundamental tasks biologists want to do is
>>> find out functional information for significantly differentially
>>> expressed genes, shouldn't this be a little easier, and a little more
>>> transparent, in Bioconductor?
>>>
>>> Again, I ask, does anyone have any simple examples of going from a list
>>> of LocusLink IDs to a list of GO terms? (i.e. GO identifiers and the
>>> biological function/term associated with those identifiers)
>>>
>>> Many thanks
>>> Mick
>>>
>>> -----Original Message-----
>>> From: Sean Davis [mailto:sdavis2 at mail.nih.gov]
>>> Sent: 01 March 2006 11:44
>>> To: michael watson (IAH-C); Bioconductor
>>> Subject: Re: [BioC] Quick start to linking GO terms and microarray data
>>>
>>> On 3/1/06 6:20 AM, "michael watson (IAH-C)" <michael.watson at bbsrc.ac.uk>
>>> wrote:
>>>
>>>> Hi
>>>>
>>>> I want to investigate the GO terms associated with my microarray data
>>>> (normally, a list of genes from topTable() in limma).
>>>>
>>>> I have read the vignettes for goTools and GOstats, and to be honest, I
>>>> am still a little unclear what the overall process is, particularly if I
>>>> am working with a custom array and not with affy or operon.
>>>>
>>>> Let's say, for example, I have my array data in a data.frame containing
>>>> gene names. In a separate data frame I have a link between my gene
>>>> names and LocusLink IDs. How do I:
>>>>
>>>> 1) Find the GO terms associated with subsets of my genes? (I realise I
>>>> can use merge() to link my array data to the LocusLink IDs, but what do
>>>> I do then?)
>>>>
>>>> 2) Find out if a particular GO term is statistically over-represented in
>>>> a particular group?
>>>>
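The merge() step mentioned in question 1 can be carried one step further once a table of LocusLink-to-GO pairs is available from some annotation source. The sketch below uses base R only. All three data frames, their column names, and the IDs in them (including the placeholder GO IDs "GO:X1" to "GO:X3") are made up for illustration and are not real annotations.

```r
# Hypothetical array results: gene names with a log fold change.
array_data <- data.frame(
  gene  = c("geneA", "geneB", "geneC", "geneD"),
  logFC = c(2.1, -1.4, 0.2, 1.8),
  stringsAsFactors = FALSE
)

# Hypothetical link between gene names and LocusLink IDs.
gene2ll <- data.frame(
  gene      = c("geneA", "geneB", "geneC", "geneD"),
  locuslink = c(1001, 1002, 1003, 1004),
  stringsAsFactors = FALSE
)

# Hypothetical LocusLink-to-GO table (one row per gene/term pair).
# In practice this would come from an annotation source, not be typed in.
ll2go <- data.frame(
  locuslink = c(1001, 1001, 1002, 1003),
  go_id     = c("GO:X1", "GO:X2", "GO:X3", "GO:X2"),
  stringsAsFactors = FALSE
)

# Step 1: attach LocusLink IDs to the array data.
annotated <- merge(array_data, gene2ll, by = "gene")

# Step 2: attach GO IDs; all.x = TRUE keeps genes without any GO term (NA).
annotated <- merge(annotated, ll2go, by = "locuslink", all.x = TRUE)

# Step 3: GO terms for a subset of genes, here those with |logFC| > 1.
selected   <- annotated[abs(annotated$logFC) > 1, ]
go_by_gene <- split(selected$go_id, selected$gene)

print(go_by_gene)
```

Because one gene can carry several GO terms, the second merge() produces one row per gene/term pair, so the result has more rows than the original array table.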
>>> Hi, Mick.
>>>
>>> I would take your LocusLink IDs for your genes and dump out two lists to a
>>> text file:
>>>
>>> 1) All LocusIDs on your array.
>>> 2) All LocusIDs in your genelist.
>>>
>>> Then use an external program or web tool such as DAVID/EASE to do the
>>> analysis.
>>>
>>> That said, there was some discussion on using straight LocusIDs (rather
>>> than requiring a metadata package) in GOHyperG. I don't know where that
>>> conversion stands.
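For a single GO term, the two lists described above (all IDs on the array, and the IDs in the gene list) are enough to run a one-sided hypergeometric test in base R. The sketch below is only an illustration of that calculation under made-up data; it does not reproduce what DAVID/EASE or GOHyperG do internally, and the ID vectors and variable names are hypothetical.

```r
# Hypothetical LocusLink IDs, made up for illustration.
array_ids   <- 1:200                 # 1) all IDs on the array
gene_list   <- c(1:15, 101:105)      # 2) IDs in the gene list (subset of array)
go_term_ids <- c(1:12, 150:167)      # IDs annotated with the GO term of interest

N <- length(array_ids)                           # genes on the array
m <- length(intersect(go_term_ids, array_ids))   # of those, genes with the term
n <- length(gene_list)                           # genes in the list
k <- length(intersect(gene_list, go_term_ids))   # genes in the list with the term

# P(X >= k): probability of seeing at least k annotated genes in a list of
# size n drawn at random from the array.
p_value <- phyper(k - 1, m, N - m, n, lower.tail = FALSE)

print(c(N = N, m = m, n = n, k = k))
print(p_value)
```

Using the array, rather than the whole genome, as the background matters: a term can look enriched simply because the array itself is biased towards it. Testing many GO terms this way would also call for a multiple-testing correction, for example with p.adjust().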
>>>
>>> As to your question about linking genes to GO, that is actually done at
>>> the transcript/protein level. Merging to Entrez Gene (LocusLink) happens
>>> after the fact. Using various data sources, you can link by RefSeq,
>>> LocusLink, Ensembl IDs, UCSC Known Genes, human invitational IDs (human),
>>> and probably several others in species other than human.
>>>
>>> Sean
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor