[BioC] How are GO2PROBE built

Sean Davis sdavis2 at mail.nih.gov
Thu Oct 2 17:49:43 CEST 2008


On Thu, Oct 2, 2008 at 8:18 AM, john seers (IFR) <john.seers at bbsrc.ac.uk> wrote:
>
>
> Hi Sean
>
> Turning this into a more general question. Whenever I have to deal with
> a new type of Affymetrix array I seem to have to root around
> Bioconductor packages to find out how it is annotated etc. By the time I
> come around to do it again it has all changed and is done in a different
> way to how it was done before. My difficulty is it all feels a bit ad hoc
> and comes at me in bits and pieces. Also I always feel there is probably
> a better way to do it that I am missing.
>
> Is there information anywhere that gives a better big picture that pulls
> it together a bit? What are the foundation designs/philosophy that all
> the packages are following? Is there a routemap type document that
> describes Bioconductor's approach to all this?

All of the annotation packages from Bioconductor follow the same
scheme and contain (generally, depending on organism) the same
information.  The AnnotationDbi package describes these packages in
detail in two vignettes and the help pages.  In short, though, all the
packages have a "key/value" concept where the key is typically some
gene/probe identifier and the values are annotation associated with
that gene/probe.  Currently (in Bioconductor 2.2 and later), these
packages are implemented as SQLite databases accessed via RSQLite,
with a substantial API built on top of that.  Again, see the
AnnotationDbi documentation and code for details.
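
As a rough illustration of the key/value idea on top of SQLite, and of
how a reverse map such as *GO2PROBE can fall out of a probe -> gene ->
GO chain, here is a minimal Python sketch. The table names, column
names, and identifiers are all made up for the example; this is not the
actual AnnotationDbi schema or API, only a toy under those assumptions.

```python
import sqlite3

# Toy data only: table names, column names, and identifiers are hypothetical
# and do NOT reflect the real schema of Bioconductor annotation packages.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE probe2gene (probe_id TEXT, gene_id TEXT);
    CREATE TABLE gene2go    (gene_id  TEXT, go_id   TEXT);
""")
conn.executemany(
    "INSERT INTO probe2gene VALUES (?, ?)",
    [("probe_A", "gene_1"), ("probe_B", "gene_1"), ("probe_C", "gene_2")],
)
conn.executemany(
    "INSERT INTO gene2go VALUES (?, ?)",
    [("gene_1", "GO_TERM_X"), ("gene_2", "GO_TERM_X"), ("gene_2", "GO_TERM_Y")],
)


def go2probe(connection):
    """Build a GO term -> list of probe ids map by joining through the gene id."""
    rows = connection.execute("""
        SELECT g.go_id, p.probe_id
        FROM gene2go AS g
        JOIN probe2gene AS p ON p.gene_id = g.gene_id
        ORDER BY g.go_id, p.probe_id
    """)
    mapping = {}
    for go_id, probe_id in rows:
        # Key is the GO term; value is every probe reachable via a shared gene.
        mapping.setdefault(go_id, []).append(probe_id)
    return mapping


GO2PROBE = go2probe(conn)
print(GO2PROBE["GO_TERM_X"])
```

The point of the sketch is only that the final probe-to-GO mapping
depends on both intermediate tables, so a change in either source
changes the result.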

Hope that helps.

Sean


> ---
>
> -----Original Message-----
> From: bioconductor-bounces at stat.math.ethz.ch
> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Sean Davis
> Sent: 02 October 2008 11:55
> To: Oura Tomonori
> Cc: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] How are GO2PROBE built
>
> On Thu, Oct 2, 2008 at 3:11 AM, Oura Tomonori <tomonori.oura at gmail.com>
> wrote:
>> Dear BioC,
>>
>> How are the mappings of Affymetrix probe ids to Gene Ontology terms in
>> the metadata packages provided by Bioconductor built?
>>
>> I am trying to use some gene set analysis packages and find that some
>> packages use the *GO2PROBE (e.g. hgu133aGO2PROBE) information, while
>> others use external gene set definitions, such as MSigDB.
>>
>> So I want to know the criteria for selecting specific GO terms among
>> the possible terms for each probe id in Bioconductor.
>> I have already read the documentation for the AnnBuilder package, however.
>
> To make a long story short, the annotations available from affy are
> mapped to Entrez Gene IDs.  Then, the information from Entrez Gene--in
> this case, gene ontology--is mapped to affy id.  The dates associated
> with the data, the source of the data, and how the data are mapped will
> all affect the final mapping of affy ID to gene ontology.  The nice
> thing about gene ontology analyses is that they are typically based on
> "sets" of genes making it much less important to start with EXACTLY the
> same gene ontology mappings.  In fact, in practice, it will be pretty
> difficult to do so.
>
> If you want to see the details of the current Bioconductor annotation
> package build process, you want to read the AnnotationDbi SQLForge
> vignette, as AnnBuilder is outdated.
>
> Finally, if I have misunderstood your question, perhaps you could
> clarify.
>
> Sean
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>


