[BioC] custom-made annotation packages

Marc Carlson mcarlson at fhcrc.org
Wed May 28 20:27:57 CEST 2008


Hi Sam,

SQLForge will make almost all of the mappings (GO, KEGG, etc) for you.  
All you need to provide is a single mapping from whatever probe IDs you 
want to use onto the appropriate TAIR IDs.  Then you should only have to 
download two packages and call a single function to make the package 
(read the SQLForge vignette for details).  The ath1121501.db package 
that you are referring to was made using this same function; the only 
difference is that its mapping came from an Affymetrix file.  As long as 
you know which probe goes with which TAIR ID, we should be able to make 
a package from that.

So, for example, you just need a set of IDs (these can be anything you 
want) that are mapped to TAIR IDs (which look like: ATxxxxxxx).  Then 
you need a tab-separated file that looks something like:

myID1    AT1G01040
myID2    AT1G01060
myID3    AT1G01120

etc.
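
If your probe-to-gene pairs are not yet in a file, a minimal sketch 
along the following lines could write them out in the two-column layout 
shown above.  This is only an illustration, not part of SQLForge: the 
function name write_probe_map and the output file name are hypothetical, 
and the ID check assumes the usual AGI locus pattern (AT, then a 
chromosome 1-5, C or M, then G and five digits).  Check the vignette for 
the exact file format the package-building function expects.

```python
import csv
import re

# Assumed AGI locus pattern: "AT", chromosome (1-5, C or M), "G", five digits.
TAIR_ID = re.compile(r"AT[1-5CM]G\d{5}")


def write_probe_map(pairs, path):
    """Write (probe_id, tair_id) pairs as a two-column tab-separated file.

    Pairs whose TAIR ID does not match the assumed pattern are not written;
    they are returned so they can be inspected by hand.
    """
    rejected = []
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for probe_id, tair_id in pairs:
            probe_id = probe_id.strip()
            tair_id = tair_id.strip().upper()  # normalise e.g. "at1g01060"
            if TAIR_ID.fullmatch(tair_id):
                writer.writerow([probe_id, tair_id])
            else:
                rejected.append((probe_id, tair_id))
    return rejected
```

Calling write_probe_map(pairs, "probe2tair.txt") with a list of 
(probe ID, TAIR ID) tuples writes one pair per line and returns any 
pairs it skipped, so malformed IDs can be fixed before building the 
package.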


    Marc




Samuel Wuest wrote:
> Hi Marc, thanks for the quick reply! So you guess it should be rather 
> easy to make such an annotation package?
> Just to be sure I have this right: the ath1121501.db package uses the 
> affy_id as a global identifier (key values), not ATG IDs?
> I'd have to create all the mappings from scratch (the GO-mappings, the 
> KEGG mappings, etc).
>
> Best, Sam
>
> 2008/5/28 Marc Carlson <mcarlson at fhcrc.org>:
>
>     Hi Sam,
>
>     Maybe you should just use SQLForge to quickly make a custom
>     annotation package for yourself.  If you have ATG IDs already
>     mapped to some kind of probeset ID, then this should be
>     straightforward for you.  Nobody has needed to do this yet, so you
>     will be a guinea pig making a package for Arabidopsis, but it
>     should work, and if it works, it should be fairly quick.  Have a
>     look at the Vignette to see the more general use case and let me
>     know if you have questions.
>
>     You can find the vignette here:
>
>     http://bioconductor.org/packages/2.2/bioc/html/AnnotationDbi.html
>
>     All you will need is some sort of mapping from whatever you want
>     to use as probe labels to Arabidopsis "TAIR" gene IDs (these look
>     like: ATxxxxxxx).  This mapping can then be used to make a custom
>     annotation package.
>
>
>       Marc
>
>
>
>     Samuel Wuest wrote:
>
>         Hi all,
>
>         This is a conceptual question on how (if at all) to create
>         custom-made annotations:
>         I am using the Affymetrix platform for Arabidopsis (ATH1), and
>         the newest annotation package (AnnDbBimap objects mapping
>         AffyIDs to e.g. GO-terms) is provided on the Bioconductor page,
>         but:
>         Casneuf et al (BMC Bioinformatics 2007, 8:461) have reannotated
>         the Arabidopsis chip in order to get rid of cross- and
>         nonhybridizing probes, and I am using the custom-made cdf-file
>         to analyze my data. But the Affy_IDs that name the probesets
>         have been replaced by gene accession numbers (Atg-numbers in
>         this case) in the new cdf-file and this makes the annotation
>         from Bioconductor useless to me: the keys used there are
>         Affy_IDs.
>
>         So obviously I have to make new mappings from gene accession
>         numbers to e.g. GO-terms, but that information is available in
>         databases.
>
>         *My questions*: is it worth making a new annotation package for
>         the chip, or could I just create my own environments that
>         contain the mappings (if it's only for my project)? What would
>         be less work and still allow the main analyses (e.g.
>         GO-enrichment etc)/be useful for the community?
>         Also, could I just try to map the gene accession numbers back
>         to the original Affy_IDs and use the provided annotation
>         package?
>         And: is there an easy-to-read manual on how to create annotation
>         packages (I know there are the vignettes, but I am not a
>         bioinformatician)?
>
>         Thanks a million for any feedback, best wishes, Sam


