[BioC] annotations for Codelink arrays
John Zhang
jzhang at jimmy.harvard.edu
Mon Oct 17 15:39:33 CEST 2005
>So in this case, if some probes map to different Entrez Gene IDs (as
>is the case for some of the MULTIPLE probes on these chips, at least with
>the company mappings), then only one of the Entrez Gene IDs (the smallest)
>will be taken. I will have to check the company's mappings for these
>probes to Entrez Gene, or maybe not use them at all and rely on the
>AnnBuilder method (the best way, I think).
One-to-many mappings are always a problem as far as annotation is concerned.
AnnBuilder makes a choice (which may not be the best one) for the user when there
are multiple Entrez Gene mappings for a given probe id. I would like to invite
comments on the best way of handling this situation.
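
To make the discussion concrete, here is a minimal base-R sketch of the "keep the smallest Entrez Gene ID" rule described in the quoted message. It is not AnnBuilder's own code. The table probe2gene, the probe ids and the gene ids are all made up for illustration.

```r
# Hypothetical probe-to-Entrez Gene table; probe ids and gene ids are invented.
probe2gene <- data.frame(
  probe  = c("P1", "P1", "P2", "P3", "P3", "P3"),
  entrez = c(24225L, 1234L, 5678L, 900L, 450L, 777L),
  stringsAsFactors = FALSE
)

# Keep the numerically smallest Entrez Gene ID for each probe.
smallest <- tapply(probe2gene$entrez, probe2gene$probe, min)

# Record which probes had more than one candidate so they can be checked by hand.
n_candidates <- tapply(probe2gene$entrez, probe2gene$probe, length)
ambiguous <- names(n_candidates)[n_candidates > 1]

print(smallest)
print(ambiguous)
```

Keeping the list of ambiguous probes alongside the collapsed mapping makes it possible to review the choice later instead of losing the alternatives silently.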
>
>But how can I use a mixture of GenBank ids (for most of the probes) and
>UniGene ids (for some of them)? If I use "gb" as baseMapType I will not
>get the mappings for the UniGene ids. If I use "ug", the same happens for the
>GenBank ids. I cannot use the UniGene ids in otherSrc because it can only
>use Entrez ids. I worked on this a little with no good result. This is
>briefly what I do:
Currently there is no parser that handles both GB and UniGene ids. I will look into
writing one. The workaround for now is probably to map by GB and UG separately
and then merge the results.
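
As a rough illustration of the merge step, the base-R sketch below combines two probe-to-Entrez tables. It assumes each run has already been reduced to a two-column table. The data frames by_gb and by_ug and their contents are hypothetical, and preferring the GenBank-based row when a probe appears in both is only one possible rule.

```r
# Hypothetical results of two separate runs: one keyed on GenBank ids,
# one keyed on UniGene ids. All probe and gene ids are invented.
by_gb <- data.frame(probe = c("P1", "P2"),
                    entrez = c("1234", "5678"),
                    stringsAsFactors = FALSE)
by_ug <- data.frame(probe = c("P2", "P3", "P4"),
                    entrez = c("9999", "450", NA),
                    stringsAsFactors = FALSE)

# Remember where each row came from, then stack the two tables.
by_gb$source <- "gb"
by_ug$source <- "ug"
merged <- rbind(by_gb, by_ug)

# If a probe occurs in both tables, keep the first row, i.e. the GenBank one
# (an assumption made for this sketch, not a recommendation).
merged <- merged[!duplicated(merged$probe), ]

print(merged)
```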
>
>gb.txt: File with mappings from probe ids to GenBank ids.
>Sometimes I used a file ll.txt with mappings from probe ids to LocusLink
>ids (mappings from the company) in otherSrc.
It is always a good idea to include otherSrc. AnnBuilder has a voting mechanism
that takes the mapping with the most votes from the different sources.
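
The idea of voting can be sketched in base R as follows. This is a simplified illustration and not AnnBuilder's implementation. The votes table, the source labels and the helper pick_majority are hypothetical, and the tie-breaking rule (the id that sorts first wins) is an assumption of the sketch.

```r
# Hypothetical candidate mappings: each row is one source's vote for a probe.
votes <- data.frame(
  probe  = c("P1", "P1", "P1", "P2", "P2"),
  source = c("gb", "company", "ug", "gb", "company"),
  entrez = c("1234", "1234", "9999", "5678", "5678"),
  stringsAsFactors = FALSE
)

# Return the id proposed by the most sources. On a tie, which.max() picks the
# first entry of the table, which is the id that sorts first.
pick_majority <- function(ids) {
  counts <- table(ids)
  names(counts)[which.max(counts)]
}

consensus <- tapply(votes$entrez, votes$probe, pick_majority)
print(consensus)
```

With more sources supplied through otherSrc, a single disagreeing source is less likely to decide the final mapping.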
>
>> library(AnnBuilder)
>> myBase <- file.path("gb.txt")
>> myBaseType <- "gb"
>> mySrcUrls <- getSrcUrl("all", organism="Rattus norvegicus")
>> myDir <- tempdir()
>> ABPkgBuilder(baseName=myBase, srcUrls=mySrcUrls, baseMapType=myBaseType,
>> pkgPath=myDir, organism="Rattus norvegicus", ... other parameters ...)
>
>
>Thank you again for your help. I think this package is great and the best
>way to deal with the nightmare of annotations out there.
>
>D.
>
>
>> >
>> >Thanks.
>> >
>> >D.
>> >
>> >On 13/10/2005, at 3:14, Robert Gentleman wrote:
>> >
>> >> Hi Tao,
>> >> If the right set of mappings is available to get started, AnnBuilder
>> >> is pretty easy to use. We can help you with the first one or two, and
>> >> are happy to distribute them. If there is more widespread interest
>> >> (and they are stable) we can add them to the build process.
>> >>
>> >> Robert
>> >>
>> >> Shi, Tao wrote:
>> >>
>> >>> Any plans to create annotation packages for Codelink arrays?
>> >>>
>> >>> ...Tao
>> >>>
>> >>> _______________________________________________
>> >>> Bioconductor mailing list
>> >>> Bioconductor at stat.math.ethz.ch
>> >>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> >>>
>> >>>
>> >>
>> >> --
>> >> Robert Gentleman, PhD
>> >> Program in Computational Biology
>> >> Division of Public Health Sciences
>> >> Fred Hutchinson Cancer Research Center
>> >> 1100 Fairview Ave. N, M2-B876
>> >> PO Box 19024
>> >> Seattle, Washington 98109-1024
>> >> 206-667-7700
>> >> rgentlem at fhcrc.org
>> >>
>> >
>>
>> Jianhua Zhang
>> Department of Medical Oncology
>> Dana-Farber Cancer Institute
>> 44 Binney Street
>> Boston, MA 02115-6084
>>
Jianhua Zhang
Department of Medical Oncology
Dana-Farber Cancer Institute
44 Binney Street
Boston, MA 02115-6084