[BioC] Creating an OrganismDbi package with a few transcript annotations
Marc Carlson
mcarlson at fhcrc.org
Fri May 24 02:51:30 CEST 2013
Hi Michael,
As usual, Martin is on the right track. The NewSchema.pdf material only
matters if you really need the old-school bimaps, and bimaps are not
needed for any of the OrganismDbi stuff. So an interface of keytypes,
cols, keys and select really ought to be enough to allow integration
into OrganismDbi...
And if you also followed the same org package DB schema that we use
everywhere else, that would be ideal, since you could then just recycle
the methods we have already defined for OrgDb objects... To that end, a
biomart equivalent of AnnotationForge:::makeOrgDbFromNCBI() and
AnnotationForge:::makeOrgPackageFromNCBI() would be a nice addition.
And I agree that a simple data.frame() based underlying implementation
would make this easier to generalize. Right now things are a bit
specialized for NCBI resources.
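To make that concrete, here is a rough sketch of what a data.frame-backed
select interface might look like. This is only an illustration under some
assumptions: the class name DataFrameDb and the toy table are made up, and
the generics are defined locally so the snippet runs on its own (a real
package would import them from AnnotationDbi instead of redefining them).

```r
library(methods)

# Hypothetical class: wraps a plain data.frame, one row per transcript.
setClass("DataFrameDb", representation(table = "data.frame"))

# Local stand-ins for the four generics; assumed signatures, chosen only
# so that this sketch is self-contained.
setGeneric("keytypes", function(x) standardGeneric("keytypes"))
setGeneric("cols", function(x) standardGeneric("cols"))
setGeneric("keys", function(x, keytype, ...) standardGeneric("keys"))
setGeneric("select",
           function(x, keys, cols, keytype, ...) standardGeneric("select"))

# Every column can serve both as a key type and as a retrievable column.
setMethod("keytypes", "DataFrameDb", function(x) colnames(x@table))
setMethod("cols", "DataFrameDb", function(x) colnames(x@table))

# Distinct values of the requested key column.
setMethod("keys", "DataFrameDb", function(x, keytype, ...) {
    unique(as.character(x@table[[keytype]]))
})

# Rows whose key column matches 'keys', restricted to the key column
# plus the requested columns.
setMethod("select", "DataFrameDb", function(x, keys, cols, keytype, ...) {
    tbl <- x@table
    wanted <- unique(c(keytype, cols))
    hits <- tbl[tbl[[keytype]] %in% keys, wanted, drop = FALSE]
    rownames(hits) <- NULL
    hits
})

# Toy data (invented values, not real annotations).
tx <- data.frame(
    TXNAME = c("tx1", "tx2", "tx3"),
    SYMBOL = c("GENEA", "GENEA", "GENEB"),
    CANONICAL = c(TRUE, FALSE, TRUE),
    stringsAsFactors = FALSE
)
db <- new("DataFrameDb", table = tx)

res <- select(db, keys = c("tx1", "tx3"),
              cols = c("SYMBOL", "CANONICAL"), keytype = "TXNAME")
print(res)
```

Whether something this small is sufficient for OrganismDbi to join against
a TxDb is not something the sketch demonstrates; it only shows the shape of
the four methods.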
Marc
On 05/17/2013 09:56 PM, Michael Lawrence wrote:
> Cool, thanks Martin. I'll wait for Marc to get back. If what you say is
> correct, it would be nice to have a simple data frame implementation. I'm
> getting the annotations from a biomart, so a biomart implementation would
> be ideal, although that might be tricky semantically.
>
> Michael
>
>
>
>
> On Fri, May 17, 2013 at 5:43 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
>
>> On 05/16/2013 01:50 PM, Michael Lawrence wrote:
>>
>>> Hi,
>>>
>>> I'd like to create an OrganismDbi package so that I can put extra
>>> annotations on the transcripts/genes in a TxDb package. My understanding
>>> is
>>> that I need a separate database package that I can join with the TxDb
>>> package. Do I need to make an OrgDb package? I looked into this a bit and
>>> it seems that there is little support for making a non-NCBI-based org
>>> package. Maybe I could create a new type of package with a simple table
>>> with a row for each transcript, including the gene symbol and whether the
>>> transcript is "canonical" according to UCSC. It looks like this process is
>>> documented here:
>>> http://www.bioconductor.org/packages/2.12/bioc/vignettes/AnnotationForge/inst/doc/NewSchema.pdf
>>> .
>>> It also seems really involved. What's the path of least resistance here?
>>>
>> Hi Michael -- Marc is away for a few days. I *think* the idea is that the
>> details in NewSchema are no longer required, rather, implement your extra
>> data in any fashion to provide a 'select' interface, i.e.,
>>
>> keytypes
>> keys
>> cols
>> select
>>
>> following the implied API of ?keytypes. Then create an OrgDb package with
>>
>> AnnotationDbi::makeOrganismPackage
>>
>> Sorry not to be more definitive in my help.
>>
>> Martin
>>
>>
>>> Thanks,
>>> Michael
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>>
>> --
>> Computational Biology / Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N.
>> PO Box 19024 Seattle, WA 98109
>>
>> Location: Arnold Building M1 B861
>> Phone: (206) 667-2793
>>