[Bioc-devel] OrganismDb package for Drosophila.melanogaster
Martin Morgan
martin.morgan at roswellpark.org
Tue Nov 15 20:35:38 CET 2016
On 11/15/2016 02:34 PM, Martin Morgan wrote:
> On 11/15/2016 09:52 AM, Obenchain, Valerie wrote:
>> Hi Pariksheet,
>>
>> On 11/15/2016 03:32 AM, Pariksheet Nanda wrote:
>>> Hi folks,
>>>
>>> It would be great to have an OrganismDb package for
>>> Drosophila.melanogaster, similar to Homo.sapiens, Mus.musculus and
>>> Rattus.norvegicus.
>>>
>>> While trying to do this on my own using the Homo.sapiens package as a
>>> starting point, I found the most similar-looking keys to relate
>>> org.Dm.eg.db and TxDb.Dmelanogaster.UCSC.dm6.ensGene to be "ENSEMBL" and
>>> "GENEID", though there's a ".1" tacked onto the end of each "GENEID",
>>> which makes it harder to supply the graphInfo object to
>>> OrganismDbi:::.loadOrganismDbiPkg:
>>>
>>> > key_ <- function(db, key) sort(as.character(
>>> +     select(db, keys(db, key), key, key)[[key]]))
>>> > key_head <- function(db, key) head(key_(db, key))
>>> > key_head(TxDb.Dmelanogaster.UCSC.dm6.ensGene, "GENEID")
>>> 'select()' returned 1:1 mapping between keys and columns
>>> [1] "FBgn0000003.1" "FBgn0000008.1" "FBgn0000014.1" "FBgn0000015.1"
>>> [5] "FBgn0000017.1" "FBgn0000018.1"
>>> > key_head(org.Dm.eg.db, "ENSEMBL")
>>> [1] "FBgn0000008" "FBgn0000014" "FBgn0000015" "FBgn0000017" "FBgn0000018"
>>> [6] "FBgn0000022"
>>> >
>>>
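One way around the mismatch, sketched here in base R only: strip the
trailing version suffix from the TxDb "GENEID" values before comparing
them with the org.Dm.eg.db "ENSEMBL" keys. The IDs are copied from the
output above; the helper name strip_version is made up for illustration.
This only normalizes character vectors, it does not change the keys
stored in the TxDb itself, so it may not be enough for the graphInfo join.

```r
# Hypothetical helper: drop a trailing ".<digits>" version suffix.
strip_version <- function(ids) sub("\\.[0-9]+$", "", ids)

# A few IDs copied from the two key_head() calls quoted above.
tx_geneid   <- c("FBgn0000003.1", "FBgn0000008.1", "FBgn0000014.1")
org_ensembl <- c("FBgn0000008", "FBgn0000014", "FBgn0000015")

stripped <- strip_version(tx_geneid)

# Which TxDb gene IDs have a counterpart among the org.Dm.eg.db keys?
matched <- stripped %in% org_ensembl
```
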
>>> In other words, like Rattus.norvegicus, it might be good to add a UCSC
>>> "refGene" TxDb package for dm6, as "ensGene" doesn't appear to be as
>>> good a candidate (at least without some ugliness)? I was looking at
>>> creating a dm6 UCSC "refGene" TxDb.
>> You can use GenomicFeatures::makeTxDbFromUCSC() to create the TxDb. The
>> man page, ?makeTxDbFromUCSC, also documents helper functions that
>> display available genomes, tables and tracks.
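A rough sketch of what that could look like, assuming "refGene" is among
the tables UCSC offers for dm6 and that the machine can reach the UCSC
servers. The wrapper name build_dm6_txdb and the output file name are
hypothetical. Because makeTxDbFromUCSC() returns a TxDb that is already
backed by SQLite, saveDb() can write it to disk directly and no separate
MySQL to SQLite conversion step should be needed.

```r
# Sketch only: needs GenomicFeatures and AnnotationDbi installed, plus
# network access to UCSC. The function name and defaults are hypothetical.
build_dm6_txdb <- function(genome = "dm6", tablename = "refGene",
                           file = "dm6.refGene.sqlite") {
    # Show the tables makeTxDbFromUCSC() supports for this genome, so a
    # missing or misspelled table name is easy to spot.
    print(GenomicFeatures::supportedUCSCtables(genome = genome))

    # Download the annotation and build the TxDb object.
    txdb <- GenomicFeatures::makeTxDbFromUCSC(genome = genome,
                                              tablename = tablename)

    # Write the underlying SQLite database to disk for later reuse.
    AnnotationDbi::saveDb(txdb, file = file)
    txdb
}
```

Calling build_dm6_txdb() would then download the data; the saved file
can be reopened later with AnnotationDbi::loadDb().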
>
> I'm not completely sure of the result, but
>
> library(OrganismDbi)
> odb = makeOrganismDbFromUCSC("dm6", tableName="refGene")
                                      ^^tablename
>
> might be most of the way there?
>
> Martin
>
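With the argument name corrected, the call might look like the sketch
below. It is untested against dm6 and assumes UCSC serves a "refGene"
table for that genome; the wrapper name make_dm6_orgdb is hypothetical.

```r
# Sketch only: needs OrganismDbi installed and network access to UCSC.
# make_dm6_orgdb is a hypothetical wrapper name.
make_dm6_orgdb <- function(genome = "dm6", tablename = "refGene") {
    # Note the all-lowercase argument name 'tablename'.
    odb <- OrganismDbi::makeOrganismDbFromUCSC(genome = genome,
                                               tablename = tablename)

    # Print the columns the combined object exposes, as a quick check
    # that both the gene-level and transcript-level resources are linked.
    print(AnnotationDbi::columns(odb))
    odb
}
```

Running odb <- make_dm6_orgdb() would build the object in the current
session; it is not an installable package like Homo.sapiens.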
>>
>> Valerie
>>
>>> I imagine one would query the UCSC public MySQL server and then do the
>>> SQLite conversion. The conversion to SQLite seems a bit finicky,
>>> though, as the datatypes differ between MySQL and SQLite and I'm having
>>> a hard time finding a well-supported tool to do it; I don't want to
>>> introduce errors or harm reproducibility. What do you use for MySQL to
>>> SQLite conversion? Or would it be more sensible for you benevolent
>>> dictators to generate the package(s)?
>>>
>>> Pariksheet
>>>
>>> ---
>>> Pariksheet Nanda
>>> PhD Candidate in Genetics and Genomics
>>> System Administrator, Storrs HPC Cluster
>>> University of Connecticut
>>>
>>> ---
>>> > sessionInfo()
>>> R Under development (unstable) (2016-11-13 r71655)
>>> Platform: x86_64-pc-linux-gnu (64-bit)
>>> Running under: Ubuntu 16.04.1 LTS
>>>
>>> locale:
>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>>> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
>>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
>>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] stats4 parallel stats graphics grDevices utils datasets
>>> [8] methods base
>>>
>>> other attached packages:
>>> [1] Rattus.norvegicus_1.3.1
>>> [2] TxDb.Rnorvegicus.UCSC.rn5.refGene_3.4.0
>>> [3] org.Rn.eg.db_3.4.0
>>> [4] Mus.musculus_1.3.1
>>> [5] TxDb.Mmusculus.UCSC.mm10.knownGene_3.4.0
>>> [6] org.Mm.eg.db_3.4.0
>>> [7] Homo.sapiens_1.3.1
>>> [8] GO.db_3.4.0
>>> [9] OrganismDbi_1.17.1
>>> [10] org.Hs.eg.db_3.4.0
>>> [11] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
>>> [12] org.Dm.eg.db_3.4.0
>>> [13] TxDb.Dmelanogaster.UCSC.dm6.ensGene_3.3.0
>>> [14] GenomicFeatures_1.27.2
>>> [15] AnnotationDbi_1.37.0
>>> [16] Biobase_2.35.0
>>> [17] GenomicRanges_1.27.6
>>> [18] GenomeInfoDb_1.11.4
>>> [19] IRanges_2.9.8
>>> [20] S4Vectors_0.13.2
>>> [21] BiocGenerics_0.21.0
>>> [22] BiocInstaller_1.25.2
>>>
>>> loaded via a namespace (and not attached):
>>> [1] compiler_3.4.0 XVector_0.15.0
>>> [3] bitops_1.0-6 tools_3.4.0
>>> [5] zlibbioc_1.21.0 biomaRt_2.31.1
>>> [7] RSQLite_1.0.0 lattice_0.20-34
>>> [9] Matrix_1.2-7.1 graph_1.53.0
>>> [11] DBI_0.5-1 rtracklayer_1.35.1
>>> [13] Biostrings_2.43.0 grid_3.4.0
>>> [15] XML_3.98-1.5 RBGL_1.51.0
>>> [17] BiocParallel_1.9.1 Rsamtools_1.27.2
>>> [19] GenomicAlignments_1.11.1 SummarizedExperiment_1.5.3
>>> [21] RCurl_1.95-4.8
>>> >
>>>
>>>
>>> _______________________________________________
>>> Bioc-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>
>>
>>
>>
>> This email message may contain legally privileged and/or confidential
>> information. If you are not the intended recipient(s), or the
>> employee or agent responsible for the delivery of this message to the
>> intended recipient(s), you are hereby notified that any disclosure,
>> copying, distribution, or use of this email message is prohibited. If
>> you have received this message in error, please notify the sender
>> immediately by e-mail and delete this email message from your
>> computer. Thank you.