[BioC] GenomicFeatures::makeTranscriptDbFromGTF?

Steve Lianoglou mailinglist.honeypot at gmail.com
Thu Feb 9 19:17:29 CET 2012


Hi Tim,

A quick answer to one of your questions, specifically:

> And since it's fresh in my mind, how does one automatically attach the
> appropriate BSgenome to a GenomicRanges?  It seems like it should be
> trivial, and the documentation hints at it, but I haven't succeeded yet in
> doing this automatically.

I have a generic method defined somewhere in one of my utility called
`getBsGenome` which loads the appropriate genome of choice. It's not
bullet proof, but is handy.

The genome is identified by its annotation/accession/build/release,
whatever you call it, where I mostly go by UCSC conveintions, ie.
'hg19', 'mm9', etc.

I use it like this

bsg <- getBsGenome('hg19')

And it will load the BSgenome.Hsapiens.UCSC.hg19 library and return
the `Hsapiens` object (that is also, now, in your workspace).

I define methods for different classes that I have that basically end
up calling down to the base function I pasted below.

Assuming you are on bioc2.9, a potential GenomicRanges impl of the
method would look like so (untested):

setMethod("getBsGenome", c(x="GenomicRanges"), function(x, ...) {
  g <- unique(genome(x))
  if (!is.character(g) || length(g) != 1L || nchar(g) < 1) {
    stop("Expected a single, valid genome identifier in seqinfo slot")
  }
  getBsGenome(g, ...)
})

You would have to add more cases in the switch statement of the base
function when you want to expand your reportoire of organisms:

setMethod("getBsGenome", c(x="character"),
function(x, organism=NULL, anno.source='UCSC', ...) {
  lib.name <- 'BSgenome.:organism:.:anno.source:.:genome:'
  if (is.null(organism)) {
    organism <- switch(substring(x, 1, 2),
                       hg='Hsapiens',
                       mm='Mmusculus',
                       sa='Scerevisiae',
                       dm='Dmelanogaster',
                       rn='Rnorvegicus',
                       ce='Celegans',
                       stop("Unknown genome", x, sep=" "))
  }

  lib.name <- gsub(':organism:', organism, lib.name)
  lib.name <- gsub(':anno.source:', anno.source, lib.name)
  lib.name <- gsub(':genome:', x, lib.name)

  suppressPackageStartupMessages({
    found <- require(lib.name, character.only=TRUE)
  })

  if (!found) {
    stop(lib.name, " package required.")
  }

  get(organism, pos=paste('package', lib.name, sep=":"))
})

HTH,
-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



More information about the Bioconductor mailing list