[Bioc-devel] IGV - a new package in preparation

Fri Mar 9 18:59:08 CET 2018

Thanks, Levi. Your comments, and Gabe’s, are very helpful, and have gotten me to consider things I had overlooked.

Support for GenomicRanges is essential, as you and Gabe point out.

In all cases IGV will convert a GRanges object to an appropriate track, then write it out as a temporary file.  igv supports the bed, gff, gff3, gtf, wig, bigWig, bedGraph, bam, vcf, and seg formats, and a variety of sources: files via http, Google Cloud Storage, and GA4GH; limited support for direct JavaScript data was added recently.  Maybe someday AnnotationHub?

GenomicRanges, as I understand them, are very flexible, and are not subclassed into types the way track formats are.  So I propose that in many cases it will be the user’s responsibility to specify the track type, call the appropriate constructor, and maybe specify column names so that the right scores can be extracted from the mcols, whose names, so far as I know, are not standardized.
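
To make the "user names the score column" idea concrete, here is a rough, language-neutral sketch. It is written in Python rather than R, with plain dicts standing in for a GRanges object and its mcols, and it is not code from the IGV package. The function name write_bedgraph, the dict layout, and the column names log2ratio and pval are all hypothetical. The one fixed fact it relies on is that GRanges coordinates are 1-based and closed while bedGraph is 0-based and half-open, so the start is shifted by one.

```python
import os
import tempfile


def write_bedgraph(ranges, score_column):
    """Write ranges to a temporary bedGraph file and return its path.

    `ranges` is a list of dicts with 1-based, closed coordinates
    (as in GRanges) and a "meta" dict standing in for mcols.
    `score_column` is the metadata column the caller chose as the score.
    """
    # Fail early, before creating any file, if the chosen column is absent.
    for r in ranges:
        if score_column not in r["meta"]:
            raise KeyError(f"no metadata column named {score_column!r}")

    fd, path = tempfile.mkstemp(suffix=".bedGraph")
    with os.fdopen(fd, "w") as out:
        for r in ranges:
            # bedGraph is 0-based, half-open: shift the start, keep the end.
            out.write(
                f'{r["chrom"]}\t{r["start"] - 1}\t{r["end"]}\t'
                f'{r["meta"][score_column]}\n'
            )
    return path


# Hypothetical data: two ranges, each with two candidate score columns.
ranges = [
    {"chrom": "chr1", "start": 101, "end": 200,
     "meta": {"log2ratio": 0.5, "pval": 0.01}},
    {"chrom": "chr1", "start": 301, "end": 400,
     "meta": {"log2ratio": -1.2, "pval": 0.20}},
]
path = write_bedgraph(ranges, score_column="log2ratio")
```

Because nothing in the object says which of log2ratio or pval is "the" score, the caller has to say so, and a wrong name is rejected before any file is written.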

If the GRanges object is too big (greater than a densely packed megabase, for instance), igv works best if the track file is indexed and served up by an index- and CORS-savvy webserver.  Thus IGV should politely fail, or at least issue a warning, when it encounters big tracks.  This “too big” threshold may change over time.
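
The "politely fail, or at least warn" behavior could look roughly like the following sketch (again Python as a neutral illustration, not package code). The names check_track_size and MAX_FEATURES are hypothetical, the limit of 10,000 features is an arbitrary placeholder for the "too big" threshold, and counting features is only one possible proxy for a densely packed megabase. The threshold is a parameter precisely because it may change over time.

```python
import warnings

# Placeholder limit; the real "too big" threshold is undecided.
MAX_FEATURES = 10_000


def check_track_size(n_features, max_features=MAX_FEATURES, strict=False):
    """Return True if the track is small enough to send directly.

    For an oversized track, raise ValueError when `strict` is True;
    otherwise issue a warning and return False.
    """
    if n_features <= max_features:
        return True
    msg = (
        f"track has {n_features} features (limit {max_features}); "
        "consider an indexed file served by a CORS-enabled webserver"
    )
    if strict:
        raise ValueError(msg)
    warnings.warn(msg)
    return False
```

A caller would check the size before writing the temporary file, and choose strict=True where silently sending a huge track would be worse than stopping.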

Reading through Michael’s rtracklayer vignette I came across this:

   The rtracklayer package currently interfaces with the UCSC web-based genome browser. 
   Other packages may provide drivers for other genome browsers through a plugin system.

Can anyone (maybe Michael himself?) comment on how I can evaluate an rtracklayer plugin strategy for igv?  

 - Paul

> On Mar 9, 2018, at 4:15 AM, Levi Waldron <lwaldron.research at gmail.com> wrote:
> 
> On Thu, Mar 8, 2018 at 12:29 AM, Paul Shannon <pshannon at systemsbiology.org> wrote:
> Thanks, Gabe.
> 
> You make an excellent point: bioc objects get first class support.  In some instances, base R data types deserve that also, and data.frames lead the list for me, being useful, concise, universally available, expressive.
> 
> So perhaps not “data.frames replaced by” but “accompanied by” appropriate bioc data types?
> 
>  - Paul
> 
> Definitely +1 for supporting GenomicRanges, including what's in genome() and mcols(). There's a demo of an rtracklayer -> GRanges -> UCSC genome browser workflow in the rtracklayer vignette that I've made use of. I wouldn't necessarily say *don't* support data.frame, but I would certainly encourage Bioc users to import data with rtracklayer instead of generic read* functions, and to take advantage of the vast AnnotationHub and OrganismDbi-based annotations which provide GenomicRanges objects.
> 
> Thanks and looking forward to it!
>