[Bioc-devel] show method for CompressedVRangesList-class

Robert Castelo robert.castelo at upf.edu
Wed Feb 25 17:12:21 CET 2015


my current reason to prefer a CompressedVRangesList object over a 
SimpleVRangesList object is that i find one order of magnitude 
difference in creation time in each of these classes of objects:

library(VariantAnnotation)

fl <- system.file("extdata", "CEUtrio.vcf.bgz",
                   package="VariantFiltering")

vcf <- readVcf(fl, genome="hg19")
vr <- as(vcf, "VRanges")
length(vr)
[1] 15000

## create a VRangesList object
system.time(vrl <- do.call("VRangesList", split(vr, sampleNames(vr))))
    user  system elapsed
   0.247   0.004   0.252

## create a CompressedVRangesList object
system.time(cvrl <- new("CompressedVRangesList", split(vr, 
sampleNames(vr))))
    user  system elapsed
   0.019   0.000   0.019

0.252/0.019
[1] 13.26316

with a larger vcf differences increase:

[... load vcf, coerce to VRanges ...]
length(vr)
[1] 25916

system.time(vrl <- do.call("VRangesList", split(vr, sampleNames(vr))))
    user  system elapsed
   2.672   0.000   2.676

system.time(cvrl <- new("CompressedVRangesList", split(vr, 
sampleNames(vr))))
    user  system elapsed
   0.014   0.000   0.014

2.676 / 0.014
[1] 191.1429


so maybe i'm using the wrong way to construct a VRangesList object, but 
according to our last conversation about this, there was no obvious 
default fast way to do it, starting from a VRanges object:

https://stat.ethz.ch/pipermail/bioc-devel/2015-January/006905.html

it would be great if there's a fast way to do this kind of construction.

thanks,

robert.

On 02/25/2015 04:42 PM, Michael Lawrence wrote:
> If you're storing data on a relatively small number of individuals (say,
> hundreds), you should use SimpleVRangesList, not CompressedVRangesList.
>
> On Wed, Feb 25, 2015 at 7:10 AM, Robert Castelo <robert.castelo at upf.edu
> <mailto:robert.castelo at upf.edu>> wrote:
>
>     i see you point, the logic i was thinking about is to use a list of
>     VRanges objects to hold separately the variants of multiple
>     individuals, with one VRanges object per individual.
>
>     if i type the name of such a list object on the R shell, having the
>     GRangesList show method, i feel i do not see much information
>     because the screen just scrolls up tens or hundreds of lines
>     specifiying variants per individual. however, the concise appearance
>     of something like a VRangesList:
>
>      > vrl
>     VRangesList of length 10
>     names(32): S1 S2 S3 S4 ... S7 S8 S9 S10
>
>     at least suggests the user that the object holding the variants has
>     information for 10 samples and belongs to the class 'VRangesList'.
>
>     i thought this made general sense but i'm fine if you feel this
>     interpretation does not warrant such a change.
>
>     cheers,
>
>     robert.
>
>     On 02/25/2015 01:25 AM, Michael Lawrence wrote:
>
>         Why not have the SimpleVRangesList be shown like
>         CompressedVRangesList,
>         for consistency with GRangesList? In other words, the opposite
>         of what
>         you propose. A strong argument could also be made that a
>         SimpleGenomicRangesList should be shown like a GRangesList.
>         Unless there
>         is some aversion to the more verbose output....
>
>         On Tue, Feb 24, 2015 at 2:36 PM, Robert Castelo
>         <robert.castelo at upf.edu <mailto:robert.castelo at upf.edu>
>         <mailto:robert.castelo at upf.edu
>         <mailto:robert.castelo at upf.edu>__>> wrote:
>
>              so, yes, but IMO rather than inheriting the show method from a
>              GRangesList, i think that the show method for
>         CompressedVRangesList
>              objects should be inherited from a VRangesList object.
>         right now
>              this is the situation:
>
>              library(VariantAnnotation)
>
>              example(VRangesList)
>              vrl
>              VRangesList of length 2
>              names(2): sampleA sampleB
>
>              cvrl <- new("CompressedVRangesList", split(vr,
>         sampleNames(vr)))
>              cvrl
>              CompressedVRangesList object of length 2:
>              $a
>              VRanges object with 1 range and 1 metadata column:
>                     seqnames    ranges strand         ref              alt
>              totalDepth       refDepth       altDepth
>         <Rle> <IRanges> <Rle> <character> <characterOrRle> <integerOrRle>
>         <integerOrRle> <integerOrRle>
>                 [1]     chr1    [1, 5]      +           T
>              C             12              5              7
>                       sampleNames softFilterMatrix | tumorSpecific
>         <factorOrRle> <matrix> | <logical>
>                 [1]             a             TRUE |         FALSE
>
>              $b
>              VRanges object with 1 range and 1 metadata column:
>                     seqnames   ranges strand ref alt totalDepth refDepth
>         altDepth
>              sampleNames softFilterMatrix |
>                 [1]     chr2 [10, 20]      +   A   T         17       10
>              6           b            FALSE |
>                     tumorSpecific
>                 [1]          TRUE
>
>              -------
>              seqinfo: 2 sequences from an unspecified genome; no seqlengths
>
>              would it be possible to have the VRangesList show method for
>              CompressedVRangesList objects?
>
>              robert.
>
>
>
>              On 2/24/15 7:24 PM, Michael Lawrence wrote:
>
>                  I think you might be missing an import. It should
>             inherit the
>                  method for GRangesList.
>
>                  On Tue, Feb 24, 2015 at 9:53 AM, Robert Castelo
>             <robert.castelo at upf.edu <mailto:robert.castelo at upf.edu>
>             <mailto:robert.castelo at upf.edu
>             <mailto:robert.castelo at upf.edu>__>> wrote:
>
>                      hi,
>
>                      i'm using the CompressedVRangesList class in
>             VariantFiltering
>                      to hold variants and their annotations across
>             multiple samples
>                      and found that there was no show method for this
>             class (unless
>                      i'm missing the right import here) so i made one within
>                      VariantFiltering by copying&pasting from other
>             similar classes:
>
>                      setMethod("show",
>             signature(object="__CompressedVRangesList"),
>                                function(object) {
>                                  lo <- length(object)
>                                  cat(classNameForDisplay(__object), " of
>             length ",
>                      lo, "\n",
>                                      sep = "")
>                                  if (!is.null(names(object)))
>                                    cat(BiocGenerics:::__labeledLine("names",
>                      names(object)))
>                                })
>
>                      i guess, however, that the right home for this would be
>                      VariantAnnotation. let me know if you consider
>             adding it there
>                      (or somewhere else) and i'll remove it from
>             VariantFiltering.
>
>                      thanks,
>
>                      robert.
>
>                      _________________________________________________
>             Bioc-devel at r-project.org <mailto:Bioc-devel at r-project.org>
>             <mailto:Bioc-devel at r-project.__org
>             <mailto:Bioc-devel at r-project.org>>
>                      mailing list
>             https://stat.ethz.ch/mailman/__listinfo/bioc-devel
>             <https://stat.ethz.ch/mailman/listinfo/bioc-devel>
>
>
>
>
>
>     --
>     Robert Castelo, PhD
>     Associate Professor
>     Dept. of Experimental and Health Sciences
>     Universitat Pompeu Fabra (UPF)
>     Barcelona Biomedical Research Park (PRBB)
>     Dr Aiguader 88
>     E-08003 Barcelona, Spain
>     telf: +34.933.160.514 <tel:%2B34.933.160.514>
>     fax: +34.933.160.550 <tel:%2B34.933.160.550>
>
>

-- 
Robert Castelo, PhD
Associate Professor
Dept. of Experimental and Health Sciences
Universitat Pompeu Fabra (UPF)
Barcelona Biomedical Research Park (PRBB)
Dr Aiguader 88
E-08003 Barcelona, Spain
telf: +34.933.160.514
fax: +34.933.160.550



More information about the Bioc-devel mailing list