[Bioc-devel] show method for CompressedVRangesList-class
Robert Castelo
robert.castelo at upf.edu
Wed Feb 25 17:12:21 CET 2015
my current reason to prefer a CompressedVRangesList object over a
SimpleVRangesList object is that i find at least one order of magnitude
difference in creation time between these two classes of objects:
library(VariantAnnotation)
fl <- system.file("extdata", "CEUtrio.vcf.bgz",
package="VariantFiltering")
vcf <- readVcf(fl, genome="hg19")
vr <- as(vcf, "VRanges")
length(vr)
[1] 15000
## create a VRangesList object
system.time(vrl <- do.call("VRangesList", split(vr, sampleNames(vr))))
user system elapsed
0.247 0.004 0.252
## create a CompressedVRangesList object
system.time(cvrl <- new("CompressedVRangesList", split(vr,
sampleNames(vr))))
user system elapsed
0.019 0.000 0.019
0.252/0.019
[1] 13.26316
with a larger vcf the difference increases:
[... load vcf, coerce to VRanges ...]
length(vr)
[1] 25916
system.time(vrl <- do.call("VRangesList", split(vr, sampleNames(vr))))
user system elapsed
2.672 0.000 2.676
system.time(cvrl <- new("CompressedVRangesList", split(vr,
sampleNames(vr))))
user system elapsed
0.014 0.000 0.014
2.676 / 0.014
[1] 191.1429
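a single system.time() call is noisy at these time scales, so here is a minimal sketch for repeating both constructions and reporting the median elapsed times and their ratio. the helper name elapsed_ratio() and its 'times' argument are made up for this illustration and are not part of any Bioconductor package; the usage in the trailing comment assumes the 'vr' object created above:

```r
## hypothetical helper, base R only: time two zero-argument functions
## 'times' times each and compare their median elapsed times
elapsed_ratio <- function(slow, fast, times = 5L) {
  stopifnot(is.function(slow), is.function(fast), times >= 1L)
  median_elapsed <- function(f) {
    median(vapply(seq_len(times),
                  function(i) system.time(f())[["elapsed"]],
                  numeric(1)))
  }
  s <- median_elapsed(slow)
  f <- median_elapsed(fast)
  ## the ratio is Inf or NaN when the fast call is below timer resolution
  c(slow = s, fast = f, ratio = s / f)
}

## usage with the two constructions timed above (needs 'vr'):
## elapsed_ratio(
##   function() do.call("VRangesList", split(vr, sampleNames(vr))),
##   function() new("CompressedVRangesList", split(vr, sampleNames(vr))))
```

the median is used here only because it is less sensitive than the mean to an occasional slow run (for instance one that triggers garbage collection).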
so maybe i'm constructing the VRangesList object the wrong way, but
according to our last conversation about this, there was no obvious
default fast way to do it, starting from a VRanges object:
https://stat.ethz.ch/pipermail/bioc-devel/2015-January/006905.html
it would be great if there's a fast way to do this kind of construction.
thanks,
robert.
On 02/25/2015 04:42 PM, Michael Lawrence wrote:
> If you're storing data on a relatively small number of individuals (say,
> hundreds), you should use SimpleVRangesList, not CompressedVRangesList.
>
> On Wed, Feb 25, 2015 at 7:10 AM, Robert Castelo <robert.castelo at upf.edu>
> wrote:
>
> i see your point, the logic i was thinking about is to use a list of
> VRanges objects to hold separately the variants of multiple
> individuals, with one VRanges object per individual.
>
> if i type the name of such a list object on the R shell, having the
> GRangesList show method, i feel i do not see much information
> because the screen just scrolls up tens or hundreds of lines
> specifying variants per individual. however, the concise appearance
> of something like a VRangesList:
>
> > vrl
> VRangesList of length 10
> names(10): S1 S2 S3 S4 ... S7 S8 S9 S10
>
> at least suggests to the user that the object holding the variants has
> information for 10 samples and belongs to the class 'VRangesList'.
>
> i thought this made general sense but i'm fine if you feel this
> interpretation does not warrant such a change.
>
> cheers,
>
> robert.
>
> On 02/25/2015 01:25 AM, Michael Lawrence wrote:
>
> Why not have the SimpleVRangesList be shown like CompressedVRangesList,
> for consistency with GRangesList? In other words, the opposite of what
> you propose. A strong argument could also be made that a
> SimpleGenomicRangesList should be shown like a GRangesList. Unless there
> is some aversion to the more verbose output....
>
> On Tue, Feb 24, 2015 at 2:36 PM, Robert Castelo
> <robert.castelo at upf.edu> wrote:
>
> so, yes, but IMO rather than inheriting the show method from a
> GRangesList, i think that the show method for CompressedVRangesList
> objects should be inherited from a VRangesList object. right now
> this is the situation:
>
> library(VariantAnnotation)
>
> example(VRangesList)
> vrl
> VRangesList of length 2
> names(2): sampleA sampleB
>
> cvrl <- new("CompressedVRangesList", split(vr,
> sampleNames(vr)))
> cvrl
> CompressedVRangesList object of length 2:
> $a
> VRanges object with 1 range and 1 metadata column:
>       seqnames    ranges strand         ref              alt     totalDepth       refDepth       altDepth
>          <Rle> <IRanges>  <Rle> <character> <characterOrRle> <integerOrRle> <integerOrRle> <integerOrRle>
>   [1]     chr1    [1, 5]      +           T                C             12              5              7
>        sampleNames softFilterMatrix | tumorSpecific
>      <factorOrRle>         <matrix> |     <logical>
>   [1]             a             TRUE |         FALSE
>
> $b
> VRanges object with 1 range and 1 metadata column:
>       seqnames   ranges strand ref alt totalDepth refDepth altDepth sampleNames softFilterMatrix |
>   [1]     chr2 [10, 20]      +   A   T         17       10        6           b            FALSE |
>       tumorSpecific
>   [1]          TRUE
>
> -------
> seqinfo: 2 sequences from an unspecified genome; no seqlengths
>
> would it be possible to have the VRangesList show method for
> CompressedVRangesList objects?
>
> robert.
>
>
>
> On 2/24/15 7:24 PM, Michael Lawrence wrote:
>
> I think you might be missing an import. It should inherit the
> method for GRangesList.
>
> On Tue, Feb 24, 2015 at 9:53 AM, Robert Castelo
> <robert.castelo at upf.edu> wrote:
>
> hi,
>
> i'm using the CompressedVRangesList class in VariantFiltering
> to hold variants and their annotations across multiple samples
> and found that there was no show method for this class (unless
> i'm missing the right import here) so i made one within
> VariantFiltering by copying&pasting from other similar classes:
>
> setMethod("show",
>           signature(object="CompressedVRangesList"),
>           function(object) {
>             lo <- length(object)
>             cat(classNameForDisplay(object), " of length ", lo, "\n",
>                 sep = "")
>             if (!is.null(names(object)))
>               cat(BiocGenerics:::labeledLine("names", names(object)))
>           })
>
> i guess, however, that the right home for this would be
> VariantAnnotation. let me know if you consider adding it there
> (or somewhere else) and i'll remove it from VariantFiltering.
>
> thanks,
>
> robert.
>
> --
> Robert Castelo, PhD
> Associate Professor
> Dept. of Experimental and Health Sciences
> Universitat Pompeu Fabra (UPF)
> Barcelona Biomedical Research Park (PRBB)
> Dr Aiguader 88
> E-08003 Barcelona, Spain
> telf: +34.933.160.514
> fax: +34.933.160.550
>
>
--
Robert Castelo, PhD
Associate Professor
Dept. of Experimental and Health Sciences
Universitat Pompeu Fabra (UPF)
Barcelona Biomedical Research Park (PRBB)
Dr Aiguader 88
E-08003 Barcelona, Spain
telf: +34.933.160.514
fax: +34.933.160.550