[Bioc-devel] writeVcf performance

Wed Sep 10 00:09:00 CEST 2014

Hi Herve,

This unit test passes in VA 1.11.30 (the current version in svn). It was 
related to writeVcf(), not the IRanges/S4Vector stuff. My fault, not yours.

Val

On 09/09/2014 02:47 PM, Hervé Pagès wrote:
> Hi Val,
>
> On 09/09/2014 02:12 PM, Valerie Obenchain wrote:
>> Writing 'list' data has been fixed in 1.11.30. fyi, Herve is in the
>> process of moving SimpleList and DataFrame from IRanges to S4Vectors;
>> finished up today I think.
>
> I fixed VariantAnnotation's NAMESPACE this morning but 'R CMD check'
> failed for me because of an unit test error in test_VRanges_vcf().
> Here is how to quickly reproduce:
>
>    library(VariantAnnotation)
>    library(RUnit)
>    source("path/to/VariantAnnotation/inst/unitTests/test_VRanges-class.R")
>
>    dest <- tempfile()
>    vr <- make_TARGET_VRanges_vcf()
>    writeVcf(vr, dest)
>    vcf <- readVcf(dest, genome = "hg19")
>    perm <- c(1, 7, 8, 4, 2, 10)
>    vcf.vr <- as(vcf, "VRanges")[perm]
>    genome(vr) <- "hg19"
>    checkIdenticalVCF(vr, vcf.vr)  # Error in checkIdentical(orig, vcf) :
> FALSE
>
> Hard for me to tell whether this is related to DataFrame moving from
> IRanges to S4Vectors or to a regression in writeVcf(). Do you think
> you can have a look? Thanks for the help and sorry for the trouble.
>
> H.
>
>> Anyhow, if you get VariantAnnotation from svn
>> you'll need to update S4Vectors, IRanges and GenomicRanges (and maybe
>> rtracklayer).
>>
>> I'm working on the 'chunking' approach next. It looks like we can still
>> gain from adding that.
>>
>> Valerie
>>
>> On 09/08/2014 12:00 PM, Valerie Obenchain wrote:
>>> fyi Martin found a bug with the treatment of list data (ie, Number =
>>> '.') in the header. Working on a fix ...
>>>
>>> Val
>>>
>>>
>>> On 09/08/2014 08:43 AM, Gabe Becker wrote:
>>>> Val,
>>>>
>>>> That is great. I'll check this out and test it on our end.
>>>>
>>>> ~G
>>>>
>>>> On Mon, Sep 8, 2014 at 8:38 AM, Valerie Obenchain <vobencha at fhcrc.org
>>>> <mailto:vobencha at fhcrc.org>> wrote:
>>>>
>>>>     The new writeVcf code is in 1.11.28.
>>>>
>>>>     Using the illumina file you suggested, geno fields only, writing
>>>> now
>>>>     takes about 17 minutes.
>>>>
>>>>      > hdr
>>>>     class: VCFHeader
>>>>     samples(1): NA12877
>>>>     meta(6): fileformat ApplyRecalibration ... reference source
>>>>     fixed(1): FILTER
>>>>     info(22): AC AF ... culprit set
>>>>     geno(8): GT GQX ... PL VF
>>>>
>>>>      > param = ScanVcfParam(info=NA)
>>>>      > vcf = readVcf(fl, "", param=param)
>>>>      > dim(vcf)
>>>>     [1] 51612762        1
>>>>
>>>>      > system.time(writeVcf(vcf, "out.vcf"))
>>>>          user   system  elapsed
>>>>       971.032    6.568 1004.593
>>>>
>>>>     In 1.11.28, parsing of geno data was moved to C. If this didn't
>>>>     speed things up enough we were planning to implement 'chunking'
>>>>     through the VCF and/or move the parsing of info to C, however, it
>>>>     looks like geno was the bottleneck.
>>>>
>>>>     I've tested a number of samples/fields combinations in files
>>>> with >=
>>>>     .5 million rows and the improvement over writeVcf() in release is
>>>> ~ 90%.
>>>>
>>>>     Valerie
>>>>
>>>>
>>>>
>>>>
>>>>     On 09/04/14 15:28, Valerie Obenchain wrote:
>>>>
>>>>         Thanks Gabe. I should have something for you on Monday.
>>>>
>>>>         Val
>>>>
>>>>
>>>>         On 09/04/2014 01:56 PM, Gabe Becker wrote:
>>>>
>>>>             Val and Martin,
>>>>
>>>>             Apologies for the delay.
>>>>
>>>>             We realized that the Illumina platinum genome vcf files
>>>> make
>>>>             a good test
>>>>             case, assuming you strip out all the info (info=NA when
>>>>             reading it into
>>>>             R) stuff.
>>>>
>>>>
>>>> ftp://platgene:G3n3s4me@ussd-__ftp.illumina.com/NA12877_S1.__genome.vcf.gz
>>>>
>>>>
>>>>
>>>>
>>>> <ftp://platgene:G3n3s4me@ussd-ftp.illumina.com/NA12877_S1.genome.vcf.gz>
>>>>
>>>>             took about ~4.2 hrs to write out, and is about 1.5x the
>>>> size
>>>>             of the
>>>>             files we are actually dealing with (~50M ranges vs our
>>>> ~30M).
>>>>
>>>>             Looking forward a new vastly improved writeVcf :).
>>>>
>>>>             ~G
>>>>
>>>>
>>>>             On Tue, Sep 2, 2014 at 1:53 PM, Michael Lawrence
>>>>             <lawrence.michael at gene.com
>>>>             <mailto:lawrence.michael at gene.com>
>>>>             <mailto:lawrence.michael at gene.__com
>>>>             <mailto:lawrence.michael at gene.com>>> wrote:
>>>>
>>>>                  Yes, it's very clear that the scaling is non-linear,
>>>>             and Gabe has
>>>>                  been experimenting with a chunk-wise + parallel
>>>> algorithm.
>>>>                  Unfortunately there is some frustrating overhead with
>>>> the
>>>>                  parallelism. But I'm glad Val is arriving at something
>>>>             quicker.
>>>>
>>>>                  Michael
>>>>
>>>>
>>>>                  On Tue, Sep 2, 2014 at 1:33 PM, Martin Morgan
>>>>             <mtmorgan at fhcrc.org <mailto:mtmorgan at fhcrc.org>
>>>>                  <mailto:mtmorgan at fhcrc.org
>>>>             <mailto:mtmorgan at fhcrc.org>>> wrote:
>>>>
>>>>                      On 08/27/2014 11:56 AM, Gabe Becker wrote:
>>>>
>>>>                          The profiling I attached in my previous email
>>>>             is for 24 geno
>>>>                          fields, as I said,
>>>>                          but our typical usecase involves only ~4-6
>>>>             fields, and is
>>>>                          faster but still on
>>>>                          the order of dozens of minutes.
>>>>
>>>>
>>>>                      I think Val is arriving at a (much) more efficient
>>>>                      implementation, but...
>>>>
>>>>                      I wanted to share my guess that the poor _scaling_
>>>>             is because
>>>>                      the garbage collector runs multiple times as the
>>>>             different
>>>>                      strings are pasted together, and has to traverse,
>>>>             in linear
>>>>                      time, increasing numbers of allocated SEXPs. So
>>>>             times scale
>>>>                      approximately quadratically with the number of
>>>> rows
>>>>             in the VCF
>>>>
>>>>                      An efficiency is to reduce the number of SEXPs in
>>>>             play by
>>>>                      writing out in chunks -- as each chunk is written,
>>>>             the SEXPs
>>>>                      become available for collection and are re-used.
>>>>             Here's my toy
>>>>                      example
>>>>
>>>>                      time.R
>>>>                      ======
>>>>                      splitIndices <- function (nx, ncl)
>>>>                      {
>>>>                           i <- seq_len(nx)
>>>>                           if (ncl == 0L)
>>>>                               list()
>>>>                           else if (ncl == 1L || nx == 1L)
>>>>                               list(i)
>>>>                           else {
>>>>                               fuzz <- min((nx - 1L)/1000, 0.4 * nx/ncl)
>>>>                               breaks <- seq(1 - fuzz, nx + fuzz, length
>>>>             = ncl + 1L)
>>>>                               structure(split(i, cut(i, breaks,
>>>>             labels=FALSE)), names
>>>>                      = NULL)
>>>>                           }
>>>>                      }
>>>>
>>>>                      x = as.character(seq_len(1e7)); y = sample(x)
>>>>                      if (!is.na <http://is.na>
>>>>             <http://is.na>(Sys.getenv("__SPLIT", NA))) {
>>>>                           idx <- splitIndices(length(x), 20)
>>>>                           system.time(for (i in idx) paste(x[i], y[i],
>>>>             sep=":"))
>>>>                      } else {
>>>>                           system.time(paste(x, y, sep=":"))
>>>>                      }
>>>>
>>>>
>>>>                      running under R-devel with $ SPLIT=TRUE R
>>>> --no-save
>>>>             --quiet -f
>>>>                      time.R the relevant time is
>>>>
>>>>                          user  system elapsed
>>>>                        15.320   0.064  15.381
>>>>
>>>>                      versus with $ R --no-save --quiet -f time.R it is
>>>>
>>>>                          user  system elapsed
>>>>                        95.360   0.164  95.511
>>>>
>>>>                      I think this is likely an overall strategy when
>>>>             dealing with
>>>>                      character data -- processing in independent chunks
>>>>             of moderate
>>>>                      (1M?) size (enabling as a consequence parallel
>>>>             evaluation in
>>>>                      modest memory) that are sufficient to benefit from
>>>>                      vectorization, but that do not entail
>>>> allocation of
>>>>             large
>>>>                      numbers of in-use SEXPs.
>>>>
>>>>                      Martin
>>>>
>>>>
>>>>                          Sorry for the confusion.
>>>>                          ~G
>>>>
>>>>
>>>>                          On Wed, Aug 27, 2014 at 11:45 AM, Gabe Becker
>>>>                          <beckerg4 at gene.com <mailto:beckerg4 at gene.com>
>>>>             <mailto:beckerg4 at gene.com <mailto:beckerg4 at gene.com>>
>>>>                          <mailto:beckerg4 at gene.com
>>>>             <mailto:beckerg4 at gene.com> <mailto:beckerg4 at gene.com
>>>>             <mailto:beckerg4 at gene.com>>>> wrote:
>>>>
>>>>                               Martin and Val.
>>>>
>>>>                               I re-ran writeVcf on our (G)VCF data
>>>>             (34790518 ranges,
>>>>                          24 geno fields) with
>>>>                               profiling enabled. The results of
>>>>             summaryRprof for that
>>>>                          run are attached,
>>>>                               though for a variety of reasons they are
>>>>             pretty
>>>>             misleading.
>>>>
>>>>                               It took over an hour to write
>>>>             (3700+seconds), so it's
>>>>                          definitely a
>>>>                               bottleneck when the data get very large,
>>>>             even if it
>>>>                          isn't for smaller data.
>>>>
>>>>                               Michael and I both think the culprit is
>>>>             all the pasting
>>>>                          and cbinding that is
>>>>                               going on, and more to the point, that
>>>>             memory for an
>>>>                          internal representation
>>>>                               to be written out is allocated at all.
>>>>             Streaming
>>>>                          across the object, looping
>>>>                               by rows and writing directly to file
>>>> (e.g.
>>>>             from C)
>>>>                          should be blisteringly
>>>>                               fast in comparison.
>>>>
>>>>                               ~G
>>>>
>>>>
>>>>                               On Tue, Aug 26, 2014 at 11:57 AM, Michael
>>>>             Lawrence
>>>>                          <michafla at gene.com <mailto:michafla at gene.com>
>>>>             <mailto:michafla at gene.com <mailto:michafla at gene.com>>
>>>>                               <mailto:michafla at gene.com
>>>>             <mailto:michafla at gene.com> <mailto:michafla at gene.com
>>>>             <mailto:michafla at gene.com>>>>
>>>>                          wrote:
>>>>
>>>>                                   Gabe is still testing/profiling, but
>>>>             we'll send
>>>>                          something randomized
>>>>                                   along eventually.
>>>>
>>>>
>>>>                                   On Tue, Aug 26, 2014 at 11:15 AM,
>>>>             Martin Morgan
>>>>                          <mtmorgan at fhcrc.org
>>>> <mailto:mtmorgan at fhcrc.org>
>>>>             <mailto:mtmorgan at fhcrc.org <mailto:mtmorgan at fhcrc.org>>
>>>>                                   <mailto:mtmorgan at fhcrc.org
>>>>             <mailto:mtmorgan at fhcrc.org>
>>>>                          <mailto:mtmorgan at fhcrc.org
>>>>             <mailto:mtmorgan at fhcrc.org>>>> wrote:
>>>>
>>>>                                       I didn't see in the original
>>>> thread a
>>>>                          reproducible (simulated, I
>>>>                                       guess) example, to be explicit
>>>>             about what the
>>>>                          problem is??
>>>>
>>>>                                       Martin
>>>>
>>>>
>>>>                                       On 08/26/2014 10:47 AM, Michael
>>>>             Lawrence wrote:
>>>>
>>>>                                           My understanding is that the
>>>> heap
>>>>                          optimization provided marginal
>>>>                                           gains, and
>>>>                                           that we need to think harder
>>>>             about how to
>>>>                          optimize the all of
>>>>                                           the string
>>>>                                           manipulation in writeVcf. We
>>>>             either need to
>>>>                          reduce it or reduce its
>>>>                                           overhead (i.e., the CHARSXP
>>>>             allocation).
>>>>                          Gabe is doing more tests.
>>>>
>>>>
>>>>                                           On Tue, Aug 26, 2014 at 9:43
>>>>             AM, Valerie
>>>>                          Obenchain
>>>>                                           <vobencha at fhcrc.org
>>>>             <mailto:vobencha at fhcrc.org>
>>>>                          <mailto:vobencha at fhcrc.org
>>>>             <mailto:vobencha at fhcrc.org>> <mailto:vobencha at fhcrc.org
>>>>             <mailto:vobencha at fhcrc.org>
>>>>                          <mailto:vobencha at fhcrc.org
>>>>             <mailto:vobencha at fhcrc.org>>>>
>>>>
>>>>                                           wrote:
>>>>
>>>>                                               Hi Gabe,
>>>>
>>>>                                               Martin responded, and so
>>>>             did Michael,
>>>>
>>>>
>>>>
>>>> https://stat.ethz.ch/______pipermail/bioc-devel/2014-______August/006082.html
>>>>
>>>>
>>>>
>>>>
>>>> <https://stat.ethz.ch/____pipermail/bioc-devel/2014-____August/006082.html>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> <https://stat.ethz.ch/____pipermail/bioc-devel/2014-____August/006082.html
>>>>
>>>>
>>>>
>>>>
>>>> <https://stat.ethz.ch/__pipermail/bioc-devel/2014-__August/006082.html>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> <https://stat.ethz.ch/____pipermail/bioc-devel/2014-____August/006082.html
>>>>
>>>>
>>>>
>>>>
>>>> <https://stat.ethz.ch/__pipermail/bioc-devel/2014-__August/006082.html>
>>>>
>>>>
>>>> <https://stat.ethz.ch/__pipermail/bioc-devel/2014-__August/006082.html
>>>>
>>>> <https://stat.ethz.ch/pipermail/bioc-devel/2014-August/006082.html>>>
>>>>
>>>>                                               It sounded like Michael
>>>>             was ok with
>>>>                          working with/around heap
>>>>                                               initialization.
>>>>
>>>>                                               Michael, is that right or
>>>>             should we
>>>>                          still consider this on
>>>>                                               the table?
>>>>
>>>>
>>>>                                               Val
>>>>
>>>>
>>>>                                               On 08/26/2014 09:34 AM,
>>>>             Gabe Becker
>>>>             wrote:
>>>>
>>>>                                                   Val,
>>>>
>>>>                                                   Has there been any
>>>>             movement on
>>>>                          this? This remains a
>>>>                                                   substantial
>>>>                                                   bottleneck for us
>>>> when
>>>>             writing very
>>>>                          large VCF files (e.g.
>>>>                                                   variants+genotypes
>>>> for
>>>>             whole genome
>>>>                          NGS samples).
>>>>
>>>>                                                   I was able to see a
>>>>             ~25% speedup
>>>>                          with 4 cores and  an
>>>>                                                   "optimal" speedup
>>>>                                                   of ~2x with 10-12
>>>>             cores for a VCF
>>>>                          with 500k rows  using
>>>>                                                   a very naive
>>>>                                                   parallelization
>>>>             strategy and no
>>>>                          other changes. I suspect
>>>>                                                   this could be
>>>>                                                   improved on quite a
>>>>             bit, or
>>>>                          possibly made irrelevant
>>>>                                                   with judicious use
>>>>                                                   of serial C code.
>>>>
>>>>                                                   Did you and Martin
>>>>             make any plans
>>>>                          regarding optimizing
>>>>                                                   writeVcf?
>>>>
>>>>                                                   Best
>>>>                                                   ~G
>>>>
>>>>
>>>>                                                   On Tue, Aug 5,
>>>> 2014 at
>>>>             2:33 PM,
>>>>                          Valerie Obenchain
>>>>                                                   <vobencha at fhcrc.org
>>>>             <mailto:vobencha at fhcrc.org>
>>>>                          <mailto:vobencha at fhcrc.org
>>>>             <mailto:vobencha at fhcrc.org>> <mailto:vobencha at fhcrc.org
>>>>             <mailto:vobencha at fhcrc.org>
>>>>                          <mailto:vobencha at fhcrc.org
>>>>             <mailto:vobencha at fhcrc.org>>>
>>>>
>>>>               <mailto:vobencha at fhcrc.org <mailto:vobencha at fhcrc.org>
>>>>                          <mailto:vobencha at fhcrc.org
>>>>             <mailto:vobencha at fhcrc.org>> <mailto:vobencha at fhcrc.org
>>>>             <mailto:vobencha at fhcrc.org>
>>>>                          <mailto:vobencha at fhcrc.org
>>>>             <mailto:vobencha at fhcrc.org>>>>>
>>>>
>>>>                                                   wrote:
>>>>
>>>>                                                         Hi Michael,
>>>>
>>>>                                                         I'm interested
>>>>             in working on
>>>>                          this. I'll discuss
>>>>                                                   with Martin next
>>>>                                                         week when we're
>>>>             both back in
>>>>                          the office.
>>>>
>>>>                                                         Val
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>                                                         On 08/05/14
>>>>             07:46, Michael
>>>>                          Lawrence wrote:
>>>>
>>>>                                                             Hi guys
>>>>             (Val, Martin,
>>>>             Herve):
>>>>
>>>>                                                             Anyone have
>>>>             an itch for
>>>>                          optimization? The
>>>>                                                   writeVcf function is
>>>>                                                             currently a
>>>>                                                             bottleneck
>>>>             in our WGS
>>>>                          genotyping pipeline. For
>>>>                                                   a typical 50
>>>>                                                             million row
>>>>                                                             gVCF, it
>>>> was
>>>>             taking 2.25
>>>>                          hours prior to
>>>>                                                   yesterday's
>>>> improvements
>>>>
>>>>               (pasteCollapseRows) that
>>>>                          brought it down to
>>>>                                                   about 1 hour, which
>>>>                                                             is still
>>>>                                                             too long by
>>>>             my standards
>>>>                          (> 0). Only takes 3
>>>>                                                   minutes to call the
>>>>                                                             genotypes
>>>>                                                             (and
>>>> associated
>>>>                          likelihoods etc) from the
>>>>                                                   variant calls (using
>>>>                                                             80 cores
>>>> and
>>>>                                                             450 GB RAM
>>>>             on one node),
>>>>                          so the output is an
>>>>                                                   issue. Profiling
>>>>                                                             suggests
>>>> that
>>>>                                                             the running
>>>>             time scales
>>>>                          non-linearly in the
>>>>                                                   number of rows.
>>>>
>>>>                                                             Digging a
>>>>             little deeper,
>>>>                          it seems to be
>>>>                                                   something with R's
>>>>
>>>> string/memory
>>>>                                                             allocation.
>>>>             Below,
>>>>                          pasting 1 million strings
>>>>                                                   takes 6 seconds, but
>>>>                                                   10
>>>>                                                             million
>>>>             strings takes
>>>>                          over 2 minutes. It gets
>>>>                                                   way worse with 50
>>>>                                                             million. I
>>>>                                                             suspect it
>>>>             has something
>>>>                          to do with R's string
>>>>                                                   hash table.
>>>>
>>>>
>>>> set.seed(1000)
>>>>                                                             end <-
>>>>             sample(1e8, 1e6)
>>>>
>>>>               system.time(paste0("END",
>>>>                          "=", end))
>>>>                                                                  user
>>>>             system elapsed
>>>>                                                                 6.396
>>>>               0.028   6.420
>>>>
>>>>                                                             end <-
>>>>             sample(1e8, 1e7)
>>>>
>>>>               system.time(paste0("END",
>>>>                          "=", end))
>>>>                                                                  user
>>>>             system elapsed
>>>>                                                             134.714
>>>>               0.352 134.978
>>>>
>>>>                                                             Indeed,
>>>> even
>>>>             this takes a
>>>>                          long time (in a
>>>>                                                   fresh session):
>>>>
>>>>
>>>> set.seed(1000)
>>>>                                                             end <-
>>>>             sample(1e8, 1e6)
>>>>                                                             end <-
>>>>             sample(1e8, 1e7)
>>>>
>>>>                          system.time(as.character(end))
>>>>                                                                  user
>>>>             system elapsed
>>>>                                                                57.224
>>>>               0.156  57.366
>>>>
>>>>                                                             But running
>>>>             it a second
>>>>                          time is faster (about
>>>>                                                   what one would
>>>>                                                             expect?):
>>>>
>>>>
>>>>               system.time(levels <-
>>>>                          as.character(end))
>>>>                                                                  user
>>>>             system elapsed
>>>>                                                                23.582
>>>>               0.021  23.589
>>>>
>>>>                                                             I did some
>>>>             simple
>>>>                          profiling of R to find that
>>>>                                                   the resizing of
>>>>                                                             the string
>>>>                                                             hash table
>>>>             is not a
>>>>                          significant component of
>>>>                                                   the time. So maybe
>>>>                                                             something
>>>>                                                             to do with
>>>>             the R heap/gc?
>>>>                          No time right now to
>>>>                                                   go deeper. But I
>>>>                                                             know Martin
>>>>                                                             likes this
>>>>             sort of
>>>>             thing ;)
>>>>
>>>>                                                             Michael
>>>>
>>>>
>>>>             [[alternative
>>>>                          HTML version deleted]]
>>>>
>>>>
>>>>
>>>>
>>>>             _______________________________________________________
>>>>             Bioc-devel at r-project.org <mailto:Bioc-devel at r-project.org>
>>>>             <mailto:Bioc-devel at r-project.__org
>>>>             <mailto:Bioc-devel at r-project.org>>
>>>>
>>>>               <mailto:Bioc-devel at r-project.
>>>>             <mailto:Bioc-devel at r-project.>____org
>>>>                          <mailto:Bioc-devel at r-project.__org
>>>>             <mailto:Bioc-devel at r-project.org>>>
>>>>
>>>>               <mailto:Bioc-devel at r-project
>>>> <mailto:Bioc-devel at r-project>.
>>>>                          <mailto:Bioc-devel at r-project
>>>>             <mailto:Bioc-devel at r-project>.>______org
>>>>
>>>>               <mailto:Bioc-devel at r-project.
>>>>             <mailto:Bioc-devel at r-project.>____org
>>>>                          <mailto:Bioc-devel at r-project.__org
>>>>             <mailto:Bioc-devel at r-project.org>>>>
>>>>                                                             mailing
>>>> list
>>>>
>>>>             https://stat.ethz.ch/mailman/________listinfo/bioc-devel
>>>>             <https://stat.ethz.ch/mailman/______listinfo/bioc-devel>
>>>>
>>>>             <https://stat.ethz.ch/mailman/______listinfo/bioc-devel
>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel>>
>>>>
>>>>
>>>>             <https://stat.ethz.ch/mailman/______listinfo/bioc-devel
>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel>
>>>>
>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel
>>>>             <https://stat.ethz.ch/mailman/__listinfo/bioc-devel>>>
>>>>
>>>>
>>>>
>>>>             <https://stat.ethz.ch/mailman/______listinfo/bioc-devel
>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel>
>>>>
>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel
>>>>             <https://stat.ethz.ch/mailman/__listinfo/bioc-devel>>
>>>>
>>>>
>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel
>>>>             <https://stat.ethz.ch/mailman/__listinfo/bioc-devel>
>>>>
>>>>             <https://stat.ethz.ch/mailman/__listinfo/bioc-devel
>>>>             <https://stat.ethz.ch/mailman/listinfo/bioc-devel>>>>
>>>>
>>>>
>>>>
>>>>
>>>>             _______________________________________________________
>>>>             Bioc-devel at r-project.org <mailto:Bioc-devel at r-project.org>
>>>>             <mailto:Bioc-devel at r-project.__org
>>>>             <mailto:Bioc-devel at r-project.org>>
>>>>
>>>>               <mailto:Bioc-devel at r-project.
>>>>             <mailto:Bioc-devel at r-project.>____org
>>>>                          <mailto:Bioc-devel at r-project.__org
>>>>             <mailto:Bioc-devel at r-project.org>>>
>>>>
>>>>               <mailto:Bioc-devel at r-project
>>>> <mailto:Bioc-devel at r-project>.
>>>>                          <mailto:Bioc-devel at r-project
>>>>             <mailto:Bioc-devel at r-project>.>______org
>>>>
>>>>               <mailto:Bioc-devel at r-project.
>>>>             <mailto:Bioc-devel at r-project.>____org
>>>>                          <mailto:Bioc-devel at r-project.__org
>>>>             <mailto:Bioc-devel at r-project.org>>>> mailing
>>>>                                                   list
>>>>             https://stat.ethz.ch/mailman/________listinfo/bioc-devel
>>>>             <https://stat.ethz.ch/mailman/______listinfo/bioc-devel>
>>>>
>>>>             <https://stat.ethz.ch/mailman/______listinfo/bioc-devel
>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel>>
>>>>
>>>>
>>>>             <https://stat.ethz.ch/mailman/______listinfo/bioc-devel
>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel>
>>>>
>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel
>>>>             <https://stat.ethz.ch/mailman/__listinfo/bioc-devel>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>             <https://stat.ethz.ch/mailman/______listinfo/bioc-devel
>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel>
>>>>
>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel
>>>>             <https://stat.ethz.ch/mailman/__listinfo/bioc-devel>>
>>>>
>>>>
>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel
>>>>             <https://stat.ethz.ch/mailman/__listinfo/bioc-devel>
>>>>
>>>>             <https://stat.ethz.ch/mailman/__listinfo/bioc-devel
>>>>             <https://stat.ethz.ch/mailman/listinfo/bioc-devel>>>>
>>>>
>>>>
>>>>
>>>>
>>>>                                                   --
>>>>                                                   Computational
>>>> Biologist
>>>>                                                   Genentech Research
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>                                                    [[alternative HTML
>>>>             version
>>>>             deleted]]
>>>>
>>>>
>>>>
>>>>             _____________________________________________________
>>>>             Bioc-devel at r-project.org <mailto:Bioc-devel at r-project.org>
>>>>             <mailto:Bioc-devel at r-project.__org
>>>>             <mailto:Bioc-devel at r-project.org>>
>>>>                          <mailto:Bioc-devel at r-project.
>>>>             <mailto:Bioc-devel at r-project.>____org
>>>>                          <mailto:Bioc-devel at r-project.__org
>>>>             <mailto:Bioc-devel at r-project.org>>>
>>>>                                           mailing list
>>>>             https://stat.ethz.ch/mailman/______listinfo/bioc-devel
>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel>
>>>>
>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel
>>>>             <https://stat.ethz.ch/mailman/__listinfo/bioc-devel>>
>>>>
>>>>
>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel
>>>>             <https://stat.ethz.ch/mailman/__listinfo/bioc-devel>
>>>>
>>>>             <https://stat.ethz.ch/mailman/__listinfo/bioc-devel
>>>>             <https://stat.ethz.ch/mailman/listinfo/bioc-devel>>>
>>>>
>>>>
>>>>
>>>>                                       --
>>>>                                       Computational Biology / Fred
>>>>             Hutchinson Cancer
>>>>                          Research Center
>>>>                                       1100 Fairview Ave. N.
>>>>                                       PO Box 19024 Seattle, WA 98109
>>>>
>>>>                                       Location: Arnold Building M1 B861
>>>>                                       Phone: (206) 667-2793
>>>>             <tel:%28206%29%20667-2793>
>>>>                          <tel:%28206%29%20667-2793>
>>>>             <tel:%28206%29%20667-2793>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>                               --
>>>>                               Computational Biologist
>>>>                               Genentech Research
>>>>
>>>>
>>>>
>>>>
>>>>                          --
>>>>                          Computational Biologist
>>>>                          Genentech Research
>>>>
>>>>
>>>>
>>>>                      --
>>>>                      Computational Biology / Fred Hutchinson Cancer
>>>>             Research Center
>>>>                      1100 Fairview Ave. N.
>>>>                      PO Box 19024 Seattle, WA 98109
>>>>
>>>>                      Location: Arnold Building M1 B861
>>>>                      Phone: (206) 667-2793 <tel:%28206%29%20667-2793>
>>>>             <tel:%28206%29%20667-2793>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>             --
>>>>             Computational Biologist
>>>>             Genentech Research
>>>>
>>>>
>>>>         _________________________________________________
>>>>         Bioc-devel at r-project.org <mailto:Bioc-devel at r-project.org>
>>>>         mailing list
>>>>         https://stat.ethz.ch/mailman/__listinfo/bioc-devel
>>>>         <https://stat.ethz.ch/mailman/listinfo/bioc-devel>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Computational Biologist
>>>> Genentech Research
>>>
>>> _______________________________________________
>>> Bioc-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>> _______________________________________________
>> Bioc-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>

-- 
Valerie Obenchain
Program in Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, Seattle, WA 98109

Email: vobencha at fhcrc.org
Phone: (206) 667-3158