[Bioc-devel] writeVcf performance

Wed Sep 10 02:03:04 CEST 2014

Ah, ok. I should have 'svn up' and re-tried 'R CMD check' before
reporting this. Thanks and sorry for the noise.

H.

On 09/09/2014 03:09 PM, Valerie Obenchain wrote:
> Hi Herve,
>
> This unit test passes in VA 1.11.30 (the current version in svn). It was
> related to writeVcf(), not the IRanges/S4Vector stuff. My fault, not yours.
>
> Val
>
> On 09/09/2014 02:47 PM, Hervé Pagès wrote:
>> Hi Val,
>>
>> On 09/09/2014 02:12 PM, Valerie Obenchain wrote:
>>> Writing 'list' data has been fixed in 1.11.30. fyi, Herve is in the
>>> process of moving SimpleList and DataFrame from IRanges to S4Vectors;
>>> finished up today I think.
>>
>> I fixed VariantAnnotation's NAMESPACE this morning but 'R CMD check'
>> failed for me because of an unit test error in test_VRanges_vcf().
>> Here is how to quickly reproduce:
>>
>>    library(VariantAnnotation)
>>    library(RUnit)
>>
>> source("path/to/VariantAnnotation/inst/unitTests/test_VRanges-class.R")
>>
>>    dest <- tempfile()
>>    vr <- make_TARGET_VRanges_vcf()
>>    writeVcf(vr, dest)
>>    vcf <- readVcf(dest, genome = "hg19")
>>    perm <- c(1, 7, 8, 4, 2, 10)
>>    vcf.vr <- as(vcf, "VRanges")[perm]
>>    genome(vr) <- "hg19"
>>    checkIdenticalVCF(vr, vcf.vr)  # Error in checkIdentical(orig, vcf) :
>> FALSE
>>
>> Hard for me to tell whether this is related to DataFrame moving from
>> IRanges to S4Vectors or to a regression in writeVcf(). Do you think
>> you can have a look? Thanks for the help and sorry for the trouble.
>>
>> H.
>>
>>> Anyhow, if you get VariantAnnotation from svn
>>> you'll need to update S4Vectors, IRanges and GenomicRanges (and maybe
>>> rtracklayer).
>>>
>>> I'm working on the 'chunking' approach next. It looks like we can still
>>> gain from adding that.
>>>
>>> Valerie
>>>
>>> On 09/08/2014 12:00 PM, Valerie Obenchain wrote:
>>>> fyi Martin found a bug with the treatment of list data (ie, Number =
>>>> '.') in the header. Working on a fix ...
>>>>
>>>> Val
>>>>
>>>>
>>>> On 09/08/2014 08:43 AM, Gabe Becker wrote:
>>>>> Val,
>>>>>
>>>>> That is great. I'll check this out and test it on our end.
>>>>>
>>>>> ~G
>>>>>
>>>>> On Mon, Sep 8, 2014 at 8:38 AM, Valerie Obenchain <vobencha at fhcrc.org
>>>>> <mailto:vobencha at fhcrc.org>> wrote:
>>>>>
>>>>>     The new writeVcf code is in 1.11.28.
>>>>>
>>>>>     Using the illumina file you suggested, geno fields only, writing
>>>>> now
>>>>>     takes about 17 minutes.
>>>>>
>>>>>      > hdr
>>>>>     class: VCFHeader
>>>>>     samples(1): NA12877
>>>>>     meta(6): fileformat ApplyRecalibration ... reference source
>>>>>     fixed(1): FILTER
>>>>>     info(22): AC AF ... culprit set
>>>>>     geno(8): GT GQX ... PL VF
>>>>>
>>>>>      > param = ScanVcfParam(info=NA)
>>>>>      > vcf = readVcf(fl, "", param=param)
>>>>>      > dim(vcf)
>>>>>     [1] 51612762        1
>>>>>
>>>>>      > system.time(writeVcf(vcf, "out.vcf"))
>>>>>          user   system  elapsed
>>>>>       971.032    6.568 1004.593
>>>>>
>>>>>     In 1.11.28, parsing of geno data was moved to C. If this didn't
>>>>>     speed things up enough we were planning to implement 'chunking'
>>>>>     through the VCF and/or move the parsing of info to C, however, it
>>>>>     looks like geno was the bottleneck.
>>>>>
>>>>>     I've tested a number of samples/fields combinations in files
>>>>> with >=
>>>>>     .5 million rows and the improvement over writeVcf() in release is
>>>>> ~ 90%.
>>>>>
>>>>>     Valerie
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>     On 09/04/14 15:28, Valerie Obenchain wrote:
>>>>>
>>>>>         Thanks Gabe. I should have something for you on Monday.
>>>>>
>>>>>         Val
>>>>>
>>>>>
>>>>>         On 09/04/2014 01:56 PM, Gabe Becker wrote:
>>>>>
>>>>>             Val and Martin,
>>>>>
>>>>>             Apologies for the delay.
>>>>>
>>>>>             We realized that the Illumina platinum genome vcf files
>>>>> make
>>>>>             a good test
>>>>>             case, assuming you strip out all the info (info=NA when
>>>>>             reading it into
>>>>>             R) stuff.
>>>>>
>>>>>
>>>>> ftp://platgene:G3n3s4me@ussd-__ftp.illumina.com/NA12877_S1.__genome.vcf.gz
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> <ftp://platgene:G3n3s4me@ussd-ftp.illumina.com/NA12877_S1.genome.vcf.gz>
>>>>>
>>>>>
>>>>>             took about ~4.2 hrs to write out, and is about 1.5x the
>>>>> size
>>>>>             of the
>>>>>             files we are actually dealing with (~50M ranges vs our
>>>>> ~30M).
>>>>>
>>>>>             Looking forward a new vastly improved writeVcf :).
>>>>>
>>>>>             ~G
>>>>>
>>>>>
>>>>>             On Tue, Sep 2, 2014 at 1:53 PM, Michael Lawrence
>>>>>             <lawrence.michael at gene.com
>>>>>             <mailto:lawrence.michael at gene.com>
>>>>>             <mailto:lawrence.michael at gene.__com
>>>>>             <mailto:lawrence.michael at gene.com>>> wrote:
>>>>>
>>>>>                  Yes, it's very clear that the scaling is non-linear,
>>>>>             and Gabe has
>>>>>                  been experimenting with a chunk-wise + parallel
>>>>> algorithm.
>>>>>                  Unfortunately there is some frustrating overhead with
>>>>> the
>>>>>                  parallelism. But I'm glad Val is arriving at
>>>>> something
>>>>>             quicker.
>>>>>
>>>>>                  Michael
>>>>>
>>>>>
>>>>>                  On Tue, Sep 2, 2014 at 1:33 PM, Martin Morgan
>>>>>             <mtmorgan at fhcrc.org <mailto:mtmorgan at fhcrc.org>
>>>>>                  <mailto:mtmorgan at fhcrc.org
>>>>>             <mailto:mtmorgan at fhcrc.org>>> wrote:
>>>>>
>>>>>                      On 08/27/2014 11:56 AM, Gabe Becker wrote:
>>>>>
>>>>>                          The profiling I attached in my previous email
>>>>>             is for 24 geno
>>>>>                          fields, as I said,
>>>>>                          but our typical usecase involves only ~4-6
>>>>>             fields, and is
>>>>>                          faster but still on
>>>>>                          the order of dozens of minutes.
>>>>>
>>>>>
>>>>>                      I think Val is arriving at a (much) more
>>>>> efficient
>>>>>                      implementation, but...
>>>>>
>>>>>                      I wanted to share my guess that the poor
>>>>> _scaling_
>>>>>             is because
>>>>>                      the garbage collector runs multiple times as the
>>>>>             different
>>>>>                      strings are pasted together, and has to traverse,
>>>>>             in linear
>>>>>                      time, increasing numbers of allocated SEXPs. So
>>>>>             times scale
>>>>>                      approximately quadratically with the number of
>>>>> rows
>>>>>             in the VCF
>>>>>
>>>>>                      An efficiency is to reduce the number of SEXPs in
>>>>>             play by
>>>>>                      writing out in chunks -- as each chunk is
>>>>> written,
>>>>>             the SEXPs
>>>>>                      become available for collection and are re-used.
>>>>>             Here's my toy
>>>>>                      example
>>>>>
>>>>>                      time.R
>>>>>                      ======
>>>>>                      splitIndices <- function (nx, ncl)
>>>>>                      {
>>>>>                           i <- seq_len(nx)
>>>>>                           if (ncl == 0L)
>>>>>                               list()
>>>>>                           else if (ncl == 1L || nx == 1L)
>>>>>                               list(i)
>>>>>                           else {
>>>>>                               fuzz <- min((nx - 1L)/1000, 0.4 *
>>>>> nx/ncl)
>>>>>                               breaks <- seq(1 - fuzz, nx + fuzz,
>>>>> length
>>>>>             = ncl + 1L)
>>>>>                               structure(split(i, cut(i, breaks,
>>>>>             labels=FALSE)), names
>>>>>                      = NULL)
>>>>>                           }
>>>>>                      }
>>>>>
>>>>>                      x = as.character(seq_len(1e7)); y = sample(x)
>>>>>                      if (!is.na <http://is.na>
>>>>>             <http://is.na>(Sys.getenv("__SPLIT", NA))) {
>>>>>                           idx <- splitIndices(length(x), 20)
>>>>>                           system.time(for (i in idx) paste(x[i], y[i],
>>>>>             sep=":"))
>>>>>                      } else {
>>>>>                           system.time(paste(x, y, sep=":"))
>>>>>                      }
>>>>>
>>>>>
>>>>>                      running under R-devel with $ SPLIT=TRUE R
>>>>> --no-save
>>>>>             --quiet -f
>>>>>                      time.R the relevant time is
>>>>>
>>>>>                          user  system elapsed
>>>>>                        15.320   0.064  15.381
>>>>>
>>>>>                      versus with $ R --no-save --quiet -f time.R it is
>>>>>
>>>>>                          user  system elapsed
>>>>>                        95.360   0.164  95.511
>>>>>
>>>>>                      I think this is likely an overall strategy when
>>>>>             dealing with
>>>>>                      character data -- processing in independent
>>>>> chunks
>>>>>             of moderate
>>>>>                      (1M?) size (enabling as a consequence parallel
>>>>>             evaluation in
>>>>>                      modest memory) that are sufficient to benefit
>>>>> from
>>>>>                      vectorization, but that do not entail
>>>>> allocation of
>>>>>             large
>>>>>                      numbers of in-use SEXPs.
>>>>>
>>>>>                      Martin
>>>>>
>>>>>
>>>>>                          Sorry for the confusion.
>>>>>                          ~G
>>>>>
>>>>>
>>>>>                          On Wed, Aug 27, 2014 at 11:45 AM, Gabe Becker
>>>>>                          <beckerg4 at gene.com <mailto:beckerg4 at gene.com>
>>>>>             <mailto:beckerg4 at gene.com <mailto:beckerg4 at gene.com>>
>>>>>                          <mailto:beckerg4 at gene.com
>>>>>             <mailto:beckerg4 at gene.com> <mailto:beckerg4 at gene.com
>>>>>             <mailto:beckerg4 at gene.com>>>> wrote:
>>>>>
>>>>>                               Martin and Val.
>>>>>
>>>>>                               I re-ran writeVcf on our (G)VCF data
>>>>>             (34790518 ranges,
>>>>>                          24 geno fields) with
>>>>>                               profiling enabled. The results of
>>>>>             summaryRprof for that
>>>>>                          run are attached,
>>>>>                               though for a variety of reasons they are
>>>>>             pretty
>>>>>             misleading.
>>>>>
>>>>>                               It took over an hour to write
>>>>>             (3700+seconds), so it's
>>>>>                          definitely a
>>>>>                               bottleneck when the data get very large,
>>>>>             even if it
>>>>>                          isn't for smaller data.
>>>>>
>>>>>                               Michael and I both think the culprit is
>>>>>             all the pasting
>>>>>                          and cbinding that is
>>>>>                               going on, and more to the point, that
>>>>>             memory for an
>>>>>                          internal representation
>>>>>                               to be written out is allocated at all.
>>>>>             Streaming
>>>>>                          across the object, looping
>>>>>                               by rows and writing directly to file
>>>>> (e.g.
>>>>>             from C)
>>>>>                          should be blisteringly
>>>>>                               fast in comparison.
>>>>>
>>>>>                               ~G
>>>>>
>>>>>
>>>>>                               On Tue, Aug 26, 2014 at 11:57 AM,
>>>>> Michael
>>>>>             Lawrence
>>>>>                          <michafla at gene.com <mailto:michafla at gene.com>
>>>>>             <mailto:michafla at gene.com <mailto:michafla at gene.com>>
>>>>>                               <mailto:michafla at gene.com
>>>>>             <mailto:michafla at gene.com> <mailto:michafla at gene.com
>>>>>             <mailto:michafla at gene.com>>>>
>>>>>                          wrote:
>>>>>
>>>>>                                   Gabe is still testing/profiling, but
>>>>>             we'll send
>>>>>                          something randomized
>>>>>                                   along eventually.
>>>>>
>>>>>
>>>>>                                   On Tue, Aug 26, 2014 at 11:15 AM,
>>>>>             Martin Morgan
>>>>>                          <mtmorgan at fhcrc.org
>>>>> <mailto:mtmorgan at fhcrc.org>
>>>>>             <mailto:mtmorgan at fhcrc.org <mailto:mtmorgan at fhcrc.org>>
>>>>>                                   <mailto:mtmorgan at fhcrc.org
>>>>>             <mailto:mtmorgan at fhcrc.org>
>>>>>                          <mailto:mtmorgan at fhcrc.org
>>>>>             <mailto:mtmorgan at fhcrc.org>>>> wrote:
>>>>>
>>>>>                                       I didn't see in the original
>>>>> thread a
>>>>>                          reproducible (simulated, I
>>>>>                                       guess) example, to be explicit
>>>>>             about what the
>>>>>                          problem is??
>>>>>
>>>>>                                       Martin
>>>>>
>>>>>
>>>>>                                       On 08/26/2014 10:47 AM, Michael
>>>>>             Lawrence wrote:
>>>>>
>>>>>                                           My understanding is that the
>>>>> heap
>>>>>                          optimization provided marginal
>>>>>                                           gains, and
>>>>>                                           that we need to think harder
>>>>>             about how to
>>>>>                          optimize the all of
>>>>>                                           the string
>>>>>                                           manipulation in writeVcf. We
>>>>>             either need to
>>>>>                          reduce it or reduce its
>>>>>                                           overhead (i.e., the CHARSXP
>>>>>             allocation).
>>>>>                          Gabe is doing more tests.
>>>>>
>>>>>
>>>>>                                           On Tue, Aug 26, 2014 at 9:43
>>>>>             AM, Valerie
>>>>>                          Obenchain
>>>>>                                           <vobencha at fhcrc.org
>>>>>             <mailto:vobencha at fhcrc.org>
>>>>>                          <mailto:vobencha at fhcrc.org
>>>>>             <mailto:vobencha at fhcrc.org>> <mailto:vobencha at fhcrc.org
>>>>>             <mailto:vobencha at fhcrc.org>
>>>>>                          <mailto:vobencha at fhcrc.org
>>>>>             <mailto:vobencha at fhcrc.org>>>>
>>>>>
>>>>>                                           wrote:
>>>>>
>>>>>                                               Hi Gabe,
>>>>>
>>>>>                                               Martin responded, and so
>>>>>             did Michael,
>>>>>
>>>>>
>>>>>
>>>>> https://stat.ethz.ch/______pipermail/bioc-devel/2014-______August/006082.html
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> <https://stat.ethz.ch/____pipermail/bioc-devel/2014-____August/006082.html>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> <https://stat.ethz.ch/____pipermail/bioc-devel/2014-____August/006082.html
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> <https://stat.ethz.ch/__pipermail/bioc-devel/2014-__August/006082.html>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> <https://stat.ethz.ch/____pipermail/bioc-devel/2014-____August/006082.html
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> <https://stat.ethz.ch/__pipermail/bioc-devel/2014-__August/006082.html>
>>>>>
>>>>>
>>>>>
>>>>> <https://stat.ethz.ch/__pipermail/bioc-devel/2014-__August/006082.html
>>>>>
>>>>> <https://stat.ethz.ch/pipermail/bioc-devel/2014-August/006082.html>>>
>>>>>
>>>>>                                               It sounded like Michael
>>>>>             was ok with
>>>>>                          working with/around heap
>>>>>                                               initialization.
>>>>>
>>>>>                                               Michael, is that
>>>>> right or
>>>>>             should we
>>>>>                          still consider this on
>>>>>                                               the table?
>>>>>
>>>>>
>>>>>                                               Val
>>>>>
>>>>>
>>>>>                                               On 08/26/2014 09:34 AM,
>>>>>             Gabe Becker
>>>>>             wrote:
>>>>>
>>>>>                                                   Val,
>>>>>
>>>>>                                                   Has there been any
>>>>>             movement on
>>>>>                          this? This remains a
>>>>>                                                   substantial
>>>>>                                                   bottleneck for us
>>>>> when
>>>>>             writing very
>>>>>                          large VCF files (e.g.
>>>>>                                                   variants+genotypes
>>>>> for
>>>>>             whole genome
>>>>>                          NGS samples).
>>>>>
>>>>>                                                   I was able to see a
>>>>>             ~25% speedup
>>>>>                          with 4 cores and  an
>>>>>                                                   "optimal" speedup
>>>>>                                                   of ~2x with 10-12
>>>>>             cores for a VCF
>>>>>                          with 500k rows  using
>>>>>                                                   a very naive
>>>>>                                                   parallelization
>>>>>             strategy and no
>>>>>                          other changes. I suspect
>>>>>                                                   this could be
>>>>>                                                   improved on quite a
>>>>>             bit, or
>>>>>                          possibly made irrelevant
>>>>>                                                   with judicious use
>>>>>                                                   of serial C code.
>>>>>
>>>>>                                                   Did you and Martin
>>>>>             make any plans
>>>>>                          regarding optimizing
>>>>>                                                   writeVcf?
>>>>>
>>>>>                                                   Best
>>>>>                                                   ~G
>>>>>
>>>>>
>>>>>                                                   On Tue, Aug 5,
>>>>> 2014 at
>>>>>             2:33 PM,
>>>>>                          Valerie Obenchain
>>>>>                                                   <vobencha at fhcrc.org
>>>>>             <mailto:vobencha at fhcrc.org>
>>>>>                          <mailto:vobencha at fhcrc.org
>>>>>             <mailto:vobencha at fhcrc.org>> <mailto:vobencha at fhcrc.org
>>>>>             <mailto:vobencha at fhcrc.org>
>>>>>                          <mailto:vobencha at fhcrc.org
>>>>>             <mailto:vobencha at fhcrc.org>>>
>>>>>
>>>>>               <mailto:vobencha at fhcrc.org <mailto:vobencha at fhcrc.org>
>>>>>                          <mailto:vobencha at fhcrc.org
>>>>>             <mailto:vobencha at fhcrc.org>> <mailto:vobencha at fhcrc.org
>>>>>             <mailto:vobencha at fhcrc.org>
>>>>>                          <mailto:vobencha at fhcrc.org
>>>>>             <mailto:vobencha at fhcrc.org>>>>>
>>>>>
>>>>>                                                   wrote:
>>>>>
>>>>>                                                         Hi Michael,
>>>>>
>>>>>                                                         I'm interested
>>>>>             in working on
>>>>>                          this. I'll discuss
>>>>>                                                   with Martin next
>>>>>                                                         week when
>>>>> we're
>>>>>             both back in
>>>>>                          the office.
>>>>>
>>>>>                                                         Val
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>                                                         On 08/05/14
>>>>>             07:46, Michael
>>>>>                          Lawrence wrote:
>>>>>
>>>>>                                                             Hi guys
>>>>>             (Val, Martin,
>>>>>             Herve):
>>>>>
>>>>>                                                             Anyone
>>>>> have
>>>>>             an itch for
>>>>>                          optimization? The
>>>>>                                                   writeVcf function is
>>>>>
>>>>> currently a
>>>>>                                                             bottleneck
>>>>>             in our WGS
>>>>>                          genotyping pipeline. For
>>>>>                                                   a typical 50
>>>>>                                                             million
>>>>> row
>>>>>                                                             gVCF, it
>>>>> was
>>>>>             taking 2.25
>>>>>                          hours prior to
>>>>>                                                   yesterday's
>>>>> improvements
>>>>>
>>>>>               (pasteCollapseRows) that
>>>>>                          brought it down to
>>>>>                                                   about 1 hour, which
>>>>>                                                             is still
>>>>>                                                             too
>>>>> long by
>>>>>             my standards
>>>>>                          (> 0). Only takes 3
>>>>>                                                   minutes to call the
>>>>>                                                             genotypes
>>>>>                                                             (and
>>>>> associated
>>>>>                          likelihoods etc) from the
>>>>>                                                   variant calls (using
>>>>>                                                             80 cores
>>>>> and
>>>>>                                                             450 GB RAM
>>>>>             on one node),
>>>>>                          so the output is an
>>>>>                                                   issue. Profiling
>>>>>                                                             suggests
>>>>> that
>>>>>                                                             the
>>>>> running
>>>>>             time scales
>>>>>                          non-linearly in the
>>>>>                                                   number of rows.
>>>>>
>>>>>                                                             Digging a
>>>>>             little deeper,
>>>>>                          it seems to be
>>>>>                                                   something with R's
>>>>>
>>>>> string/memory
>>>>>
>>>>> allocation.
>>>>>             Below,
>>>>>                          pasting 1 million strings
>>>>>                                                   takes 6 seconds, but
>>>>>                                                   10
>>>>>                                                             million
>>>>>             strings takes
>>>>>                          over 2 minutes. It gets
>>>>>                                                   way worse with 50
>>>>>                                                             million. I
>>>>>                                                             suspect it
>>>>>             has something
>>>>>                          to do with R's string
>>>>>                                                   hash table.
>>>>>
>>>>>
>>>>> set.seed(1000)
>>>>>                                                             end <-
>>>>>             sample(1e8, 1e6)
>>>>>
>>>>>               system.time(paste0("END",
>>>>>                          "=", end))
>>>>>                                                                  user
>>>>>             system elapsed
>>>>>                                                                 6.396
>>>>>               0.028   6.420
>>>>>
>>>>>                                                             end <-
>>>>>             sample(1e8, 1e7)
>>>>>
>>>>>               system.time(paste0("END",
>>>>>                          "=", end))
>>>>>                                                                  user
>>>>>             system elapsed
>>>>>                                                             134.714
>>>>>               0.352 134.978
>>>>>
>>>>>                                                             Indeed,
>>>>> even
>>>>>             this takes a
>>>>>                          long time (in a
>>>>>                                                   fresh session):
>>>>>
>>>>>
>>>>> set.seed(1000)
>>>>>                                                             end <-
>>>>>             sample(1e8, 1e6)
>>>>>                                                             end <-
>>>>>             sample(1e8, 1e7)
>>>>>
>>>>>                          system.time(as.character(end))
>>>>>                                                                  user
>>>>>             system elapsed
>>>>>                                                                57.224
>>>>>               0.156  57.366
>>>>>
>>>>>                                                             But
>>>>> running
>>>>>             it a second
>>>>>                          time is faster (about
>>>>>                                                   what one would
>>>>>                                                             expect?):
>>>>>
>>>>>
>>>>>               system.time(levels <-
>>>>>                          as.character(end))
>>>>>                                                                  user
>>>>>             system elapsed
>>>>>                                                                23.582
>>>>>               0.021  23.589
>>>>>
>>>>>                                                             I did some
>>>>>             simple
>>>>>                          profiling of R to find that
>>>>>                                                   the resizing of
>>>>>                                                             the string
>>>>>                                                             hash table
>>>>>             is not a
>>>>>                          significant component of
>>>>>                                                   the time. So maybe
>>>>>                                                             something
>>>>>                                                             to do with
>>>>>             the R heap/gc?
>>>>>                          No time right now to
>>>>>                                                   go deeper. But I
>>>>>                                                             know
>>>>> Martin
>>>>>                                                             likes this
>>>>>             sort of
>>>>>             thing ;)
>>>>>
>>>>>                                                             Michael
>>>>>
>>>>>
>>>>>             [[alternative
>>>>>                          HTML version deleted]]
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>             _______________________________________________________
>>>>>             Bioc-devel at r-project.org <mailto:Bioc-devel at r-project.org>
>>>>>             <mailto:Bioc-devel at r-project.__org
>>>>>             <mailto:Bioc-devel at r-project.org>>
>>>>>
>>>>>               <mailto:Bioc-devel at r-project.
>>>>>             <mailto:Bioc-devel at r-project.>____org
>>>>>                          <mailto:Bioc-devel at r-project.__org
>>>>>             <mailto:Bioc-devel at r-project.org>>>
>>>>>
>>>>>               <mailto:Bioc-devel at r-project
>>>>> <mailto:Bioc-devel at r-project>.
>>>>>                          <mailto:Bioc-devel at r-project
>>>>>             <mailto:Bioc-devel at r-project>.>______org
>>>>>
>>>>>               <mailto:Bioc-devel at r-project.
>>>>>             <mailto:Bioc-devel at r-project.>____org
>>>>>                          <mailto:Bioc-devel at r-project.__org
>>>>>             <mailto:Bioc-devel at r-project.org>>>>
>>>>>                                                             mailing
>>>>> list
>>>>>
>>>>>             https://stat.ethz.ch/mailman/________listinfo/bioc-devel
>>>>>             <https://stat.ethz.ch/mailman/______listinfo/bioc-devel>
>>>>>
>>>>>             <https://stat.ethz.ch/mailman/______listinfo/bioc-devel
>>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel>>
>>>>>
>>>>>
>>>>>             <https://stat.ethz.ch/mailman/______listinfo/bioc-devel
>>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel>
>>>>>
>>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel
>>>>>             <https://stat.ethz.ch/mailman/__listinfo/bioc-devel>>>
>>>>>
>>>>>
>>>>>
>>>>>             <https://stat.ethz.ch/mailman/______listinfo/bioc-devel
>>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel>
>>>>>
>>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel
>>>>>             <https://stat.ethz.ch/mailman/__listinfo/bioc-devel>>
>>>>>
>>>>>
>>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel
>>>>>             <https://stat.ethz.ch/mailman/__listinfo/bioc-devel>
>>>>>
>>>>>             <https://stat.ethz.ch/mailman/__listinfo/bioc-devel
>>>>>             <https://stat.ethz.ch/mailman/listinfo/bioc-devel>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>             _______________________________________________________
>>>>>             Bioc-devel at r-project.org <mailto:Bioc-devel at r-project.org>
>>>>>             <mailto:Bioc-devel at r-project.__org
>>>>>             <mailto:Bioc-devel at r-project.org>>
>>>>>
>>>>>               <mailto:Bioc-devel at r-project.
>>>>>             <mailto:Bioc-devel at r-project.>____org
>>>>>                          <mailto:Bioc-devel at r-project.__org
>>>>>             <mailto:Bioc-devel at r-project.org>>>
>>>>>
>>>>>               <mailto:Bioc-devel at r-project
>>>>> <mailto:Bioc-devel at r-project>.
>>>>>                          <mailto:Bioc-devel at r-project
>>>>>             <mailto:Bioc-devel at r-project>.>______org
>>>>>
>>>>>               <mailto:Bioc-devel at r-project.
>>>>>             <mailto:Bioc-devel at r-project.>____org
>>>>>                          <mailto:Bioc-devel at r-project.__org
>>>>>             <mailto:Bioc-devel at r-project.org>>>> mailing
>>>>>                                                   list
>>>>>             https://stat.ethz.ch/mailman/________listinfo/bioc-devel
>>>>>             <https://stat.ethz.ch/mailman/______listinfo/bioc-devel>
>>>>>
>>>>>             <https://stat.ethz.ch/mailman/______listinfo/bioc-devel
>>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel>>
>>>>>
>>>>>
>>>>>             <https://stat.ethz.ch/mailman/______listinfo/bioc-devel
>>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel>
>>>>>
>>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel
>>>>>             <https://stat.ethz.ch/mailman/__listinfo/bioc-devel>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>             <https://stat.ethz.ch/mailman/______listinfo/bioc-devel
>>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel>
>>>>>
>>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel
>>>>>             <https://stat.ethz.ch/mailman/__listinfo/bioc-devel>>
>>>>>
>>>>>
>>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel
>>>>>             <https://stat.ethz.ch/mailman/__listinfo/bioc-devel>
>>>>>
>>>>>             <https://stat.ethz.ch/mailman/__listinfo/bioc-devel
>>>>>             <https://stat.ethz.ch/mailman/listinfo/bioc-devel>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>                                                   --
>>>>>                                                   Computational
>>>>> Biologist
>>>>>                                                   Genentech Research
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>                                                    [[alternative HTML
>>>>>             version
>>>>>             deleted]]
>>>>>
>>>>>
>>>>>
>>>>>             _____________________________________________________
>>>>>             Bioc-devel at r-project.org <mailto:Bioc-devel at r-project.org>
>>>>>             <mailto:Bioc-devel at r-project.__org
>>>>>             <mailto:Bioc-devel at r-project.org>>
>>>>>                          <mailto:Bioc-devel at r-project.
>>>>>             <mailto:Bioc-devel at r-project.>____org
>>>>>                          <mailto:Bioc-devel at r-project.__org
>>>>>             <mailto:Bioc-devel at r-project.org>>>
>>>>>                                           mailing list
>>>>>             https://stat.ethz.ch/mailman/______listinfo/bioc-devel
>>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel>
>>>>>
>>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel
>>>>>             <https://stat.ethz.ch/mailman/__listinfo/bioc-devel>>
>>>>>
>>>>>
>>>>>             <https://stat.ethz.ch/mailman/____listinfo/bioc-devel
>>>>>             <https://stat.ethz.ch/mailman/__listinfo/bioc-devel>
>>>>>
>>>>>             <https://stat.ethz.ch/mailman/__listinfo/bioc-devel
>>>>>             <https://stat.ethz.ch/mailman/listinfo/bioc-devel>>>
>>>>>
>>>>>
>>>>>
>>>>>                                       --
>>>>>                                       Computational Biology / Fred
>>>>>             Hutchinson Cancer
>>>>>                          Research Center
>>>>>                                       1100 Fairview Ave. N.
>>>>>                                       PO Box 19024 Seattle, WA 98109
>>>>>
>>>>>                                       Location: Arnold Building M1
>>>>> B861
>>>>>                                       Phone: (206) 667-2793
>>>>>             <tel:%28206%29%20667-2793>
>>>>>                          <tel:%28206%29%20667-2793>
>>>>>             <tel:%28206%29%20667-2793>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>                               --
>>>>>                               Computational Biologist
>>>>>                               Genentech Research
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>                          --
>>>>>                          Computational Biologist
>>>>>                          Genentech Research
>>>>>
>>>>>
>>>>>
>>>>>                      --
>>>>>                      Computational Biology / Fred Hutchinson Cancer
>>>>>             Research Center
>>>>>                      1100 Fairview Ave. N.
>>>>>                      PO Box 19024 Seattle, WA 98109
>>>>>
>>>>>                      Location: Arnold Building M1 B861
>>>>>                      Phone: (206) 667-2793 <tel:%28206%29%20667-2793>
>>>>>             <tel:%28206%29%20667-2793>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>             --
>>>>>             Computational Biologist
>>>>>             Genentech Research
>>>>>
>>>>>
>>>>>         _________________________________________________
>>>>>         Bioc-devel at r-project.org <mailto:Bioc-devel at r-project.org>
>>>>>         mailing list
>>>>>         https://stat.ethz.ch/mailman/__listinfo/bioc-devel
>>>>>         <https://stat.ethz.ch/mailman/listinfo/bioc-devel>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Computational Biologist
>>>>> Genentech Research
>>>>
>>>> _______________________________________________
>>>> Bioc-devel at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>
>>> _______________________________________________
>>> Bioc-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319