[Bioc-devel] GenomicFiles: chunking

Valerie Obenchain vobencha at fhcrc.org
Mon Oct 27 19:53:08 CET 2014


Looks like the test code didn't make it through. Attaching again ...


On 10/27/2014 11:35 AM, Valerie Obenchain wrote:
> Hi Kasper and Mike,
>
> I've added 2 new functions to GenomicFiles and deprecated the old
> classes. The vignette now has a graphic which (hopefully) clarifies the
> MAP / REDUCE mechanics of the different functions.
>
> Below is some performance testing for the new functions and answers to
> leftover questions from previous emails.
>
>
> Major changes to GenomicFiles (in devel):
>
> - *FileViews classes have been deprecated:
>
> The idea is to use the GenomicFiles class to hold any type of file be it
> BAM, BigWig, character vector etc. instead of having name-specific
> classes like BigWigFileViews. Currently GenomicFiles does not inherit
> from SummarizedExperiment but it may in the future.
>
> - Added reduceFiles() and reduceRanges():
>
> These functions pass all ranges or files to MAP in a single call, in
> contrast to the lapply approach taken in reduceByFile() and reduceByRange().
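>
> As a rough sketch of the difference in what MAP receives (the MAP bodies
> below mirror the attached test code; treat this as an illustration, not
> the definitive API):
>
> ## reduceByFile() / reduceByRange(): MAP is called once per (range, file) pair
> MAP <- function(range, file, ...)
>     coverage(file, param=ScanBamParam(which=range))[range]
>
> ## reduceFiles(): MAP receives all ranges for one file in a single call
> MAP <- function(ranges, file, ...)
>     coverage(file, param=ScanBamParam(which=ranges))[ranges]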
>
>
> (1) Performance:
>
> When testing with reduceByFile() you noted "GenomicFiles is 10-20x
> slower than the straightforward approach". You also noted this was
> probably because of the lapply over all ranges - true. (Most likely
> there was overhead in creating the SE object as well.) With the new
> reduceFiles(), passing all ranges at once, we see performance very
> similar to that of the 'by hand' approach.
>
> In the test code I've used BAM instead of BigWig. Both test functions
> output lists and have comparable MAP and REDUCE steps, etc.
>
> I used 5 files ('fls') and a GRanges ('grs') of length 100.
>  > length(grs)
> [1] 100
>
>  > sum(width(grs))
> [1] 1000000
>
> FUN1 is the 'by hand' version. These results are similar to what you
> saw: just under a 4x difference between 10 and 100 ranges.
>
>  >> microbenchmark(FUN1(grs[1:10], fls), FUN1(grs[1:100], fls), times=10)
>  > Unit: seconds
>  >                   expr      min       lq     mean   median       uq      max neval
>  >   FUN1(grs[1:10], fls) 1.177858 1.190239 1.206127 1.201331 1.222298 1.256741    10
>  >  FUN1(grs[1:100], fls) 4.145503 4.163404 4.249619 4.208486 4.278463 4.533846    10
>
> FUN2 is the reduceFiles() approach and the results are very similar to
> FUN1.
>
>  >> microbenchmark(FUN2(grs[1:10], fls), FUN2(grs[1:100], fls), times=10)
>  > Unit: seconds
>  >                   expr      min       lq     mean   median       uq      max neval
>  >   FUN2(grs[1:10], fls) 1.242767 1.251188 1.257531 1.253154 1.267655 1.275698    10
>  >  FUN2(grs[1:100], fls) 4.251010 4.340061 4.390290 4.361007 4.384064 4.676068    10
>
>
> (2) Request for "chunking of the mapping of ranges":
>
> For now we decided not to add a 'yieldSize' argument for chunking. There
> are two approaches to chunking through ranges *within* the same file. In
> both cases the user splits the ranges, either before calling the function
> or in the MAP step.
>
> i) reduceByFile() with a GRangesList:
>
> The user provides a GRangesList as the 'ranges' arg. On each worker,
> lapply applies MAP to the single file and each element of the
> GRangesList (a sketch follows after iv below).
>
> ii) reduceFiles() with a MAP that handles chunking:
>
> The user splits the ranges in MAP and uses lapply() or another loop to iterate.
> For example,
>
> MAP <- function(range, file, ...) {
>     lst <- split(range, someFactor)        ## chunk the incoming ranges
>     someFUN <- function(rngs, file, ...) {
>         ## do something with one chunk and the file
>     }
>     lapply(lst, FUN=someFUN, file=file, ...)
> }
>
> The same ideas apply for chunking through ranges *across* files with
> reduceByRange() and reduceRanges().
>
> iii) reduceByRange() with a GRangesList:
>
> Mike has a good example here:
> https://gist.github.com/mikelove/deaff999984dc75f125d
>
> iv) reduceRanges():
>
> 'ranges' should be a GRangesList. The MAP step will operate on an
> element of the GRangesList and all files. Unless you want to operate on
> all files at once, I'd use reduceByRange() instead.
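>
> As a concrete sketch of (i), splitting the ranges up front and chunking
> within each file (the argument order follows the reduceFiles() call in
> the attached test code and may need adjusting):
>
> grl <- split(grs, ceiling(seq_along(grs) / 10))   ## chunks of ~10 ranges
> MAP <- function(range, file, ...) {               ## 'range' is one chunk
>     param <- ScanBamParam(which=range)
>     coverage(file, param=param)[range]
> }
> res <- reduceByFile(grl, fls, MAP=MAP)            ## one element per file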
>
>
> (3) Return objects have different shape:
>
> Previous question:
>
> "...
> Why is the return object of reduceByFile vs reduceByRange (with
> summarize = FALSE) different?  I understand why internally you have
> different nesting schemes (files and ranges) for the two functions, but
> it is not clear to me that it is desirable to have the return object
> depend on how the computation was done.
> ..."
>
> reduceByFile() and reduceFiles() output a list the same length as the
> number of files while reduceByRange() and reduceRanges() output a list
> the same length as the number of ranges.
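>
> Schematically (assuming the (ranges, files, MAP) calling form used in the
> attached test code and a REDUCE that simply concatenates):
>
> length(reduceByFile(ranges, files, MAP))   ## == length(files)
> length(reduceByRange(ranges, files, MAP))  ## == length(ranges)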
>
> Reduction is different depending on which function is chosen; data are
> collapsed either within a file or across files. When REDUCE does
> something substantial the outputs are not equivalent.
>
> While it's possible to get the same result (REDUCE simply unlists or
> isn't used), the two approaches were not intended to be equivalent ways
> of arriving at the same end. The idea was that the user had a specific
> use case in mind - they either wanted to collapse the data across or
> within files.
>
>
> (4) return object from coverage(BigWigFileViews):
>
> Previous comment:
>
> "...
> coverage(BigWigFileViews) returns a "wrong" assay object in my opinion,
> ...
> Specifically, each (i,j) entry in the object is an RleList with a single
> element with a name equal to the seqnames of the i'th entry in the query
> GRanges.  To me, this extra nestedness is unnecessary; I would have
> expected an Rle instead of an RleList with 1 element.
> ..."
>
> The return value from coverage(x) is an RleList with one coverage vector
> per seqlevel in 'x'. Even if there is only one seqlevel, the result
> still comes back as an RleList. This is just the default behavior.
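>
> A minimal illustration of that default with a plain GRanges:
>
> gr <- GRanges("chr1", IRanges(1, 5))
> coverage(gr)             ## RleList of length 1, element named "chr1"
> coverage(gr)[["chr1"]]   ## extract the plain Rle if that's what you want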
>
>
> (5) separate the 'read' function from the MAP step
>
> Previous comment:
>
> "...
> Also, something completely different, it seems like it would be
> convenient for stuff like BigWigFileViews to not have to actually parse
> the file in the MAP step.  Somehow I would envision some kind of reading
> function, stored inside the object, which just returns an Rle when I ask
> for a (range, file).  Perhaps this is better left for later.
> ..."
>
> The current approach for the reduce* functions is for MAP to both
> extract and manipulate data. The idea of separating the extraction step
> is actually implemented in reduceByYield(). (This function used to be
> yieldReduce() in Rsamtools in past releases.) For reduceByYield() the
> user must specify YIELD (a reader function), MAP, REDUCE and DONE
> (criteria to stop iteration).
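>
> For reference, a minimal sketch of that interface (pieced together from
> the argument names above; 'fl' is a hypothetical BAM path and the exact
> signature / defaults may differ):
>
> library(GenomicAlignments)                     ## for readGAlignments()
> bf <- BamFile(fl, yieldSize=100000)
> YIELD  <- function(x) readGAlignments(x)       ## reader: one chunk per call
> MAP    <- function(value) coverage(value)      ## work on each chunk
> REDUCE <- `+`                                  ## combine chunk results
> DONE   <- function(value) length(value) == 0L  ## stop when the reader is exhausted
> cvg <- reduceByYield(bf, YIELD=YIELD, MAP=MAP, REDUCE=REDUCE, DONE=DONE)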
>
> I'm not sure what is best here. I thought the many-argument approach of
> reduceByYield() was possibly confusing or burdensome and so didn't use
> it in the other GenomicFiles functions. Maybe it's not confusing but
> instead makes the individual steps more clear. What do you think:
>
> - Should the reader function be separate from the MAP? What are the
> advantages?
>
> - Should READER, MAP, REDUCE be stored inside the GenomicFiles object or
> supplied as arguments to the functions?
>
>
> (6) unnamed assay in SummarizedExperiment
>
> Previous comment:
>
> "...
> The return object of reduceByRange / reduceByFile with summarize = TRUE
> is a SummarizedExperiment with an unnamed assay.  I was surprised to see
> that this is even possible.
> ..."
>
> There is no default assay name for SummarizedExperiment in general. I've
> named the assay 'data' for lack of a better term. We could also go with
> 'reducedData' or another suggestion.
>
>
> Thanks for the feedback.
>
> Valerie
>
>
>
> On 10/01/2014 08:30 AM, Michael Love wrote:
>> hi Kasper,
>>
>> For a concrete example, I posted a R and Rout file here:
>>
>> https://gist.github.com/mikelove/deaff999984dc75f125d
>>
>> Things to note: 'ranges' is a GRangesList, I cbind() the numeric
>> vectors in the REDUCE, and then rbind() the final list to get the
>> desired matrix.
>>
>> Other than the weird column name 'init', does this give you what you
>> want?
>>
>> best,
>>
>> Mike
>>
>> On Tue, Sep 30, 2014 at 2:08 PM, Michael Love
>> <michaelisaiahlove at gmail.com> wrote:
>>> hi Kasper and Valerie,
>>>
>>> In Kasper's original email:
>>>
>>> "I would like to be able to write a MAP function which takes
>>>    ranges, file
>>> instead of just
>>>    range, file
>>> And then chunk over say 1,000s of ranges. I could then have an
>>> argument to reduceByXX called something like rangeSize, which is kind
>>> of yieldSize."
>>>
>>> I was thinking, for your BigWig example, we get part of the way just
>>> by splitting the ranges into a GRangesList of your desired chunk size.
>>> Then the parallel iteration is over chunks of GRanges. Within your MAP
>>> function:
>>>
>>> import(file, which = ranges, as = "Rle", format = "bw")[ranges]
>>>
>>> returns an RleList, and calling mean() on this gives a numeric vector
>>> of the mean of the coverage for each range.
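>>>
>>> Wrapped up as a MAP, that might look like (a sketch mirroring the gist;
>>> the argument names are just illustrative):
>>>
>>> MAP <- function(ranges, file, ...) {
>>>     rle <- import(file, which = ranges, as = "Rle", format = "bw")[ranges]
>>>     mean(rle)   ## numeric vector: mean coverage per range in the chunk
>>> }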
>>>
>>> Then the only work needed is how to package the result into something
>>> reasonable. What you would get now is a list (length of GRangesList)
>>> of lists (length of files) of vectors (length of GRanges).
>>>
>>> best,
>>>
>>> Mike
>>>
>>> On Mon, Sep 29, 2014 at 7:09 PM, Valerie Obenchain
>>> <vobencha at fhcrc.org> wrote:
>>>> Hi Kasper,
>>>>
>>>> The reduceBy* functions were intended to combine data across ranges or
>>>> files. If we have 10 ranges and 3 files you can think of it as a 10
>>>> x 3 grid
>>>> where we'll have 30 queries and therefore 30 pieces of information
>>>> that can
>>>> be combined across different dimensions.
>>>>
>>>> The request to 'process ranges in one operation' is a good
>>>> suggestion and,
>>>> as you say, may even be the more common use case.
>>>>
>>>> Some action items and explanations:
>>>>
>>>> 1)  treat ranges / files as a group
>>>>
>>>> I'll add functions to treat all ranges / files as a group;
>>>> essentially no
>>>> REDUCER other than concatenation. Chunking (optional) will occur as
>>>> defined
>>>> by 'yieldSize' in the BamFile.
>>>>
>>>> 2)  class clean-up
>>>>
>>>> The GenomicFileViews class was the first iteration. It was overly
>>>> complicated and the complication didn't gain us much. In my mind the
>>>> GenomicFiles class is a stripped-down version that should replace the
>>>> *FileViews
>>>> classes. I plan to deprecate the *FileViews classes.
>>>>
>>>> Ideally the class(es) in GenomicFiles would inherit from
>>>> SummarizedExperiment. A stand-alone package for SummarizedExperiment
>>>> is in
>>>> the works for the near future. Until those plans are finalized I'd
>>>> rather
>>>> not change the inheritance structure in GenomicFiles.
>>>>
>>>> 3) pack / unpack
>>>>
>>>> I experimented with this and came to the conclusion that packing /
>>>> unpacking
>>>> should be left to the user vs being done behind the scenes with
>>>> reduceBy*.
>>>> The primary difficulty is that you don't have pre-knowledge of the
>>>> output
>>>> format of the user's MAPPER. If MAPPER outputs an Rle, unpacking may be
>>>> straightforward. If MAPPER outputs a NumericList or matrix or vector
>>>> with no
>>>> genomic coordinates then things are more complicated.
>>>>
>>>> I'm open to suggestions / prototypes from others.
>>>>
>>>> 4) reduceByYield
>>>>
>>>> This was intended for chunking through ranges in a single file. You can
>>>> imagine using bplapply() over files where each file is chunked
>>>> through with
>>>> reduceByYield(). All ranges are chunked by 'yieldSize' defined in the
>>>> BamFile unless a 'param' dictates a subset of ranges.
>>>>
>>>> 5) additional tidy
>>>>
>>>> I'll add a name to the 'assays' when summarize=TRUE, make sure
>>>> return types
>>>> are consistent when summarize=FALSE, etc.
>>>>
>>>> Thanks for the testing and feedback.
>>>>
>>>>
>>>> Valerie
>>>>
>>>>
>>>> On 09/29/2014 07:18 AM, Michael Love wrote:
>>>>>
>>>>> On Mon, Sep 29, 2014 at 9:09 AM, Kasper Daniel Hansen
>>>>> <kasperdanielhansen at gmail.com> wrote:
>>>>>>
>>>>>> I don't fully understand "the use case for reducing by range is
>>>>>> when the
>>>>>> entire dataset won't fit into memory".  The basic assumption of these
>>>>>> functions (as far as I can see) is that the output data fits in
>>>>>> memory.
>>>>>> What may not fit in memory is various earlier "iterations" of the
>>>>>> data.
>>>>>> For
>>>>>> example, in my use case, if I just read in all the data in all the
>>>>>> ranges
>>>>>> in
>>>>>> all the samples it is basically Rle's across 450MB times 38 files,
>>>>>> which
>>>>>> is
>>>>>> not small.  What fits in memory is smaller chunks of this; that is
>>>>>> true
>>>>>> for
>>>>>> every application.
>>>>>>
>>>>>
>>>>> I was unclear. I meant that, in approach1, you have an object,
>>>>> all.Rle, which contains Rles for every range over every file. Can you
>>>>> actually run this approach on the full dataset?
>>>>>
>>>>>> Reducing by range (or file) only makes sense when the final output
>>>>>> includes
>>>>>> one entity for several ranges/files ... right?  So I don't see how
>>>>>> reduce
>>>>>> would help me.
>>>>>>
>>>>>
>>>>> Yes, I think we agree. This is not a good use case for reduce by range
>>>>> as now implemented.
>>>>>
>>>>> This is a use case which would benefit from the user-facing function
>>>>> internally calling pack()/unpack() to reduce the number of import()
>>>>> calls, and then in the end giving back the mean coverage over the
>>>>> input ranges. I want this too.
>>>>>
>>>>>
>>>>> https://github.com/Bioconductor/GenomicFileViews/issues/2#issuecomment-32625456
>>>>>
>>>>> (link to the old github repo, the new github repo is named
>>>>> GenomicFiles)
>>>>>
>>>>>> As I see the pack()/unpack() paradigm, it just re-orders the query
>>>>>> ranges
>>>>>> (which is super nice and matters a lot for speed for some
>>>>>> applications).
>>>>>> As
>>>>>> I understand the code (and my understanding is developing) we need an
>>>>>> extra
>>>>>> layer to support processing multiple ranges in one operation.
>>>>>>
>>>>>> I am happy to help apart from complaining.
>>>>>>
>>>>>> Best,
>>>>>> Kasper
>>>>>>
>>>>>> On Mon, Sep 29, 2014 at 8:55 AM, Michael Love
>>>>>> <michaelisaiahlove at gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>> Thanks for checking it out and benchmarking. We should be more clear
>>>>>>> in the docs that the use case for reducing by range is when the
>>>>>>> entire
>>>>>>> dataset won't fit into memory. Also, we had some discussion and
>>>>>>> Valerie had written up methods for packing up the ranges supplied by
>>>>>>> the user into a better form for querying files. In your case it
>>>>>>> would
>>>>>>> have packed many ranges together, to reduce the number of import()
>>>>>>> calls like your naive approach. See the pack/unpack functions, which
>>>>>>> are not in the vignette but are in the man pages. If I remember,
>>>>>>> writing code to unpack() the result was not so simple, and
>>>>>>> development
>>>>>>> of these functions was set aside for the moment.
>>>>>>>
>>>>>>> Mike
>>>>>>>
>>>>>>> On Sun, Sep 28, 2014 at 10:49 PM, Kasper Daniel Hansen
>>>>>>> <kasperdanielhansen at gmail.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> I am testing GenomicFiles.
>>>>>>>>
>>>>>>>> My use case: I have 231k ranges of average width 1.9kb and total
>>>>>>>> width 442 Mb.  I also have 38 BigWig files.  I want to compute the
>>>>>>>> average
>>>>>>>> coverage
>>>>>>>> of the 38 BigWig files inside each range.  This is similar to
>>>>>>>> wanting
>>>>>>>> to
>>>>>>>> get coverage of - say - all promoters in the human genome.
>>>>>>>>
>>>>>>>> My data is residing on a file server which is connected to the
>>>>>>>> compute
>>>>>>>> node
>>>>>>>> through ethernet, so basically I have very slow file access.
>>>>>>>> Also, the
>>>>>>>> BigWig files are (perhaps unusually) big: ~2GB / piece.
>>>>>>>>
>>>>>>>> Below I have two approaches, one using basically straightforward
>>>>>>>> code
>>>>>>>> and
>>>>>>>> the other using GenomicFiles (examples are 10 / 100 ranges on 5
>>>>>>>> files).
>>>>>>>> Basically GenomicFiles is 10-20x slower than the straightforward
>>>>>>>> approach.
>>>>>>>> This is most likely because reduceByRange/reduceByFile processes
>>>>>>>> each
>>>>>>>> (range, file) separately.
>>>>>>>>
>>>>>>>> It seems natural (to me) to allow some chunking of the mapping
>>>>>>>> of the
>>>>>>>> ranges.  My naive approach is fast (I assume) because I read
>>>>>>>> multiple
>>>>>>>> ranges through one import() command.  I know I have random
>>>>>>>> access to
>>>>>>>> the
>>>>>>>> BigWig files, but there must still be some overhead of seeking and
>>>>>>>> perhaps
>>>>>>>> more importantly instantiating/moving stuff back and forth to
>>>>>>>> R.  So
>>>>>>>> basically I would like to be able to write a MAP function which
>>>>>>>> takes
>>>>>>>>     ranges, file
>>>>>>>> instead of just
>>>>>>>>     range, file
>>>>>>>> And then chunk over say 1,000s of ranges.  I could then have an
>>>>>>>> argument
>>>>>>>> to
>>>>>>>> reduceByXX called something like rangeSize, which is kind of
>>>>>>>> yieldSize.
>>>>>>>>
>>>>>>>> Perhaps this is what is intended for the reduceByYield on BigWig
>>>>>>>> files?
>>>>>>>>
>>>>>>>> In a way, this is what is done in the vignette with the
>>>>>>>> coverage(BAMFILE)
>>>>>>>> example where tileGenome is essentially constructed by the user to
>>>>>>>> chunk
>>>>>>>> the coverage computation.
>>>>>>>>
>>>>>>>> I think the example of a set of regions I am querying on the files
>>>>>>>> will
>>>>>>>> be
>>>>>>>> an extremely common usecase going forward. The raw data for all the
>>>>>>>> regions
>>>>>>>> together is "too big" but I do a computation on each region to
>>>>>>>> reduce
>>>>>>>> the
>>>>>>>> size.  In this situation, all the focus is on the MAP step.  I
>>>>>>>> see the
>>>>>>>> need
>>>>>>>> for REDUCE in the case of the t-test example in the vignette,
>>>>>>>> where the
>>>>>>>> return object is a single "thing" for each base.  But in general, I
>>>>>>>> think
>>>>>>>> we will use these file structures a lot to construct things without
>>>>>>>> REDUCE
>>>>>>>> (across either files or ranges).
>>>>>>>>
>>>>>>>> Also, something completely different, it seems like it would be
>>>>>>>> convenient
>>>>>>>> for stuff like BigWigFileViews to not have to actually parse the
>>>>>>>> file
>>>>>>>> in
>>>>>>>> the MAP step.  Somehow I would envision some kind of reading
>>>>>>>> function,
>>>>>>>> stored inside the object, which just returns an Rle when I ask
>>>>>>>> for a
>>>>>>>> (range, file).  Perhaps this is better left for later.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Kasper
>>>>>>>>
>>>>>>>> Examples
>>>>>>>>
>>>>>>>> approach1 <- function(ranges, files) {
>>>>>>>>       ## By hand
>>>>>>>>       all.Rle <- lapply(files, function(file) {
>>>>>>>>           rle <- import(file, as = "Rle", which = ranges,
>>>>>>>>                         format = "bw")[ranges]
>>>>>>>>           rle
>>>>>>>>       })
>>>>>>>>       print(object.size(all.Rle), units = "auto")
>>>>>>>>       mat <- do.call(cbind, lapply(all.Rle, function(xx) {
>>>>>>>>           sapply(xx, mean)
>>>>>>>>       }))
>>>>>>>>       invisible(mat)
>>>>>>>> }
>>>>>>>>
>>>>>>>> system.time({
>>>>>>>> mat1 <- approach1(all.grs[1:10], all.files[1:5])
>>>>>>>> })
>>>>>>>>
>>>>>>>> 160.9 Kb
>>>>>>>>       user  system elapsed
>>>>>>>>      1.109   0.001   1.113
>>>>>>>>
>>>>>>>> system.time({
>>>>>>>> mat1 <- approach1(all.grs[1:100], all.files[1:5])
>>>>>>>> }) # less than 4x slower than previous call
>>>>>>>>
>>>>>>>> 3 Mb
>>>>>>>>       user  system elapsed
>>>>>>>>      4.101   0.019   4.291
>>>>>>>>
>>>>>>>>
>>>>>>>> approach2 <- function(ranges, files) {
>>>>>>>>       gf <- GenomicFiles(rowData = ranges, files = files)
>>>>>>>>       MAPPER <- function(range, file, ...) {
>>>>>>>>           library(rtracklayer)
>>>>>>>>           rle <- import(file, which = range, as = "Rle",
>>>>>>>>                         format = "bw")[range]
>>>>>>>>           mean(rle)
>>>>>>>>       }
>>>>>>>>       sExp <- reduceByRange(gf, MAP = MAPPER, summarize = TRUE,
>>>>>>>>                             BPPARAM = SerialParam())
>>>>>>>>       sExp
>>>>>>>> }
>>>>>>>>
>>>>>>>> system.time({
>>>>>>>> mat2 <- approach2(all.grs[1:10], all.files[1:5])
>>>>>>>> })  # 8-9x slower than approach1
>>>>>>>>
>>>>>>>>      user  system elapsed
>>>>>>>>      9.349   0.280   9.581
>>>>>>>>
>>>>>>>> system.time({
>>>>>>>> mat2 <- approach2(all.grs[1:100], all.files[1:5])
>>>>>>>> })  # 9x slower than previous call, 20x slow than approach1 on same
>>>>>>>> input
>>>>>>>>
>>>>>>>>      user  system elapsed
>>>>>>>>     89.310   0.627  91.044
>>>>>>>>
>>>>>>
>>>>>>
>>>>>>
-------------- next part --------------
library(GenomicFiles)
library(microbenchmark)

## 5 bam files ranging from 4e+08 to 8e+08 records:
fls <- BamFileList(c("exp_srx036692.bam", "exp_srx036695.bam",
                     "exp_srx036696.bam", "exp_srx036697.bam",
                     "exp_srx036692.bam")) ## re-use one file

## GRanges with 100 ranges and total width 1e+06:
starts <- sample(1:1e7, 100)
chr <- paste0(rep("chr", 100), 1:22)
grs <- GRanges(chr,  IRanges(starts, width=1e4))


## By hand:
FUN1 <- function(ranges, files) {
    ## equivalent to MAP step
    cvg <- lapply(files,
        FUN = function(file, range) {
            param = ScanBamParam(which=range)
            coverage(file, param=param)[range]
        }, range=ranges)
    ## equivalent to REDUCE step
    do.call(cbind, lapply(cvg, mean))
}

microbenchmark(FUN1(grs[1:10], fls), FUN1(grs[1:100], fls), times=10)
>> microbenchmark(FUN1(grs[1:10], fls), FUN1(grs[1:100], fls), times=10)
> Unit: seconds
>                   expr      min       lq     mean   median       uq      max neval
>   FUN1(grs[1:10], fls) 1.177858 1.190239 1.206127 1.201331 1.222298 1.256741    10
>  FUN1(grs[1:100], fls) 4.145503 4.163404 4.249619 4.208486 4.278463 4.533846    10

## GenomicFiles:
MAP = function(range, file, ...) {
    param = ScanBamParam(which=range)
    coverage(file, param=param)[range]
}
REDUCE <- function(mapped, ...) do.call(cbind, lapply(mapped, mean))

FUN2 <- function(ranges, files) {
    reduceFiles(ranges, files, MAP, REDUCE, BPPARAM=SerialParam())
}

microbenchmark(FUN2(grs[1:10], fls), FUN2(grs[1:100], fls), times=10)
>> microbenchmark(FUN2(grs[1:10], fls), FUN2(grs[1:100], fls), times=10)
> Unit: seconds
>                   expr      min       lq     mean   median       uq      max neval
>   FUN2(grs[1:10], fls) 1.242767 1.251188 1.257531 1.253154 1.267655 1.275698    10
>  FUN2(grs[1:100], fls) 4.251010 4.340061 4.390290 4.361007 4.384064 4.676068    10

