[Rd] RFC: sapply() limitation from vector to matrix, but not further

Mon Dec 27 22:55:53 CET 2010

I am finally finding time to come back to this.
Recall that I started the thread by proposing a version of sapply()
which does not just "stop" at making a matrix() from the lapply() result, but
instead (only when the new argument ARRAY = TRUE is set)
may return an array() of any (appropriate) rank, in those cases where
the lapply() result elements are all arrays of the same dim().
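
To make the limitation concrete, here is a minimal sketch. The helper
stack_results() is hypothetical (it is neither part of R nor the proposed
patch); it only illustrates the kind of stacking meant above, assuming all
results share the same dim().

```r
# Each call returns a 2 x 2 matrix
f <- function(i) matrix(i, nrow = 2, ncol = 2)

# sapply() flattens every matrix into one column, giving a 4 x 3 matrix
m <- sapply(1:3, f)
dim(m)

# Hypothetical helper (not R API): stack results of identical dim()
# into an array of one rank higher; otherwise return the list unchanged
stack_results <- function(res) {
  if (length(res) == 0L) return(res)
  d <- dim(res[[1L]])
  same <- !is.null(d) &&
    all(vapply(res, function(r) identical(dim(r), d), logical(1)))
  if (!same) return(res)
  array(unlist(res, use.names = FALSE), dim = c(d, length(res)))
}

a <- stack_results(lapply(1:3, f))
dim(a)      # rank-3 array: one 2 x 2 slice per element of X
a[, , 2]    # the matrix returned by f(2)
```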

On Wed, Dec 1, 2010 at 19:51, Hadley Wickham <hadley at rice.edu> wrote:
>> A downside of that approach is that lapply(X,...) can
>> cause a lot of unneeded memory to be allocated (length(X)
>> SEXP's).  Those SEXP's would be tossed out by simplify() but
>> the peak memory usage would remain high.  sapply() can
>> be written to avoid the intermediate list structure.
>
> But the upside is reusable code that can be used in multiple places -
> what about the simplification code used by mapply and tapply? Why are
> there three different implementations of simplification?
>
> Hadley

I have now looked into using a version of what Hadley had proposed.
Note (to Bill's point) that the current implementation of sapply()
does go via lapply() and
that we have  vapply()  as a faster version of sapply()  with less
copying (hopefully).

Very unfortunately, vapply(), which was only created 13 months ago,
has inherited the "illogical" behavior of sapply()
in that it does not build higher-rank arrays when each single result
is already a matrix (say).
...
Consequently, we also need a patch to vapply(),
and I do wonder if we should not make "ARRAY=TRUE" the default there,
since with vapply() you specify a result value, and if you specify a
matrix, the total result should stack these matrices into an array of
rank 3, etc.
Looking at it, the patch is not much work, notably if we don't
use a new argument but simply let FUN.VALUE determine what the result
should look like.
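
A rough sketch of what "let FUN.VALUE determine the result" could mean,
written at R level. The name vapply_array() is hypothetical and this is not
the actual patch (which would have to change vapply() itself); it goes via
lapply(), so it has none of vapply()'s efficiency, and it assumes X is
non-empty.

```r
# Hypothetical wrapper (not R API): the template FUN.VALUE fixes the
# type, length and dim() of every result, and hence the final shape
vapply_array <- function(X, FUN, FUN.VALUE, ...) {
  res <- lapply(X, FUN, ...)
  d <- dim(FUN.VALUE)
  for (r in res) {
    # Every result must agree with the template
    stopifnot(typeof(r) == typeof(FUN.VALUE),
              length(r) == length(FUN.VALUE),
              identical(dim(r), d))
  }
  out <- unlist(res, use.names = FALSE)
  # Leading dimensions: dim(FUN.VALUE) for arrays, length(FUN.VALUE)
  # for longer vectors, none for a scalar template
  lead <- if (!is.null(d)) d else if (length(FUN.VALUE) != 1L) length(FUN.VALUE)
  if (is.null(lead)) return(out)
  array(out, dim = c(lead, length(X)))
}

# Matrix template: the 2 x 2 results are stacked into a 2 x 2 x 3 array
a3 <- vapply_array(1:3, function(i) diag(2) * i, matrix(0, 2, 2))
dim(a3)

# Scalar template: a plain vector, as with vapply() today
v <- vapply_array(1:3, function(i) i * 2, numeric(1))
```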

More comments are still welcome...
Martin