[Bioc-devel] sapply and vapply

Laurent Gatto laurent.gatto using uclouvain.be
Wed Aug 12 21:24:18 CEST 2020


Thank you for these instructive, although somewhat disheartening, clarifications.

Laurent

________________________________________
From: Henrik Bengtsson <henrik.bengtsson using gmail.com>
Sent: 12 August 2020 19:01
To: Laurent Gatto
Cc: bioc-devel using r-project.org
Subject: Re: [Bioc-devel] sapply and vapply

FWIW,

> sapply(X, length) - always numeric(1) (integer(1) or double(1) for vectors of more than 2^31 - 1 elements)

Actually, the length of length(x) may not be 1L, e.g.

> x <- Formula::Formula(~ x)
> length(x)
[1] 0 1

From help("length", package = "base"):

"Warning: Package authors have written methods that return a result of
length other than one (Formula) and that return a vector of type
double (Matrix), even with non-integer values (earlier versions of
sets). Where a single double value is returned that can be represented
as an integer it is returned as a length-one integer vector."

I/we recently learned this the hard way
(https://github.com/HenrikBengtsson/future/issues/395).  It's rather
unfortunate that not even length() is strictly defined here, I'd say.
I think we could move away from this if lengths() were a generic
so that lengths(x) could be used above.  But that's a discussion for
R-devel.
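
To make the point concrete without depending on the Formula package,
here is a minimal sketch. The S3 class "two_part" and its length()
method are hypothetical, made up for illustration only; the method
mimics a length() that returns two values. It assumes the code is run
at top level so that S3 dispatch finds the method.

```r
# Hypothetical S3 class (not from any package) whose length() method
# returns a vector of length 2, similar in spirit to Formula.
length.two_part <- function(x) c(0L, 1L)
obj <- structure(list(), class = "two_part")

X <- list(a = 1:3, b = obj)

# sapply() cannot simplify results of unequal length, so it quietly
# returns a list instead of an integer vector.
res_sapply <- sapply(X, length)
is.list(res_sapply)

# vapply() checks each result against the integer(1) template and
# signals an error; the message is captured here for inspection.
res_vapply <- tryCatch(
  vapply(X, length, integer(1)),
  error = function(e) conditionMessage(e)
)
res_vapply
```

So even for length(), vapply(X, length, integer(1)) fails loudly where
sapply(X, length) would silently change its return type.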

/Henrik


On Wed, Aug 12, 2020 at 9:33 AM Laurent Gatto
<laurent.gatto using uclouvain.be> wrote:
>
> Dear all,
>
> I have a quick question regarding the usage of vapply and sapply. The former is recommended to ensure that the output is always a vector of a specific type. For example:
>
> > df1 <- data.frame(x = 1:3, y = LETTERS[1:3])     ## OK test
> > df2 <- data.frame(x = 1:3, y = Sys.time() + 1:3) ## Not OK test
> > sapply(df1, class) ## vector of chars, OK
>           x           y
>   "integer" "character"
> > sapply(df2, class) ## ouch, not a vector
> $x
> [1] "integer"
>
> $y
> [1] "POSIXct" "POSIXt"
>
> > vapply(df2, class, character(1)) ## prefer an error rather than a list
> Error in vapply(df2, class, character(1)) : values must be length 1,
>  but FUN(X[[2]]) result is length 2
>
> There are cases, however, where FUN ensures that the output will be of length 1 and of an expected type. For example:
>
> - sapply(X, all) - all() always returns logical(1)
> - sapply(X, length) - always numeric(1) (integer(1) or double(1) for vectors of more than 2^31 - 1 elements)
>
> or more generally
>
> - sapply(X, slot, "myslot") - slot() will always return a character(1) because @myslot is always character(1) (as defined by the class)
>
> Would you still recommend using vapply() in such cases?
>
> Thank you in advance.
>
> Laurent
>
>
>
>
> _______________________________________________
> Bioc-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
