[Rd] RFC: Proposal to make NROW() and NCOL() slightly more general

Hervé Pagès hpages at fhcrc.org
Thu Feb 9 01:08:33 CET 2012


Hi Martin,

On 02/07/2012 08:32 AM, Martin Maechler wrote:
>>>>>> Martin Maechler<maechler at stat.math.ethz.ch>
>>>>>>      on Mon, 6 Feb 2012 15:35:36 +0100 writes:
>
>      >>  On Sat, Feb 4, 2012 at 10:38 AM, Martin Maechler
>      >>  <maechler at stat.math.ethz.ch>  wrote:
>      >>  >  The help has
>      >>  >
>      >>  >>  Description:
>      >>  >
>      >>  >>     'nrow' and 'ncol' return the number of rows or columns present in 'x'.
>      >>  >>     'NCOL' and 'NROW' do the same treating a vector as 1-column matrix.
>      >>  >
>      >>  >  and
>      >>  >
>      >>  >>     x: a vector, array or data frame
>      >>  >
>      >>  >  I'm proposing to extend these two convenience functions
>      >>  >  to also work ``correctly'' for generalized versions of matrices.
>      >>  >
>      >>  >
>      >>  >  The current implementation :
>      >>  >
>      >>  >  NROW <- function(x) if (is.array(x) || is.data.frame(x)) nrow(x) else length(x)
>      >>  >  NCOL <- function(x) if (is.array(x) && length(dim(x)) > 1L || is.data.frame(x)) ncol(x) else 1L
>      >>  >
>      >>  >  only treats something as matrix when  is.array(.) is true,
>      >>  >  which is not the case, e.g., for multiprecision matrices from
>      >>  >  package 'gmp' or for matrices from packages SparseM, Matrix or similar.
>      >>  >
>      >>  >  Of course, all these packages could write methods for NROW, NCOL
>      >>  >  for their specific matrix class, but given that the current
>      >>  >  definition is so simple,
>      >>  >  I'd find it an unnecessary complication.
>      >>  >
>      >>  >  Rather I propose the following new version
>      >>  >
>      >>  >  NROW <- function(x) if (length(dim(x)) || is.data.frame(x)) nrow(x) else length(x)
>      >>  >  NCOL <- function(x) if (length(dim(x)) > 1L || is.data.frame(x)) ncol(x) else 1L
>
>      >>  That makes me wonder about:
>
>      >>  DIM <- function(x) if (length(dim(x)) > 1L) dim(x) else c(length(x), 1L)
>
>      >>  or maybe more efficiently:
>
>      >>  DIM <- function(x) {
>      >>    d <- dim(x)
>      >>    if (length(d) > 1L) d else c(length(x), 1L)
>      >>  }
>
>      >>  given that dim() is not always trivial to compute (e.g. for data
>      >>  frames it can be rather slow if you're doing it for hundreds of data
>      >>  frames)
>
>      >>  then NROW and NCOL could be exact equivalents to nrow and ncol.
>
>      >>  Hadley
>
>      >  Thank you, Hadley.
>      >  Indeed, your suggestion seems to make sense
>      >  {as far as it makes sense to have such simple functions to
>      >  exist in base at all, but as we already have NROW and NCOL ..}
>
>      >  So, I propose to adopt Hadley's  DIM() proposal, modified to
>
>      >  DIM <- function(x) if (length(d <- dim(x))) d else c(length(x), 1L)
>
>      >  and wait a day or so (or longer for reasons of vacation!) before
>      >  committing it, so the public can raise opinions.
>
> Actually, the above --- building NROW() and NCOL() on DIM() is
> not quite correct:
>
>    NCOL <- function(x) DIM(x)[2L]
>
> will fail for   x <- array(1:3, 3)

But that's because in your modified version of DIM() you changed
the semantics Hadley originally proposed. With the original semantics:

   DIM <- function(x) if(length(d <- dim(x)) >= 2L) d else c(length(x), 1L)

things work as expected on array(1:3, 3):

   > x <- array(1:3, 3)
   > DIM(x)
   [1] 3 1

(Note that using >= 2L instead of > 1L is more readable as it emphasizes
the fact that the returned vector will always have at least 2 elements.)
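
To make the difference concrete, here is a minimal sketch comparing the
two semantics side by side. The names DIM_orig and DIM_mod are made up
for this illustration only and are not part of any proposal:

```r
# Hadley's original semantics: fall back to c(length(x), 1L) unless
# dim(x) has at least 2 elements (hypothetical name, illustration only).
DIM_orig <- function(x) if (length(d <- dim(x)) >= 2L) d else c(length(x), 1L)

# The modified version: fall back only when dim(x) is NULL
# (hypothetical name, illustration only).
DIM_mod <- function(x) if (length(d <- dim(x))) d else c(length(x), 1L)

x <- array(1:3, 3)   # a 1-d array: dim(x) has a single element

DIM_orig(x)          # always has at least 2 elements
DIM_orig(x)[2L]      # so the "number of columns" is well defined
DIM_mod(x)           # just dim(x), which has only 1 element here
DIM_mod(x)[2L]       # indexing past the end gives NA
```

So an NCOL() defined as DIM(x)[2L] is only safe on top of the first
variant.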

Cheers,
H.

>
> so I think I'll stick for now with the generalizations to NROW()
> and NCOL(), using
>
> NROW <- function(x) if (length(d <- dim(x)))      d[1L] else length(x)
> NCOL <- function(x) if (length(d <- dim(x)) > 1L) d[2L] else 1L
>
> which incorporates Hadley's note that there are cases where
> dim(.) is ``relatively expensive''.
> Note that the above are also (very slightly) more efficient than
> basing them on DIM(.).
>
> Martin
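
For what it's worth, here is a small sketch of what the generalized
versions buy for a matrix-like object that has a dim() method but is
not an array. The "fakemat" class and the *_old / *_new names are
invented for this illustration; real classes from gmp, SparseM or
Matrix are not used here:

```r
# Current definitions, as quoted earlier in the thread.
NROW_old <- function(x) if (is.array(x) || is.data.frame(x)) nrow(x) else length(x)
NCOL_old <- function(x) if (is.array(x) && length(dim(x)) > 1L || is.data.frame(x)) ncol(x) else 1L

# Generalized definitions based on dim(), as given above.
NROW_new <- function(x) if (length(d <- dim(x)))      d[1L] else length(x)
NCOL_new <- function(x) if (length(d <- dim(x)) > 1L) d[2L] else 1L

# Hypothetical S3 class: a list that reports its shape through a
# dim() method rather than through a "dim" attribute.
x <- structure(list(nr = 4L, nc = 3L), class = "fakemat")
dim.fakemat <- function(x) c(x$nr, x$nc)

is.array(x)   # FALSE, so the old versions ignore dim(x)
NROW_old(x)   # length of the underlying list
NCOL_old(x)   # 1
NROW_new(x)   # first element of dim(x)
NCOL_new(x)   # second element of dim(x)
```

On ordinary vectors, matrices and data frames the old and new versions
should agree.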
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


