[Rd] Why change data type when dropping to one-dimension?

Sat May 30 00:50:31 CEST 2009

On Fri, 29 May 2009, Stavros Macrakis wrote:

> This is another example of the general preference of the designers of R for
> convenience over consistency.
>
> In my opinion, this is a design flaw even for non-programmers, because I
> find that inconsistencies make the system harder to learn.  Yes, the naive
> user may stumble over the difference between m[[1,1]] and m[1,1] a few times
> before getting it, but once he or she understands the principle, it is
> general.

I was on your side of this argument the first time it came up, but ended up being convinced the other way.

In contrast to sample(n), the non-standard evaluation of the weights= and subset= arguments to modelling functions, and various other conveniences that I think we are stuck with despite their being bad ideas, I think dropping dimensions is useful.
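
For reference, a minimal sketch of the behaviour under discussion. The objects m and d and the variable names below are illustrative assumptions, not taken from the thread. Single-bracket indexing drops length-one dimensions by default, and drop = FALSE keeps them:

```r
# Illustrative objects; the names m and d are assumptions for this sketch.
m <- matrix(1:6, nrow = 2)
d <- data.frame(a = c(1.5, 2.5, 3.5), b = c(4, 5, 6))

# Default behaviour: selecting a single column drops the dim attribute,
# so the result is a plain vector.
col_dropped <- m[, 1]
is.null(dim(col_dropped))          # TRUE

# drop = FALSE keeps the result as a 2 x 1 matrix.
col_kept <- m[, 1, drop = FALSE]
dim(col_kept)                      # 2 1

# The same applies to data frames: one column comes back as a vector
# unless drop = FALSE is given.
df_dropped <- d[, 1]
df_kept    <- d[, 1, drop = FALSE]
typeof(df_dropped)                 # "double"
typeof(df_kept)                    # "list"
is.data.frame(df_kept)             # TRUE

# The convenience: the dropped column can be passed straight to
# functions that expect a vector.
col_mean <- mean(d[, 1])           # 2.5
```

Code that must return the same type regardless of how many columns are selected can pass drop = FALSE explicitly.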

      -thomas

>                 -s
>
> On Fri, May 29, 2009 at 5:33 PM, Jason Vertrees <jv at cs.dartmouth.edu> wrote:
>
>> Hello,
>>
>> First, let me say I'm an avid fan of R--it's incredibly powerful and I
>> use it all the time.  I appreciate all the hard work that the many
>> developers have put in.
>>
>> My question is: why does the paradigm of changing the type of a 1D
>> return value to an unlisted array exist?  This introduces boundary
>> conditions where none need exist, making code harder to write and more
>> confusing.
>>
>> For example, consider:
>> > d <- data.frame(a=rnorm(10), b=rnorm(10))
>> > typeof(d)                      # "list": OK
>> > typeof(d[,1])                  # "double": unexpected
>> > typeof(d[,1,drop=FALSE])       # "list": oh, now I see
>>
>> This is indeed documented in the R Language Definition, but why is it
>> there in the first place?  It doesn't make sense to the average
>> programmer to change the return type based on dimension.
>>
>> Here it is again in 'sapply':
>> > sapply
>> > function (X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE)
>> > {
>> >     [...snip...]
>> >        if (common.len == 1)
>> >            unlist(answer, recursive = FALSE)
>> >        else if (common.len > 1)
>> >            array(unlist(answer, recursive = FALSE),
>> >                  dim = c(common.len, length(X)),
>> >                  dimnames = if (!(is.null(n1 <- names(answer[[1]])) &
>> >                                   is.null(n2 <- names(answer))))
>> >                      list(n1, n2))
>> >     [...snip...]
>> >  }
>>
>> So, in 'sapply', if your return value is one-dimensional, be careful,
>> because the return type will not be the same as it would be otherwise.
>>
>> Is this legacy or a valid, rational design decision which I'm not yet a
>> sophisticated enough R coder to enjoy?
>>
>> Thanks,
>>
>> -- Jason
>>
>> --
>>
>> Jason Vertrees, PhD
>>
>> Dartmouth College : jv at cs.dartmouth.edu
>> Boston University : jasonv at bu.edu
>>
>> PyMOLWiki : http://www.pymolwiki.org/
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle