[R] Semantics of sequences in R
Duncan Murdoch
murdoch at stats.uwo.ca
Sun Feb 22 22:12:39 CET 2009
I think this was posted to the wrong list, so my followup is going to
R-devel.
On 22/02/2009 3:42 PM, Stavros Macrakis wrote:
> Inspired by the exchange between Rolf Turner and Wacek Kusnierczyk, I
> thought I'd clear up for myself the exact relationship among the
> various sequence concepts in R, including not only generic vectors
> (lists) and atomic vectors, but also pairlists, factor sequences,
> date/time sequences, and difftime sequences.
>
> I tabulated type of sequence vs. property to see if I could make sense
> of all this. The properties I looked at were the predicates
> is.{vector,list,pairlist}; whether various sequence operations (c,
> rev, unique, sort, rle) can be used on objects of the various types,
> and if relevant, whether they preserve the type of the input; and what
> the length of class( as.XXX (1:2) ) is.
>
> Here are the results (code to reproduce at end of email):
>
>               numer  list   plist  fact   POSIXct difft
> is.vector     TRUE   TRUE   FALSE  FALSE  FALSE   FALSE
> is.list       FALSE  TRUE   TRUE   FALSE  FALSE   FALSE
> is.pairlist   FALSE  FALSE  TRUE   FALSE  FALSE   FALSE
> c_keep?       TRUE   TRUE   FALSE  FALSE  TRUE    FALSE
> rev_keep?     TRUE   TRUE   FALSE  TRUE   TRUE    TRUE
> unique_keep?  TRUE   TRUE   "Err"  TRUE   TRUE    FALSE
> sort_keep?    TRUE   "Err"  "Err"  TRUE   TRUE    TRUE
> rle_len       2      "Err"  "Err"  "Err"  "Err"   "Err"
>
> Alas, this tabulation, rather than clarifying things for me, just
> confused me more -- the diverse treatment of sequences by various
> operations is all rather bewildering.
But you are asking lots of different questions, so of course you should
get different answers. For example, the first three rows are behaving
exactly as documented. (Perhaps the functions should have been designed
differently, but a pretty-looking matrix isn't an argument for that.
Give some examples of how the documented behaviour is causing problems.)
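
As a small illustration of the documented behaviour behind those first three rows, here is a sketch using three of the six types. The variable names are illustrative and not taken from the original post; the rules quoted in the comments are the ones stated in ?is.vector and ?is.list.

```r
# Three of the sequence types from the table, all built from 1:2.
x_num   <- as.numeric(1:2)
x_plist <- as.pairlist(1:2)
x_fact  <- as.factor(1:2)

# ?is.vector: TRUE only for a vector with no attributes other than names.
is.vector(x_num)     # TRUE
is.vector(x_fact)    # FALSE: a factor carries "levels" and "class" attributes
is.vector(x_plist)   # FALSE: a pairlist is not a vector type for is.vector()

# ?is.list: TRUE for both lists and pairlists.
is.list(x_plist)     # TRUE
is.pairlist(x_plist) # TRUE
is.list(x_fact)      # FALSE: a factor is an integer vector underneath

# Stripping the attributes from a copy of the factor leaves a plain
# integer vector, which is.vector() accepts.
x_bare <- x_fact
attributes(x_bare) <- NULL
is.vector(x_bare)    # TRUE
```

The factor fails is.vector() only because of its attributes, which is what the help page says should happen.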
I think some of the operations in the later rows are undocumented
(generally pairlists tend not to be documented, even if in some cases
they are supported), and it might make sense to make them more
consistent in the undocumented cases. But it may make more sense to
completely hide pairlists, for instance, and then several more of the
examples are behaving as documented. (BTW, your description of your
last row doesn't match what you did, as far as I can see.)
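
For the later rows, one way to check whether an operation preserves the class of its input is sketched below. This is not the code from the original post, which is not reproduced here; keeps_class() is a hypothetical helper, and reporting failures as "Err" is an assumption modelled on the table.

```r
# Hypothetical helper: does op(x) have the same class as x?
# Returns TRUE or FALSE, or the string "Err" if op(x) signals an error.
keeps_class <- function(op, x) {
  tryCatch(
    identical(class(op(x)), class(x)),
    error = function(e) "Err"
  )
}

x_num  <- as.numeric(1:2)
x_fact <- as.factor(1:2)

keeps_class(rev, x_num)   # reversing a numeric vector
keeps_class(rev, x_fact)  # reversing a factor
keeps_class(c, x_fact)    # c() applied to a factor
```

The answers depend on the version of R in use, so any given run may not reproduce the table above cell for cell.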
> Wouldn't it be easier to teach, learn, and use R if there were more
> consistency in the treatment of sequences?
Which ones in particular should change? What should they change to?
What will break when you do that?
> I understand that in
> long-running projects like S/R, there is an accumulation of
> contributions by a variety of authors, but perhaps the time has come
> for some cleanup at least for the base library?
Generally R core members are reluctant to take on work just because
someone else thinks it would be nice if they did. If you want to do
this, that's one thing, but if you are just saying that it would be nice
if someone else did it, then it's much less likely to get done. To get
someone else to do it you need to convince them that it's a valuable use
of their time, and I don't see that yet.
Duncan Murdoch