[Rd] [R] Semantics of sequences in R

Mon Feb 23 12:27:57 CET 2009

On Mon, 23 Feb 2009 11:31:16 +0100
Wacek Kusnierczyk <Waclaw.Marcin.Kusnierczyk at idi.ntnu.no> wrote:

> Berwin A Turlach wrote:
> > On Mon, 23 Feb 2009 08:52:05 +0100
> > Wacek Kusnierczyk <Waclaw.Marcin.Kusnierczyk at idi.ntnu.no> wrote:
[...]
> >> and you mean that sort.list not being applicable to lists is a)
> >> good design, and b) something that by no means should be fixed,
> >> right? 
> >
> > I neither said nor meant this and I do not see how what I said
> > could be interpreted in such a way.  I was just commenting to
> > Stavros that the example he picked, hoping that it would not break
> > existing code, was actually a bad one which potentially will break
> > a lot (?) of existing code. 
> 
> would it, really?  if sort.list were, in addition to sorting atomic
> vectors (can-be-considered-lists), able to sort lists, how likely
> would this be to break old code?  

Presumably not very likely.

> can you give one concrete example, and suggest how to estimate how
> much old code would involve the same issue?

Check out the svn source of R, run configure, make whatever change you
want to sort.list, then run "make" and "make check FORCE=FORCE".  That
should give you an idea of how much would break.

Additionally, you could try to install all CRAN packages with your
modified version and see how many of them break when their
examples/demos/&c are run.

AFAIK, Brian is doing something like this on his machine.  I am sure
that if you ask nicely he will share his scripts with you.

If this sounds too time consuming, you might just want to unpack the
sources and grep for "sort.list" in all .R files;  I am sure you know
how to use find and grep to do this.
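
For what it is worth, the same search can be sketched from within R
instead of with find and grep.  The helper name find_sort_list_uses and
its src_dir argument are made up for this illustration; the match is a
plain substring search, so it also picks up comments and the definition
of sort.list itself, and gives only a rough upper bound on real uses.

```r
# Hypothetical helper: report every line in the .R files below src_dir
# that contains the literal text "sort.list".
find_sort_list_uses <- function(src_dir) {
  r_files <- list.files(src_dir, pattern = "\\.R$", recursive = TRUE,
                        full.names = TRUE)
  hits <- lapply(r_files, function(f) {
    lines <- readLines(f, warn = FALSE)
    # fixed = TRUE: treat the dot literally, not as a regex wildcard
    idx <- grep("sort.list", lines, fixed = TRUE)
    if (length(idx) == 0) return(NULL)
    data.frame(file = f, line = idx, text = lines[idx],
               stringsAsFactors = FALSE)
  })
  # One row per matching line; NULL if nothing matched
  do.call(rbind, hits)
}
```

Calling find_sort_list_uses() on the top-level directory of an unpacked
source tree would then give one row per occurrence, which can be
inspected by hand.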

> > Also, until reading Patrick Burns' "The R Inferno" I was not aware
> > of sort.list.  That function had not registered with me since I
> > hardly used it.  
> 
> which hints that "potentially will break a lot (?) of existing code"
> is a rather unlikely event.

Only for code that I wrote; other people's needs and knowledge of R may
vary.

> > And I also have no need of calling sort() on lists.  For me a
> > list is a flexible enough data structure such that defining a
> > sort() command for them makes no sense; it could only work in very
> > specific circumstances.
> >   
> 
> i don't understand the first part:  "flexible enough data structure
> such that defining a sort() command for them makes no sense" makes no
> sense.

Lists are very flexible structures whose components need not be of the
same type.  So how do you want to compare components?  How do you
compare a vector of numbers to a vector of character strings?  Or a
list of lists?

Or should the sorting be on the length of the components?  Or their
names?  Or should sort(myList) sort each component of myList?  But for
that case we have already lapply(myList, sort).
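
A small illustration of the point, using a made-up list (the names
myList, nums, chars and nested are only for this example): the
components have different types, so no single ordering between them
suggests itself, while sorting within each atomic component is already
covered by lapply().

```r
# A list whose components are of different types
myList <- list(nums = c(3, 1, 2), chars = c("b", "a"), nested = list(1, "x"))

# The storage type of each component differs
types <- sapply(myList, typeof)

# Sorting *within* the atomic components works via lapply()
sorted <- lapply(myList[c("nums", "chars")], sort)
```

Here sorted$nums and sorted$chars are each sorted on their own; the
order of the components within the list is left untouched.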

> as to "it could only work in very specific circumstances" -- no, it
> would work for any list whatsoever, provided the user has a correctly
> implemented comparator.  for example, i'd like to sort a list of
> vectors by the vectors' length -- is this a very exotic idea?

No, if that is what you want.  And I guess it is one way of sorting a
list.  The question is what should be the default way?  
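
Sorting by length, for instance, needs no change to sort() or
sort.list(); a sketch using order() follows.  The helper sort_list_by
and its key argument are invented for this example, and it takes a key
function that returns one number per component rather than a pairwise
comparator, which is a simplification of what was asked for.

```r
# Hypothetical helper: reorder the components of a list by a numeric key
# computed from each component (by default, its length).
sort_list_by <- function(x, key = length, decreasing = FALSE) {
  keys <- vapply(x, key, numeric(1))
  x[order(keys, decreasing = decreasing)]
}

vecs <- list(a = 1:5, b = c("x", "y"), c = 7)
by_length <- sort_list_by(vecs)                      # shortest first
by_length_desc <- sort_list_by(vecs, decreasing = TRUE)  # longest first
```

Any other ordering (by name, by first element, ...) would be obtained
by passing a different key, which is exactly why it is not obvious what
the default should be.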

> > BTW, as I mentioned once before, you might want to consider losing
> > these chips on your shoulders.
> >   
> 
> berwin, it's been a tradition on this list to discourage people from
> commenting on the design and implementation of r whenever they think
> it's wrong.  

I am not aware of any such tradition and I subscribed to R-help on 15
April 1998.  

The point is rather that by commenting alone one will not achieve much,
in particular if the comments look more like complaints and the same
comments are made again and again (along with dragging up previous
comments or comments received on previous comments).

R is open source.  Check out the svn version, fix what you consider
needs fixing, submit a patch, convince R core that the patch fixes a
real problem/is an improvement/does not break too much.  Then you have
a better chance of achieving something.

Alternatively, if it turns out that something that bugs you cannot be
changed without breaking too much existing code, start from scratch
with a better design.  Apparently the GAP project
(http://www.gap-system.org/) is doing something like this, as
someone closely associated with that project once told me.  While
developing a version of GAP they collect information on how to improve
the design, data structures &c; then, at some point, they start to
write the next version from scratch.

> >> scary!  it's much preferred to confuse new users.
> >
> > I usually learn a lot when I get confused about some issues/concept.
> > Confusion forces one to sit down, think deeply and, thus, gain some
> > understanding.  So I am not so much concerned with new users being
> > confused.  It is, of course, a problem if the new user never comes
> > out of his or her confusion.
> 
> the problem is, r users have to learn lots [...]

Indeed, and I guess in this age of instant gratification that that is a
real bummer for new users.

Best,

	Berwin