[Rd] vector labels are not permuted properly in a call to sort() (R 2.1)
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed Oct 5 16:05:12 CEST 2005
The main problem is that R is inconsistent here. There are lots of
branches through the sort() code. Greg showed one. Here are four more:
> sort(y, method="quick")
  [,1] [,2]
A    1    5
B    2    6
C    3    7
D    4    8
> names(y) <- letters[1:8]
> sort(y)
h g f e d c b a
1 2 3 4 5 6 7 8
> sort(y, method="quick")
  [,1] [,2]
A    1    5
B    2    6
C    3    7
D    4    8
attr(,"names")
[1] "h" "g" "f" "e" "d" "c" "b" "a"
> sort(y, partial=4)
  [,1] [,2]
A    1    5
B    2    6
C    3    7
D    4    8
attr(,"names")
[1] "a" "b" "c" "d" "e" "f" "g" "h"
I believe Svr4 does keep names but does not allow names on matrices.
There are other problems: should sorting a time series preserve the ts
properties? (Probably not, but it does.) Should (S3 or S4) class
information be preserved (it seems inappropriate for a time series, for
example)?
The path of least resistance here is to always preserve attributes and
to document that we do so. Probably the most S-compliant solution is to
preserve only names (and sort them as now).
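
As a rough illustration of the names-only option, here is a minimal
sketch. sort_names_only() is a hypothetical helper written for this
note, not part of R, and it makes no claim about how sort() is or will
be implemented. It assumes an atomic input, drops every attribute
except names, and permutes the names together with the values.

```r
# Hypothetical helper (not part of base R): sort an atomic vector or
# matrix, keeping only the names attribute and permuting it with the values.
sort_names_only <- function(x, decreasing = FALSE) {
  nm <- names(x)            # capture names before they are stripped
  v  <- as.vector(x)        # drops dim, dimnames, names, class, tsp, ...
  o  <- order(v, decreasing = decreasing)
  out <- v[o]
  if (!is.null(nm)) names(out) <- nm[o]
  out                       # note: unlike sort(), NAs are kept (placed last)
}

# The matrix from this thread, with names added on top of dim/dimnames.
y <- matrix(8:1, 4, 2, dimnames = list(LETTERS[1:4], NULL))
names(y) <- letters[1:8]

res <- sort_names_only(y)
print(res)
```

With the y used above, res is a plain named vector: the values 1 to 8
labelled h, g, ..., a, with no dim or dimnames left over.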
David James quotes the Blue Book, but note that S itself no longer follows
the principle stated there.
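
Whatever sort() ends up doing with attributes, a user who wants the
row labels of a one-column matrix to travel with the values can permute
whole rows via order() instead. The following is only a sketch using
Greg's data (quoted below); the names M and M_sorted are made up for
the illustration.

```r
# Greg's data: a named vector turned into a one-column matrix.
Y <- rev(1:8)
names(Y) <- LETTERS[1:8]
M <- as.matrix(Y)          # 8 x 1 matrix with row names A..H

# Permute entire rows by the ordering of the values, so that each
# row name stays attached to its value. drop = FALSE keeps the
# result a matrix rather than collapsing it to a vector.
M_sorted <- M[order(M[, 1]), , drop = FALSE]
print(M_sorted)
```

Here M_sorted has the values 1 to 8 with row names H down to A, which is
the labelling shown for as.matrix(sort(Y)) in the original report.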
On Wed, 5 Oct 2005, Martin Maechler wrote:
>>>>>> "AndyL" == Liaw, Andy <andy_liaw at merck.com>
>>>>>> on Tue, 4 Oct 2005 13:51:11 -0400 writes:
>
> AndyL> The `problem' is that sort() does not do anything special when given
> AndyL> a matrix: it only treats it as a vector. After sorting, it copies
> AndyL> attributes of the original input to the output. Since dimnames are
> AndyL> attributes, they get copied as is.
>
> exactly. Thanks Andy.
>
> And I think users would want this (copying of attributes) in
> many cases; in particular for user-created attributes
>
> ?sort really talks about sorting of vectors and factors;
> and it doesn't mention attributes explicitly at all
> {which should probably be improved}.
>
> One could wonder if R should keep the dim & dimnames
> attributes for arrays and matrices.
> S-plus (6.2) simply drops them {returning a bare unnamed vector}
> and that seems pretty reasonable to me.
>
> At least the user would never make the wrong assumptions that
> Greg made about ``matrix sorting''.
>
>
> AndyL> Try:
>
> >> y <- matrix(8:1, 4, 2, dimnames=list(LETTERS[1:4], NULL))
> >> y
> AndyL>   [,1] [,2]
> AndyL> A    8    4
> AndyL> B    7    3
> AndyL> C    6    2
> AndyL> D    5    1
> >> sort(y)
> AndyL>   [,1] [,2]
> AndyL> A    1    5
> AndyL> B    2    6
> AndyL> C    3    7
> AndyL> D    4    8
>
> AndyL> Notice the row names stay the same. I'd argue that this is the correct
> AndyL> behavior.
>
> AndyL> Andy
>
>
> >> From: Greg Finak
> >>
> >> Not sure if this is the correct forum for this,
>
> yes, R-devel is the proper forum.
> {also since this is really a proposal for a change in R ...}
>
> >> but I've found what I
> >> would consider to be a potentially serious bug to the
> >> unsuspecting user.
> >> Given a numeric vector V with class labels in R, the following calls
> >>
> >> 1.
> >> > sort(as.matrix(V))
> >>
> >> and
> >>
> >> 2.
> >> >as.matrix(sort(V))
> >>
> >> produce different output. The vector is sorted properly in
> >> both cases,
> >> but only 2. produces the correct labeling of the vector. The call to
> >> 1. produces a vector with incorrect labels (not sorted).
> >>
> >> Code:
> >> >X<-c("A","B","C","D","E","F","G","H")
> >> >Y<-rev(1:8)
> >> >names(Y)<-X
> >> > Y
> >> A B C D E F G H
> >> 8 7 6 5 4 3 2 1
> >> > sort(as.matrix(Y))
> >>   [,1]
> >> A    1
> >> B    2
> >> C    3
> >> D    4
> >> E    5
> >> F    6
> >> G    7
> >> H    8
> >> > as.matrix(sort(Y))
> >>   [,1]
> >> H    1
> >> G    2
> >> F    3
> >> E    4
> >> D    5
> >> C    6
> >> B    7
> >> A    8
> >>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595