[Rd] Row subsetting of data frames (PR#425)

Prof Brian Ripley <ripley@stats.ox.ac.uk>
Wed, 9 Feb 2000 18:04:06 +0000 (GMT)


> From: plummer@iarc.fr
> Date: Wed, 9 Feb 2000 18:38:12 +0100 (MET)
> To: r-devel@stat.math.ethz.ch
> Subject: [Rd] Row subsetting of data frames (PR#425)
> CC: R-bugs@biostat.ku.dk
> X-Loop: R-bugs@biostat.ku.dk
> 
> If you want to use row names to take a row subset of a data.frame then
> there is a bug when
> - One row has a name which is a completion of another row name
> - The shorter name comes after the longer one
> - You want to retrieve the row with the shorter name
> 
> An example:
> 
> R> x <- matrix(1:4, 2, 2, dimnames=list(c("abc","ab"), c("cde","cd")))
> R> x
>     cde cd
> abc   1  3
> ab    2  4
> R> x["ab",]                       #Works OK for matrices
> cde  cd 
>   2   4 
> R> y <- as.data.frame(x)
> R> y["ab",]                       #but not for a data frame
>     cde cd
> abc   1  3

The problem boils down to

>  pmatch("ab", c("abc", "ab"), duplicates.ok = T)
[1] 1

and the code expects 2, which is what S gives.  R's description of

duplicates.ok: should duplicate matches be allowed?

     If there are multiple matches the result depends on the value of
     `duplicates.ok'. If this is false multiple matches will result in
     the value of `nomatch' being returned, and if it is true, the
     index of the first matching value will be returned.

is different from S, where the argument refers to allowing duplicates
in x.  So in S

> pmatch(rep("ab",3), c("abc", "ab"), duplicates.ok = T)
[1] 2 2 2
> pmatch(rep("ab",3), c("abc", "ab"), duplicates.ok = F)
[1]  2  1 NA
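
To make the S behaviour above concrete, here is a minimal sketch of one
reading of it: an exact match wins, otherwise a unique partial match is
used, and with duplicates.ok = FALSE a table entry is used at most once.
The function name pmatch_sketch is made up for illustration; this is not
the code of pmatch in either R or S, and it works through x in order,
which may differ from the real implementations in corner cases.

```r
# Hypothetical helper (not part of R or S): exact matches win,
# then unique partial matches; NA when nothing suitable is left.
pmatch_sketch <- function(x, table, duplicates.ok = FALSE) {
  used <- logical(length(table))          # table entries already consumed
  out <- rep(NA_integer_, length(x))
  for (k in seq_along(x)) {
    # With duplicates.ok = TRUE every table entry stays available.
    avail <- if (duplicates.ok) rep(TRUE, length(table)) else !used
    exact <- which(avail & table == x[k])
    if (length(exact) > 0) {
      hit <- exact[1]
    } else {
      # Partial match: x[k] is a prefix of exactly one available entry.
      partial <- which(avail & substr(table, 1, nchar(x[k])) == x[k])
      hit <- if (length(partial) == 1) partial else NA_integer_
    }
    out[k] <- hit
    if (!is.na(hit)) used[hit] <- TRUE
  }
  out
}

res_dup   <- pmatch_sketch(rep("ab", 3), c("abc", "ab"), duplicates.ok = TRUE)
res_nodup <- pmatch_sketch(rep("ab", 3), c("abc", "ab"), duplicates.ok = FALSE)
```

Under these assumptions res_dup and res_nodup reproduce the two S
results quoted above, and a single "ab" maps to index 2, which is what
the data frame code expects.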

A quick fix is to change [.data.frame to match row names exactly, to give

        if (is.character(i))
            i <- sapply(i, function(x) match(x, rows))

but I think we should make pmatch S-compatible.
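
The quick fix can be tried on its own, outside [.data.frame.  In this
small sketch the values of rows and i are made up for illustration
(rows mirrors the row names in the report, and "zz" is a name that is
not present):

```r
rows <- c("abc", "ab")        # row names, longer name first as in the report
i <- c("ab", "abc", "zz")     # requested names; "zz" is deliberately absent

# The proposed replacement: exact matching, one name at a time.
# sapply() keeps the requested names as names of the result.
idx <- sapply(i, function(x) match(x, rows))

# match() is vectorised, so this gives the same indices without names.
idx_direct <- match(i, rows)
```

Here "ab" maps to 2 and "abc" to 1, while the absent name gives NA.
The price of exact matching is that abbreviated row names no longer
select anything, which is one reason to prefer fixing pmatch itself.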

Brian

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
