Re: [Rd] Proposal: Generalizing unique() and duplicated()

Prof Brian Ripley ripley@stats.ox.ac.uk
Tue, 6 Feb 2001 12:49:28 +0000 (GMT)


Method dispatch is far from free (it is quite slow). Do we want to
encumber unique() (a fast internal function) in this way?

There are better ways to do this if one is going to use C code:
converting to character and comparing long strings are both expensive.
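
The message does not say which alternatives are meant. As one possible
illustration, written in R rather than C, duplicated rows can be found
without building any strings by sorting the rows and comparing each row
with its predecessor. The helper name dup_rows_sorted is hypothetical,
and the sketch assumes a matrix with no NA values.

```r
# Hypothetical helper, not part of base R or of the proposal quoted below.
# Assumes `mat` is a matrix without NA values.
dup_rows_sorted <- function(mat) {
  n <- nrow(mat)
  if (n < 2L) return(logical(n))
  # Sort keys: each column in turn, then the original row index, so that
  # within a group of equal rows the first occurrence sorts first.
  keys <- c(lapply(seq_len(ncol(mat)), function(j) mat[, j]),
            list(seq_len(n)))
  ord <- do.call(order, keys)
  sorted <- mat[ord, , drop = FALSE]
  # A sorted row is a duplicate if it equals the row just before it.
  same_as_prev <- rowSums(sorted[-1L, , drop = FALSE] !=
                          sorted[-n, , drop = FALSE]) == 0L
  dup <- logical(n)
  dup[ord[-1L]] <- same_as_prev
  dup
}

m <- rbind(c(1, 2), c(3, 4), c(1, 2), c(3, 4), c(5, 6))
dup_rows_sorted(m)   # rows 3 and 4 repeat rows 1 and 2
```

Whether this beats the string-based version depends on the data; it is
meant only to show that row comparison does not require character
conversion.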


On Tue, 6 Feb 2001, Kaspar Pflugshaupt wrote:

> On Tuesday 06 February 2001 12:36, Dr. Jens Oehlschlägel wrote:
> > I like the idea. Why don't you call duplicated.matrix() directly in
> > unique.matrix() and duplicated.data.frame() in unique.data.frame() ?
> >
> > Jens Oehlschlägel
>
> Good point. I guess I got carried away with using methods (having just gotten
> the hang of the concept). :-)
>
> Anyway, here's a corrected version:
>
> ----------------------------------------------------
>
> "unique.default" <- get("unique", pos="package:base")    # old version becomes
>                                                          # default behaviour
> "unique" <- function(object, ...)
> {
>    if (data.class(object)=="matrix")
>        return(unique.matrix(object, ...))
>    else
>        UseMethod("unique")      # doesn't seem to work for matrices, hence
> }                               # the condition
>                          
>
>
> "duplicated.default" <- get("duplicated", pos="package:base")   
>
> "duplicated" <- function(object, ...)
> {
>    if (data.class(object)=="matrix")
>        return(duplicated.matrix(object, ...))
>    else
>        UseMethod("duplicated")  
> }
>
>
> "duplicated.matrix" <-
>   function(mat, MARGIN=1)    # defaulting to work on rows
> {
>   strvect <- drop(apply(mat, MARGIN, function(x) paste(x, collapse = "\r")))
>   return(duplicated(strvect))
> }
>
>
> "unique.matrix" <-
>   function(mat, MARGIN=1)    # defaulting to work on rows
> {
>   dup <- duplicated.matrix(mat, MARGIN)
>   # drop=FALSE keeps the result a matrix even if one row/column remains
>   return(if (MARGIN==1) mat[!dup, , drop=FALSE] else mat[, !dup, drop=FALSE])
> }
>
>
> "duplicated.data.frame" <-
>   function(df, MARGIN=1)
> {
>   strvect <- drop(apply(as.matrix(df), MARGIN, function(x) paste(x, collapse = "\r")))
>   duplicated(strvect)
> }
>
>
> "unique.data.frame" <-
>   function(df, MARGIN=1)
> {
>   dup <- duplicated.data.frame(df, MARGIN)
>   # drop=FALSE keeps the result a data frame even if one column remains
>   return(if (MARGIN==1) df[!dup, , drop=FALSE] else df[, !dup, drop=FALSE])
> }
>
> ----------------------------------------------------
>
> Cheers
>
> Kaspar Pflugshaupt
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
>
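
For reference, the string-key idea in the quoted duplicated.matrix() can
be tried on a small matrix without redefining unique() or duplicated().
This is only a minimal sketch; the variable names and the example data
are illustrative and not taken from the thread.

```r
# Example data (illustrative): row 3 repeats row 1.
m <- rbind(c(1, 2), c(3, 4), c(1, 2))

# Collapse each row into one string, as the quoted code does, and let the
# existing vector version of duplicated() compare the strings.
keys <- apply(m, 1, function(x) paste(x, collapse = "\r"))
dup <- duplicated(keys)

# Keep the non-duplicated rows; drop = FALSE preserves the matrix shape.
kept <- m[!dup, , drop = FALSE]

# Without drop = FALSE, a result with a single remaining row collapses
# to a plain vector and loses its dimensions.
two_equal <- rbind(c(1, 2), c(1, 2))
keys2 <- apply(two_equal, 1, function(x) paste(x, collapse = "\r"))
collapsed <- two_equal[!duplicated(keys2), ]
is.null(dim(collapsed))
```

The sketch also shows where the cost comes from: every row is converted
to character and pasted into a single string before any comparison
happens.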

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
