[Rd] suggesting a new feature for unique()
Liaw, Andy
andy_liaw at merck.com
Thu Aug 19 21:00:40 CEST 2004
Dear R-devel,
May I suggest that a new feature be added to a couple of unique() methods?
Sometimes it's useful to have the indices of the original data that the
unique elements come from, so that the original data can be recreated from
the unique()ed data. I suggest that an `index' argument be added for
unique. Below is a suggested patch against
R/src/library/base/R/duplicated.R:
*** R-devel/src/library/base/R/duplicated.R Tue Jul 20 12:45:52 2004
--- duplicated.R Thu Aug 19 14:50:38 2004
***************
*** 34,50 ****
## NB unique.default is used by factor to avoid unique.matrix,
## so it needs to handle some other cases.
! unique.default <- function(x, incomparables = FALSE, ...)
{
if(!is.logical(incomparables) || incomparables)
.NotYetUsed("incomparables != FALSE")
z <- .Internal(unique(x))
if(is.factor(x))
! factor(z, levels = seq(len=nlevels(x)), labels = levels(x),
! ordered = is.ordered(x))
else if(inherits(x, "POSIXct") || inherits(x, "Date"))
! structure(z, class=class(x))
! else z
}
unique.data.frame <- function(x, incomparables = FALSE, ...)
--- 34,51 ----
## NB unique.default is used by factor to avoid unique.matrix,
## so it needs to handle some other cases.
! unique.default <- function(x, incomparables = FALSE, index=FALSE, ...)
{
if(!is.logical(incomparables) || incomparables)
.NotYetUsed("incomparables != FALSE")
z <- .Internal(unique(x))
if(is.factor(x))
! z <- factor(z, levels = seq(len=nlevels(x)), labels = levels(x),
! ordered = is.ordered(x))
else if(inherits(x, "POSIXct") || inherits(x, "Date"))
! z <- structure(z, class=class(x))
! if (index) attr(z, "index") <- match(x, z)
! z
}
unique.data.frame <- function(x, incomparables = FALSE, ...)
***************
*** 55,61 ****
}
unique.matrix <- unique.array <-
! function(x, incomparables = FALSE , MARGIN = 1, ...)
{
if(!is.logical(incomparables) || incomparables)
.NotYetUsed("incomparables != FALSE")
--- 56,62 ----
}
unique.matrix <- unique.array <-
! function(x, incomparables = FALSE , MARGIN = 1, index=FALSE, ...)
{
if(!is.logical(incomparables) || incomparables)
.NotYetUsed("incomparables != FALSE")
***************
*** 66,70 ****
args <- rep(alist(a=), ndim)
names(args) <- NULL
args[[MARGIN]] <- !duplicated(as.vector(temp))
! do.call("[", c(list(x=x), args, list(drop=FALSE)))
}
--- 67,76 ----
args <- rep(alist(a=), ndim)
names(args) <- NULL
args[[MARGIN]] <- !duplicated(as.vector(temp))
! res <- do.call("[", c(list(x=x), args, list(drop=FALSE)))
! if (index) {
! resTemp <- apply(res, MARGIN, function(x) paste(x, collapse =
"\r"))
! attr(res, "index") <- match(temp, resTemp)
! }
! res
}
An example usage:
> x <- sample(5, 10, rep=T)
> x
[1] 4 2 5 3 2 3 4 2 2 3
> z <- unique(x, index=TRUE)
> z[attr(z, "index")] == x
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
> x <- factor(x)
> z <- unique(x, index=TRUE)
> z
[1] 4 2 5 3
Levels: 2 3 4 5
> z[attr(z, "index")] == x
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[I have not tried adding the same thing for the unique.data.frame method,
but that shouldn't be too hard...]
Best,
Andy
Andy Liaw, PhD
Biometrics Research PO Box 2000, RY33-300
Merck Research Labs Rahway, NJ 07065
mailto:andy_liaw at merck.com 732-594-0820
More information about the R-devel
mailing list