[Rd] Anomaly with unique and match
Petr Savicky
savicky at cs.cas.cz
Wed Mar 9 17:00:40 CET 2011
On Wed, Mar 09, 2011 at 08:48:10AM -0600, Terry Therneau wrote:
> I stumbled onto this working on an update to coxph. The last 6 lines
> below are the question, the rest create a test data set.
>
> tmt585% R
> R version 2.12.2 (2011-02-25)
> Copyright (C) 2011 The R Foundation for Statistical Computing
> ISBN 3-900051-07-0
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> # Lines of code from survival/tests/singtest.R
> > library(survival)
> Loading required package: splines
> > test1 <- data.frame(time= c(4, 3,1,1,2,2,3),
> + status=c(1,NA,1,0,1,1,0),
> + x= c(0, 2,1,1,1,0,0))
> >
> > temp <- rep(0:3, rep(7,4))
> >
> > stest <- data.frame(start = 10*temp,
> + stop = 10*temp + test1$time,
> + status = rep(test1$status,4),
> + x = c(test1$x+ 1:7, rep(test1$x,3)),
> + epoch = rep(1:4, rep(7,4)))
> >
> > fit1 <- coxph(Surv(start, stop, status) ~ x * factor(epoch), stest)
>
> ## New lines
> > temp1 <- fit1$linear.predictor
> > temp2 <- as.matrix(temp1)
> > match(temp1, unique(temp1))
> [1] 1 2 3 4 4 5 6 7 7 7 6 6 6 8 8 8 6 6 6 9 9 9 6 6
> > match(temp2, unique(temp2))
> [1] 1 2 3 4 4 5 6 7 7 7 6 6 6 NA NA NA 6 6 6 8 8 8 6 6
>
> -----------------------
>
> I've solved it for my code by not calling match on a 1 column vector.
> In general, however, should I be using some other paradigm for this "map
> to unique" operation? For example match(as.character(x),
> unique(as.character(x)))?
Let me suggest an alternative that is consistent with unique() on
numeric vectors: transform the column with rank() before converting
it to a matrix. For example,

  temp3 <- as.matrix(rank(temp1, ties.method="max"))
  match(temp3, unique(temp3))
  [1] 1 2 3 4 4 5 6 7 7 7 6 6 6 8 8 8 6 6 6 9 9 9 6 6

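The same idea as a self-contained sketch that does not need the survival
data. The helper name map_to_unique and the vector x are made up for
illustration. My understanding, which I have not checked against every R
version, is that the NAs above arise because unique() on a matrix compares
rows after converting them to character strings, so values that differ
only in the last bits are merged, while match() compares the numbers
exactly. Integer ranks survive that conversion unchanged.

```r
# Hypothetical helper (name invented for this sketch): map each element of a
# numeric vector to the index of its first occurrence among the distinct values.
map_to_unique <- function(x) {
  # rank() with ties.method = "max" gives equal values the same integer rank
  # and distinct values different ranks, so the ranks can stand in for x.
  r <- as.matrix(rank(x, ties.method = "max"))
  # match() treats the one-column matrices as plain vectors.
  match(r, unique(r))
}

# Made-up data: 0.1 + 0.2 differs from 0.3 only in the last bits.
x <- c(0.3, 0.1 + 0.2, 1, 0.3, 1)

idx_vec  <- match(x, unique(x))   # exact comparison on the plain vector
idx_rank <- map_to_unique(x)      # same mapping via integer ranks

# Both approaches should agree element by element.
stopifnot(identical(idx_vec, idx_rank))
```

In contrast, match(as.character(x), unique(as.character(x))) would, as far
as I know, treat the first two elements of x as equal, since the character
conversion keeps at most 15 significant digits.
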
Can this be used in your code?
Petr Savicky.