[R] Problems using unique function and !duplicated
JonC
jon_d_cooke at yahoo.co.uk
Mon Feb 28 16:51:17 CET 2011
Hi, I am trying to remove rows from a small R data.frame that are duplicated
across two or more variables simultaneously. For those familiar with SAS, I am
trying to reproduce the behaviour of a Proc Sort with Nodupkey.
Here's my example data :
test <- read.csv("test.csv", sep=",", as.is=TRUE)
> test
date var1 var2 num1 num2
1 28/01/11 a 1 213 71
2 28/01/11 b 1 141 47
3 28/01/11 c 2 867 289
4 29/01/11 a 2 234 78
5 29/01/11 b 2 666 222
6 29/01/11 c 2 912 304
7 30/01/11 a 3 417 139
8 30/01/11 b 3 108 36
9 30/01/11 c 2 288 96
I am trying to obtain the following, where duplicates of date AND var2 are
removed from the above data.frame.
date var1 var2 num1 num2
28/01/2011 a 1 213 71
28/01/2011 c 2 867 289
29/01/2011 a 2 234 78
30/01/2011 c 2 288 96
30/01/2011 a 3 417 139
If I use !duplicated() with one variable everything works fine.
However, I wish to remove duplicates of both date and var2.
test[!duplicated(test$date),]
date var1 var2 num1 num2
1 0011-01-28 a 1 213 71
4 0011-01-29 a 2 234 78
7 0011-01-30 a 3 417 139
test2 <- test[!duplicated(test$date),!duplicated(test$var2),]
Error in `[.data.frame`(test, !duplicated(test$date),
!duplicated(test$var2), : undefined columns selected
This gives the error shown above.
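For anyone hitting the same error: in test[i, j, ] the second index selects columns, so a length-9 logical vector is being matched against 5 columns, which would explain "undefined columns selected". One possible workaround, offered as a sketch rather than a definitive answer, is to pass both key columns to duplicated() as a data.frame so that rows are compared on the date/var2 pair. The data.frame() call below simply retypes the example data instead of reading test.csv, and the name keep and the final ordering step (added to mimic the sorted output of Proc Sort) are illustrative assumptions.

```r
# Example data retyped from the listing above (test.csv itself is not available)
test <- data.frame(
  date = rep(c("28/01/11", "29/01/11", "30/01/11"), each = 3),
  var1 = rep(c("a", "b", "c"), times = 3),
  var2 = c(1, 1, 2, 2, 2, 2, 3, 3, 2),
  num1 = c(213, 141, 867, 234, 666, 912, 417, 108, 288),
  num2 = c(71, 47, 289, 78, 222, 304, 139, 36, 96),
  stringsAsFactors = FALSE
)

# duplicated() on a data.frame compares whole rows, so restricting it to the
# two key columns flags rows whose (date, var2) pair has already been seen
keep <- !duplicated(test[, c("date", "var2")])

# Row index only; the column index is left empty so all columns are kept
test2 <- test[keep, ]

# Optional: sort by date then var2, parsing the dd/mm/yy strings as dates
test2 <- test2[order(as.Date(test2$date, format = "%d/%m/%y"), test2$var2), ]

print(test2)
```

Note that unique(test[, c("date", "var2")]) would return only the two key columns, so indexing with !duplicated() seems the better fit when the other columns need to be kept.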
I got different errors when using the unique() function.
Can anybody help with this?
Thanks in advance.
Jon
--
View this message in context: http://r.789695.n4.nabble.com/Problems-using-unique-function-and-duplicated-tp3328150p3328150.html
Sent from the R help mailing list archive at Nabble.com.