[R] Re: Remove duplicated rows

Fri Apr 23 12:00:14 CEST 2010

Hi
r-help-bounces at r-project.org wrote on 23.04.2010 04:05:00:

> 
> Hi all,
> 
> I have a dataset similar to the following
> 
> Name   Date   Value
> A   1/01/2000   4
> A   2/01/2000   4
> A   3/01/2000   5
> A   4/01/2000   4
> A   5/01/2000   1
> B   6/01/2000   2
> B   7/01/2000   1
> B   8/01/2000   1
> 
> I would like R to remove duplicates based on column 1 and 3 only. In
> addition, I would like R to remove duplicates based on the underlying and
> overlying row only. For example, for A, I would like to remove row 2 only
> and keep row 1, 3 and 4.

Hm. Strange. You want to keep lines 1, 3 and 4 for A. What about line 5?
Why do you want to keep both line 1 and line 4, which have A and 4 in both columns?

test <- read.table("clipboard", header = TRUE)
test[!duplicated(paste(test[, 1], test[, 3])), ]
  Name      Date Value
1    A 1/01/2000     4
3    A 3/01/2000     5
5    A 5/01/2000     1
6    B 6/01/2000     2
7    B 7/01/2000     1

This gives you the unique combinations of Name and Value; however, I am not sure that is what you want.
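
If the intent is instead to drop a row only when it repeats the Name and Value of the row directly above it (which would match "remove row 2 only and keep row 1, 3 and 4"), something along these lines might be closer. This is only a sketch under that assumption: the data frame is retyped by hand from the question, the helper name same_as_prev is made up for illustration, and it assumes at least one row.

```r
# Example data retyped from the question; `test` follows the name used above
test <- data.frame(
  Name  = c("A", "A", "A", "A", "A", "B", "B", "B"),
  Date  = c("1/01/2000", "2/01/2000", "3/01/2000", "4/01/2000",
            "5/01/2000", "6/01/2000", "7/01/2000", "8/01/2000"),
  Value = c(4, 4, 5, 4, 1, 2, 1, 1),
  stringsAsFactors = FALSE
)

n <- nrow(test)

# TRUE where a row has the same Name and Value as the row directly above it;
# the first row has no predecessor, so it is never flagged
same_as_prev <- c(FALSE,
                  test$Name[-1] == test$Name[-n] &
                    test$Value[-1] == test$Value[-n])

# Keep everything that is not a repeat of its immediate predecessor
result <- test[!same_as_prev, ]
result
```

On the example data this drops rows 2 and 8 and keeps rows 1, 3, 4, 5, 6 and 7. Unlike duplicated(), row 4 survives because it differs from row 3, even though it has the same Name and Value as row 1.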

Regards
Petr

> 
> I have tried: unique() and replicated(), but I do not have much success. I
> have also tried: dataset<-c(1,diff(dataset)!=0), but I don't know how to
> apply it to this multi-column situation.
> 
> Any help would be greatly appreciated.
> 
> Thanks in advance,
> Chris