[R-sig-Geo] Help to eliminate duplicated from data.frame but Special Problem

Jon Olav Skoien jon.skoien at jrc.ec.europa.eu
Wed Mar 9 15:52:40 CET 2011


Hi Gianni,

From the example, it seems you want to check whether value1 is 
duplicated, not Id:
 > my.df[!duplicated(my.df$value1),]
You can also remove duplicated rows based on the values of more than one 
column:
 > my.df[!duplicated(my.df[,c("Id","value1")]),]
Does either of these do what you want?
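
To check the two calls against your example, here is a minimal sketch that 
rebuilds my.df from your message and compares the rows that are kept with 
the rows you listed as the desired output. The object names by.value1, 
by.pair and wanted are only illustrative:

```r
# Example data copied from the original question
my.df <- data.frame(Id     = c(1, 2, 3, 4, 5, 5, 6, 7, 8, 8, 8, 9),
                    value1 = c(10, 20, 30, 40, 50, 50, 60, 70, 80, 80, 81, 90),
                    value2 = c(100, 200, 300, 400, 500, 500, 600, 700, 800, 800, 799, 900))

# Keep the first row for each distinct value of value1
by.value1 <- my.df[!duplicated(my.df$value1), ]

# Keep the first row for each distinct (Id, value1) combination
by.pair <- my.df[!duplicated(my.df[, c("Id", "value1")]), ]

# Row names of the desired result, as listed in the question
wanted <- c("1", "2", "3", "4", "5", "7", "8", "9", "11", "12")

# Both approaches keep exactly the wanted rows for this data set
identical(rownames(by.value1), wanted)
identical(rownames(by.pair), wanted)

# Row 11 shares Id 8 with row 9, so testing Id alone flags it as a duplicate
duplicated(my.df$Id)[11]
```

The two calls agree here only because value1 happens to identify the rows 
you want to keep. If the same value1 could occur under different Id values, 
the two-column version is probably the safer choice.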

Cheers,
Jon


On 3/9/2011 3:42 PM, gianni lavaredo wrote:
> Dear Researcher,
> I need to solve the following problem. I wish to delete duplicate rows from
> a data.frame, but not all duplicate rows:
>
>
> ex:
>
> my.df<- data.frame(Id=c(1,2,3,4,5,5,6,7,8,8,8,9),
> value1=c(10,20,30,40,50,50,60,70,80,80,81,90),
> value2=c(100,200,300,400,500,500,600,700,800,800,799,900))
>
>
>> my.df
>     Id value1 value2
> 1   1     10    100
> 2   2     20    200
> 3   3     30    300
> 4   4     40    400
> 5   5     50    500
> 6   5     50    500
> 7   6     60    600
> 8   7     70    700
> 9   8     80    800
> 10  8     80    800
> 11  8     81    799
> 12  9     90    900
>
>
> The result I want, with the duplicates eliminated, is:
>
>> my.df
>     Id value1 value2
> 1   1     10    100
> 2   2     20    200
> 3   3     30    300
> 4   4     40    400
> 5   5     50    500
> 7   6     60    600
> 8   7     70    700
> 9   8     80    800
> 11  8     81    799
> 12  9     90    900
>
> but if I use
>
> xx<-  my.df[!duplicated( my.df$Id), ]
>
> my result is
>
>> xx
>     Id value1 value2
> 1   1     10    100
> 2   2     20    200
> 3   3     30    300
> 4   4     40    400
> 5   5     50    500
> 7   6     60    600
> 8   7     70    700
> 9   8     80    800
> 12  9     90    900
>
>
> thanks in advance
> Gianni