[R] Odp: Fwd: duplicates
Petr PIKAL
petr.pikal at precheza.cz
Thu Jul 29 16:54:23 CEST 2010
Hi
Here is a rather complicated one-liner, assuming your data frame is named test:
do.call(rbind,lapply(split(test,test$var1), function(x)
x[which.max(x[,"var2"]),]))
Here it is in three lines:
# split the data frame into one piece per value of var1
test.s <- split(test, test$var1)
# in each piece, select the row with the maximum value of var2
result <- lapply(test.s, function(x) x[which.max(x[,"var2"]),])
# put everything into one data frame again
do.call(rbind, result)
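For anyone who wants to try this, here is a minimal self-contained sketch. The data frame is typed in from the table in the quoted question below; the name out is only an illustrative choice and is not part of the original reply.

```r
# Example data copied from the table in the quoted question
test <- data.frame(
  var1 = c(1, 1, 1, 2, 2, 2, 3, 3, 4),
  var2 = c(4, 3, 8, 2, 6, 9, 1, 2, 8),
  var3 = c(500, 200, 125, 120, 22, 400, 100, 200, 20),
  var4 = c(1, 2, 1, 2, 1, 1, 2, 5, 1),
  var5 = c(2, 5, 9, 52, 20, 22, 8, 40, 60)
)

# One piece per distinct value of var1
test.s <- split(test, test$var1)

# From each piece keep the single row where var2 is largest
result <- lapply(test.s, function(x) x[which.max(x[, "var2"]), ])

# Stack the one-row pieces back into a single data frame
out <- do.call(rbind, result)
print(out)
```

Note that which.max() returns only the first maximum, so if two rows in a group tie on var2, only the first of them is kept.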
There could be issues if you have NA values in var1 or var2.
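To make the NA caveat concrete: split() silently drops rows whose grouping value is NA, and which.max() ignores NA values, so a group whose var2 values are all NA contributes no row at all. One possible way to handle this, sketched below, is to drop incomplete rows explicitly before splitting. The data frame test_na and the names ok, clean and out_na are made up for illustration, and whether dropping such rows is appropriate depends on your data.

```r
# Hypothetical data with missing values, for illustration only
test_na <- data.frame(
  var1 = c(1, 1, 2, 2, NA, 3),
  var2 = c(4, NA, NA, NA, 7, 5),
  var3 = c(10, 20, 30, 40, 50, 60)
)

# Keep only rows where both the grouping and the ranking variable are present
ok <- !is.na(test_na$var1) & !is.na(test_na$var2)
clean <- test_na[ok, ]

# Same split / select / recombine steps as before, on the cleaned data
clean.s <- split(clean, clean$var1)
result_na <- lapply(clean.s, function(x) x[which.max(x[, "var2"]), ])
out_na <- do.call(rbind, result_na)
print(out_na)
```

In this sketch group 2 disappears from the result because all of its var2 values are missing; checking sum(!ok) beforehand shows how many rows were dropped.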
Regards
Petr
r-help-bounces at r-project.org wrote on 29.07.2010 16:31:06:
>
>
> -- Original message --
> From: Dévaványai Agamemnón <devavanyai at citromail.hu>
> To: r-hel at r-project.org, r-hel at r-project.org
> Sent: 29 July 2010, 16:29
> Subject: duplicates
>
> Sorry!
> I try it again
>
> Dear R Users!
>
>
> I have a data frame with duplicate cases: values of var1 are repeated,
> with different values of var2.
>
>
>
> var1 var2 var3 var4 var5
> 1 4 500 1 2
> 1 3 200 2 5
> 1 8 125 1 9
> 2 2 120 2 52
> 2 6 22 1 20
> 2 9 400 1 22
> 3 1 100 2 8
> 3 2 200 5 40
> 4 8 20 1 60
>
> I want to delete the duplicates in var1 that have a low rank in var2 and
> keep the case that has the highest rank in var2. I would like to keep the
> whole row (with the other variables):
>
> var1 var2 var3 var4 var5
> 1 8 125 1 9
> 2 9 400 1 22
> 3 2 200 5 40
> 4 8 20 1 60
>
> Thanks Ag
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.