[R] Identifying duplicate rows?
Petr Savicky
savicky at cs.cas.cz
Mon Sep 10 21:06:51 CEST 2012
On Mon, Sep 10, 2012 at 11:23:42AM -0700, kborgmann wrote:
> Hi,
> I am trying to identify duplicate values in a column in a data frame. The
> duplicated function identifies the duplicate rows in the data frame but it
> only does this for the second record, not both records. Is there a way to
> mark both rows in the data frame as TRUE?
> dfA$dups<-duplicated(dfA$Value)
> dfA
> Site State Value dups
> 929 VA 73 FALSE
> 929 VA 73 TRUE
> 930 VA 76 FALSE
> 930 VA 76 TRUE
> 931 VA 74 FALSE
> 932 VA 75 FALSE
>
> But I would like this
> Site State Value dups
> 929 VA 73 TRUE
> 929 VA 73 TRUE
> 930 VA 76 TRUE
> 930 VA 76 TRUE
> 931 VA 74 FALSE
> 932 VA 75 FALSE
Hi.
Try the following.
dfA <- cbind(State="VA", data.frame(Value=c(73, 73, 76, 76, 74, 75)))
dfA$dups <- duplicated(dfA$Value) | duplicated(dfA$Value, fromLast=TRUE)
dfA
  State Value  dups
1    VA    73  TRUE
2    VA    73  TRUE
3    VA    76  TRUE
4    VA    76  TRUE
5    VA    74 FALSE
6    VA    75 FALSE
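To see why combining the two calls works, here is a minimal sketch that splits the expression into its forward and backward passes, using the same example values. The variable names (fwd, bwd, dups, dups2) are only illustrative, and the %in% variant is an alternative formulation added for comparison, not part of the suggestion above.

```r
# Same example values as in the data frame above
Value <- c(73, 73, 76, 76, 74, 75)

# Forward pass: flags the second and later occurrences of a value
fwd <- duplicated(Value)

# Backward pass: scans from the end, so it flags every occurrence
# except the last one
bwd <- duplicated(Value, fromLast = TRUE)

# Any value that occurs more than once is flagged by at least one pass
dups <- fwd | bwd

# Alternative (illustrative): flag every element whose value is among
# the values that duplicated() reports as repeated
dups2 <- Value %in% Value[duplicated(Value)]

print(data.frame(Value, fwd, bwd, dups, dups2))
```

Both forms should give the same logical vector for this input. The two-pass version is probably the easier one to apply to several columns at once, since duplicated() also accepts a data frame.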
Hope this helps.
Petr Savicky.