[R] Identifying duplicate rows?

Petr Savicky savicky at cs.cas.cz
Mon Sep 10 21:06:51 CEST 2012


On Mon, Sep 10, 2012 at 11:23:42AM -0700, kborgmann wrote:
> Hi,
> I am trying to identify duplicate values in a column in a data frame.  The
> duplicated function identifies the duplicate rows in the data frame but it
> only does this for the second record, not both records. Is there a way to
> mark both rows in the data frame as TRUE? 
> dfA$dups<-duplicated(dfA$Value)
> dfA
> Site  State  Value  dups
> 929   VA     73     FALSE
> 929   VA     73     TRUE
> 930   VA     76     FALSE
> 930   VA     76     TRUE
> 931   VA     74     FALSE
> 932   VA     75     FALSE
> 
> But I would like this
> Site  State  Value  dups
> 929   VA     73     TRUE
> 929   VA     73     TRUE
> 930   VA     76     TRUE
> 930   VA     76     TRUE
> 931   VA     74     FALSE
> 932   VA     75     FALSE

Hi.

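Try the following. By default, duplicated() marks only the second and
later occurrences of a value. With fromLast=TRUE it scans from the end
and marks all but the last occurrence, so combining the two with |
marks every row whose value is repeated. (The Site column is left out
of the example data.)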

  dfA <- cbind(State="VA", data.frame(Value=c(73, 73, 76, 76, 74, 75)))
  dfA$dups <- duplicated(dfA$Value) | duplicated(dfA$Value, fromLast=TRUE)
  dfA

    State Value  dups
  1    VA    73  TRUE
  2    VA    73  TRUE
  3    VA    76  TRUE
  4    VA    76  TRUE
  5    VA    74 FALSE
  6    VA    75 FALSE
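
If it helps to see the two passes separately, here is a minimal sketch
on the same Value vector. The helper name all_dups is hypothetical (it
is not a base R function), and the %in% variant is just one alternative
that should give the same result.

```r
# Hypothetical helper (not part of base R): TRUE for every element
# whose value occurs more than once in x.
all_dups <- function(x) {
  duplicated(x) | duplicated(x, fromLast = TRUE)
}

Value <- c(73, 73, 76, 76, 74, 75)

# Forward pass: flags the second and later occurrences of each value.
forward <- duplicated(Value)

# Backward pass: flags every occurrence except the last one.
backward <- duplicated(Value, fromLast = TRUE)

# Combined: flags all occurrences of any repeated value.
result <- all_dups(Value)

# Alternative with the same outcome: membership in the set of
# values that appear more than once.
via_in <- Value %in% Value[duplicated(Value)]
```

duplicated() also has a data frame method, so the same idea can be
applied to whole rows or to a subset of columns, for example
all_dups(dfA[, c("State", "Value")]).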

Hope this helps.

Petr Savicky.



