[R] Keep ALL duplicate records

Pete Brecknock Peter.Brecknock at bp.com
Sun Oct 2 20:18:43 CEST 2011


Erik Svensson wrote:
> 
> Hello,
> In a data frame I want to identify ALL duplicate IDs in the example to be
> able to examine "OS" and "time".
> 
> (df<-data.frame(ID=c("userA", "userB", "userA", "userC"),
>   OS=c("Win","OSX","Win", "Win64"),
>   time=c("12:22","23:22","04:44","12:28")))
> 
>      ID    OS  time
> 1 userA   Win 12:22
> 2 userB   OSX 23:22
> 3 userA   Win 04:44
> 4 userC Win64 12:28
> 
> My desired output is that ALL records with the same IDs are found:
> 
> userA   Win 12:22
> userA   Win 04:44
> 
> preferably by returning logical values (TRUE FALSE TRUE FALSE)
> 
> Is there a simple way to do that?
> 
> [-- With duplicated(df$ID) the output will be
> [1] FALSE FALSE  TRUE FALSE 
> i.e. not all user A records are found
> 
> With unique(df$ID)
> [1] userA userB userC
> Levels: userA userB userC 
> i.e. one of each ID is found --]
> 
> Erik Svensson
> 


How about ...

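# All records whose ID occurs more than once
# (%in% rather than ==, so this still works when several IDs are duplicated)
ALL_RECORDS <- df[df$ID %in% df$ID[duplicated(df$ID)], ]
print(ALL_RECORDS)

# Logical vector marking those records
TRUE_FALSE <- df$ID %in% df$ID[duplicated(df$ID)]
print(TRUE_FALSE)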
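
Another option, as a minimal sketch: run duplicated() from both ends of the vector and OR the results. duplicated(x) flags every occurrence after the first, and duplicated(x, fromLast = TRUE) flags every occurrence before the last, so together they flag every record whose ID appears more than once. The fifth row below is a hypothetical addition to the example data, included only so that two different IDs are duplicated; the name is_dup is also just illustrative.

```r
# Example data from the question, plus one hypothetical extra row (userB)
# so that more than one ID is duplicated.
df <- data.frame(ID   = c("userA", "userB", "userA", "userC", "userB"),
                 OS   = c("Win", "OSX", "Win", "Win64", "OSX"),
                 time = c("12:22", "23:22", "04:44", "12:28", "09:15"))

# Flag second-and-later occurrences, then first-and-earlier occurrences,
# and combine them so every occurrence of a repeated ID is TRUE.
is_dup <- duplicated(df$ID) | duplicated(df$ID, fromLast = TRUE)
print(is_dup)

# Use the logical vector to pull out all records with a repeated ID.
print(df[is_dup, ])
```

With this data only the userC row is left out, since userC is the only ID that occurs once.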

HTH

Pete




