[R] Duplicates and duplicated

Linlin Yan yanlinlin82 at gmail.com
Thu May 14 12:44:27 CEST 2009


The %in% operator is very handy! The expression can be made even simpler, like this:
x %in% x[duplicated(x)]
 [1] FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
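
For comparison, here is a minimal sketch of another way to get the same answer, using the fromLast argument of duplicated(). The helper name all_duplicated is made up for illustration and is not from this thread. It scans once from the front and once from the back, so every member of a duplicated group gets flagged.

```r
# Hypothetical helper (the name is an assumption, not part of base R).
all_duplicated <- function(x) {
  # duplicated(x) flags the later copies; fromLast = TRUE flags the earlier ones.
  duplicated(x) | duplicated(x, fromLast = TRUE)
}

x <- c(1, 2, 3, 4, 4, 5, 6, 7, 8, 9)
all_duplicated(x)

# Should agree with the %in% one-liner on this example.
identical(all_duplicated(x), x %in% x[duplicated(x)])
```

Either form avoids an explicit loop. The fromLast version may also be applied to the rows of a data frame, where %in% would not do the same job.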

On Thu, May 14, 2009 at 4:43 PM, Andrej Blejec <Andrej.Blejec at nib.si> wrote:
> Try this
>
> x%in%x[which(y)]
>
> From your example
>
>> x=c(1,2,3,4,4,5,6,7,8,9)
>> y=duplicated(x)
>> rbind(x,y)
>   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> x    1    2    3    4    4    5    6    7    8     9
> y    0    0    0    0    1    0    0    0    0     0
>> which(y)
> [1] 5
>> x[which(y)]
> [1] 4
>> x%in%x[which(y)]
>  [1] FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
>
> Andrej
>
> --
> Andrej Blejec
> National Institute of Biology
> Vecna pot 111 POB 141
> SI-1000 Ljubljana
> SLOVENIA
> e-mail: andrej.blejec at nib.si
> URL: http://ablejec.nib.si
> tel: + 386 (0)59 232 789
> fax: + 386 1 241 29 80
> --------------------------
> Organizer of
> Applied Statistics 2009 conference
> http://conferences.nib.si/AS2009
>
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org
>> [mailto:r-help-bounces at r-project.org] On Behalf Of christiaan pauw
>> Sent: Thursday, May 14, 2009 8:17 AM
>> To: r-help at r-project.org
>> Subject: [R] Duplicates and duplicated
>>
>> Hi everybody.
>> I want to identify not only the duplicate number but also the original
>> number that has been duplicated.
>> Example:
>> x=c(1,2,3,4,4,5,6,7,8,9)
>> y=duplicated(x)
>> rbind(x,y)
>>
>> gives:
>>   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>> x    1    2    3    4    4    5    6    7    8     9
>> y    0    0    0    0    1    0    0    0    0     0
>>
>> i.e. the second 4 [,5] is a duplicate.
>>
>> What I want is for both the first and the second 4, i.e. [,4] and [,5], to be TRUE:
>>
>>   [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>> x    1    2    3    4    4    5    6    7    8     9
>> y    0    0    0    1    1    0    0    0    0     0
>>
>> I assume it can be done by sorting the vector and then checking whether
>> the next or the previous entry matches, using identical(). I am just
>> unsure how to write such a loop, the logic of which (I think) is as
>> follows:
>>
>> sort x
>> for every value of x, check if the next value is identical and return
>> TRUE (or 1) if it is and FALSE (or 0) if it is not
>> AND
>> check if the previous value is identical and return TRUE (or 1) if it
>> is and FALSE (or 0) if it is not
>>
>> Am I thinking correctly, and can someone help me write such a function?
>>
>> regards
>> Christiaan
>>
>>       [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
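
The sort-and-compare logic described in the original question can also be written without an explicit loop. The following is only a rough sketch of that idea: the function name dup_by_sorting is invented for illustration, and it assumes x is an atomic vector with no NA values (comparisons with == would propagate NA otherwise).

```r
# Hypothetical helper sketching the "sort, then compare neighbours" idea.
# Assumes x contains no NA values.
dup_by_sorting <- function(x) {
  n <- length(x)
  if (n < 2) return(logical(n))        # nothing can be duplicated
  ord <- order(x)                      # permutation that sorts x
  s <- x[ord]
  same_as_next <- s[-n] == s[-1]       # each sorted value vs. its successor
  # A value is flagged if it equals its next OR its previous neighbour.
  flagged <- c(same_as_next, FALSE) | c(FALSE, same_as_next)
  out <- logical(n)
  out[ord] <- flagged                  # map flags back to original positions
  out
}

x <- c(1, 2, 3, 4, 4, 5, 6, 7, 8, 9)
dup_by_sorting(x)
```

On the example vector this flags positions 4 and 5, as the %in% solutions above do. In practice the one-liners are shorter, but this version follows the steps laid out in the question.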
