[R] Add sequence numbers to lines with the same ID: How can this be accomplished?
Rolf Turner
r.turner at auckland.ac.nz
Sun Oct 25 05:05:47 CET 2015
On 25/10/15 12:33, Bert Gunter wrote:
> Rolf's solution works for the situation where all duplicated values
> are contiguous, which may be what you need. However, I wondered how it
> could be done if this were not the case. Below is an answer. It is not
> as efficient or elegant as Rolf's solution for the contiguous case, I
> think; maybe someone will come up with something better. But I think
> it works. Here's an example with code:
>
>> w <- c(1:5,3,1,2,7,8,5,5,5,2,3)
>> w
> [1] 1 2 3 4 5 3 1 2 7 8 5 5 5 2 3
>> d <- 0+duplicated(w)
>> for(x in unique(w)){
> + i <- w==x
> + d[i]<-1+ cumsum(d[i])
> +
> + }
>> d
> [1] 1 1 1 1 1 2 2 2 1 1 2 3 4 3 3
>
> As always, corrections and/or improvements welcome.
How about:
o <- order(w)
d <- unlist(lapply(rle(w[o])$lengths,seq_len))[order(o)]
Works for the given example. :-)
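
A step-by-step sketch of why the one-liner above works, using Bert's
example vector w. The intermediate name s and the ave() cross-check at
the end are additions for illustration only, not part of the thread; the
ave() call is just one other base R way to get the same numbering.

```r
w <- c(1:5, 3, 1, 2, 7, 8, 5, 5, 5, 2, 3)

# Step 1: the permutation that sorts w. Ties keep their original
# relative order, so repeats of a value stay in order of appearance.
o <- order(w)

# Step 2: in the sorted vector, equal values are contiguous, so rle()
# gives the size of each group and seq_len() numbers each group 1..n.
s <- unlist(lapply(rle(w[o])$lengths, seq_len))

# Step 3: order(o) is the inverse permutation of o, so indexing by it
# puts the sequence numbers back in the original row order.
d <- s[order(o)]
print(d)

# Cross-check (assumption: an alternative not proposed in the thread).
# ave() applies seq_along within each group of identical values.
d2 <- ave(seq_along(w), w, FUN = seq_along)
print(all(d == d2))
```

For this w, d should agree with the result of Bert's loop, that is
1 1 1 1 1 2 2 2 1 1 2 3 4 3 3.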
cheers,
Rolf
--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
> On Sat, Oct 24, 2015 at 4:02 PM, Rolf Turner <r.turner at auckland.ac.nz> wrote:
>> On 25/10/15 11:28, John Sorkin wrote:
>>>
>>> I have a file that has (1) Line numbers, (2) IDs. A given ID number can
>>> appear in more than one row. For each row with a repeated ID, I want to add
>>> a number that gives the sequence number of the repeated ID number. The R
>>> code below demonstrates what I want to have, without any attempt to produce
>>> the result, as I have no idea how to accomplish my goal.
>>>
>>>
>>> line <- c(1,2,3,4,5,6,7,8,9,10)
>>> ID<- c(1,1,2,3,4,5,6,7,8,8)
>>> cat("Note lines 1 and 2 both contain ID 1; lines 9 and 10 both contain ID 8")
>>> cbind(line,ID)
>>> Seq <- c(1,2,1,1,1,1,1,1,1,2)
>>> cat("Sequence numbers within ID added to the data")
>>> cbind(line,ID,Seq)
>>
>>
>> I *think* that
>>
>> unlist(lapply(rle(ID)$lengths,seq_len))
>>
>> gives what you want. At least it does for the given example.