[R] sequence number for 'long format'

William Dunlap wdunlap at tibco.com
Fri May 1 21:04:54 CEST 2009


> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of David Freedman
> Sent: Friday, May 01, 2009 11:52 AM
> To: r-help at r-project.org
> Subject: [R] sequence number for 'long format'
> 
> 
> Dear R-help list,
> 
> I've got a data set in long format - each subject can have several
> (varying in number) measurements, with each record representing one
> measurement.  I want to assign a sequence number to each measurement,
> starting at 1 for a person's first measurement.  I can do this with the
> by function, but there must be an easier way.
>
> Here's my code - id is id number, age is the age of the person, and seq
> is the sequence variable that I've created.  Thanks very much for the
> help.
> 
> david freedman, atlanta
> 
> ds=data.frame(list(id = c(1L, 1L, 1L, 1L, 8L, 8L, 16L, 16L, 16L,
> 16L, 16L, 19L, 32L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L, 64L,
> 79L, 79L, 80L, 80L, 80L, 80L, 85L, 86L, 96L, 96L, 96L, 103L,
> 103L, 106L, 106L, 106L, 106L, 106L, 106L, 106L, 140L, 140L, 144L,
> 144L, 144L, 144L, 144L, 144L, 144L, 146L, 146L, 146L, 146L, 160L,
> 160L, 160L, 160L, 160L, 160L, 164L, 164L, 176L, 176L, 176L, 176L,
> 176L, 176L, 176L, 176L, 181L, 190L, 192L, 192L, 192L, 192L, 192L,
> 192L, 197L, 197L, 197L, 224L, 224L, 224L, 229L, 232L, 232L, 232L,
> 232L, 232L, 232L, 232L, 249L, 249L), age = c(6.6054794521, 
> 9.301369863,
> 22.638356164, 31.961670089, 17.15890411, 25.106091718, 8.197260274,
> 11.295890411, 14.191780822, 22.43394935, 28.6, 6.6794520548,
> 10.824657534, 10.479452055, 13.432876712, 15.408219178, 17.643835616,
> 19.268493151, 22.624657534, 26.139726027, 35.493497604, 37.6,
> 15.895890411, 23.351129363, 13.810958904, 16.783561644, 17.95890411,
> 22.430136986, 12.021902806, 14.904859685, 7.4219178082, 10.060273973,
> 15.802739726, 17.328767123, 31.028062971, 8.3945205479, 10.350684932,
> 13.783561644, 17.843835616, 21.816438356, 27.901437372, 34.3,
> 10.517808219, 18.18630137, 11.378082192, 14.794520548, 16.77260274,
> 23.101369863, 27.912328767, 34.316221766, 40.2, 8.6054794521,
> 11.561643836, 14.863013699, 17.835616438, 8.0219178082, 9, 9.9726027397,
> 10.690410959, 13.032876712, 30.138261465, 7.0602739726, 10.438356164,
> 8.9232876712, 9.9589041096, 10.915068493, 12.263013699, 14.257534247,
> 17.326027397, 18.454794521, 21.334246575, 45.190965092, 8.5643835616,
> 12.197260274, 15.405479452, 17.106849315, 27.843835616, 34.417522245,
> 39.9, 6.7890410959, 10.21369863, 15.857534247, 10.147945205,
> 13.473972603, 36.06844627, 17.331506849, 14.980821918, 15.939726027,
> 16.939726027, 17.619178082, 18.698630137, 37.084188912, 43.3,
> 7.7068493151, 10.726027397)))
> 
> head(ds,10)
> x=with(ds,by(ds,list(id),FUN=function(dc)1:length(dc$age))); x[1:20];
> ds$seq=unlist(x); head(ds,20)


If your data is sorted so that identical id values are always contiguous,
you can replace the by() and unlist() steps with
     ds$seq <- sequence(rle(ds$id)$lengths)
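
As a minimal sketch of why this works, the example below uses a small
made-up data frame (not the poster's data; the names toy, mixed, seq_rle
and seq_ave are hypothetical). rle() returns the lengths of the runs of
identical ids, and sequence() expands each length n into 1:n. The second
part shows an alternative based on ave(), which is not part of the
suggestion above, for the case where rows with the same id are not
contiguous.

```r
# Toy data: rows for each id are contiguous (illustrative values only).
toy <- data.frame(id  = c(1L, 1L, 1L, 8L, 8L, 16L),
                  age = c(6.6, 9.3, 22.6, 17.2, 25.1, 8.2))

# Run lengths of identical ids are 3, 2, 1; sequence() turns them
# into 1:3, 1:2, 1:1 concatenated.
toy$seq <- sequence(rle(toy$id)$lengths)
print(toy$seq)   # 1 2 3 1 2 1

# When ids are interleaved, every run has length 1, so the rle()
# approach numbers every row as 1.
mixed <- data.frame(id = c(1L, 8L, 1L, 8L, 1L))
mixed$seq_rle <- sequence(rle(mixed$id)$lengths)

# ave() applies seq_along() within each id group and puts the
# results back in the original row order.
mixed$seq_ave <- ave(seq_along(mixed$id), mixed$id, FUN = seq_along)
print(mixed)
```

For unsorted data, sorting by id first (for example with order()) and
then using the rle() form should give the same numbering as ave().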

