[R] RE : Create sequence for dataset
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Sun Nov 21 23:57:49 CET 2004
ssim at lic.co.nz writes:
> Dear members,
>
> I want to create a sequence of numbers for the multiple records of
> individual animal in my dataset. The SAS code below will do the trick, but
> I want to learn to do it in R. Can anyone help ?
>
> data ht&ssn;
> set ht&ssn;
> by anml_key;
> if first.anml_key then do;
> seq_ht_rslt=0;
> end;
> seq_ht_rslt+1;
>
> Thanks in advance.
Whoa. Who just said that SAS data step code was clearer than R? Quite
a bit of implicit knowledge in that one.
Here's one way (someone please think up a better name for ave()...):
> x <- numeric(nrow(airquality))
> ave(x, airquality$Month, FUN=function(z)seq(along=z))
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
[19] 19 20 21 22 23 24 25 26 27 28 29 30 31 1 2 3 4 5
[37] 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
[55] 24 25 26 27 28 29 30 1 2 3 4 5 6 7 8 9 10 11
[73] 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
[91] 30 31 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
[109] 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 1 2 3
[127] 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
[145] 22 23 24 25 26 27 28 29 30
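Applied to the original question, the same call might look like the sketch below. The data frame `ht` and its values are made up for illustration; only the names `anml_key` and `seq_ht_rslt` come from the SAS code. Unlike a SAS `by` statement, ave() does not require the records to be sorted by key.

```r
# Hypothetical data: three animals, records deliberately interleaved
ht <- data.frame(anml_key = c(101, 101, 102, 101, 103, 102),
                 ht_rslt  = c(1.2, 1.3, 0.9, 1.4, 1.1, 1.0))

# Number the records within each animal, in order of appearance
ht$seq_ht_rslt <- ave(numeric(nrow(ht)), ht$anml_key,
                      FUN = function(z) seq(along = z))
ht$seq_ht_rslt
# [1] 1 2 1 3 1 2
```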
or, same basic idea but a little less cryptic:
> tb <- table(airquality$Month)
> l <- lapply(tb, function(x)seq(length=x))
> unsplit(l, airquality$Month)
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
[19] 19 20 21 22 23 24 25 26 27 28 29 30 31 1 2 3 4 5
(etc.)
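To see why this variant lines up correctly, it may help to look at the intermediate objects on a tiny example. The vector `g` below is invented for the purpose. Both table() and unsplit() order the groups by factor level, so the k-th sequence in `l` is handed to the k-th group.

```r
g <- c("b", "a", "b", "b", "a")           # hypothetical grouping variable

tb <- table(g)                             # counts per group: a = 2, b = 3
l  <- lapply(tb, function(x) seq(length = x))  # one sequence 1..n per group
res <- unsplit(l, g)                       # put each sequence back in place
res
# [1] 1 1 2 3 2
```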
or, brute force and ignorance:
> x <- numeric(nrow(airquality))
> for (i in unique(airquality$Month)) {
+ ix <- airquality$Month == i
+ x[ix] <- seq(along=x[ix])
+ }
> x
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
[19] 19 20 21 22 23 24 25 26 27 28 29 30 31 1 2 3 4 5
....
or, going to the opposite extreme (Gabor et al. are going to try and
beat me on this...):
> # seq() method for factors: cumsum of 1s within each level gives a running count
> seq.factor <- function(f) ave(rep(1,length(f)),f,FUN=cumsum)
> seq(as.factor(airquality$Month))
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
[19] 19 20 21 22 23 24 25 26 27 28 29 30 31 1 2 3 4 5
....
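For readers on a current R version, a compact variant of the ave() idea is sketched below. It relies on seq_along(), which was added to R after this post was written, and it does not define a seq() method for factors. The name `month_seq` is just an illustrative choice.

```r
# Running count within each month, without a helper vector of zeros or ones
month_seq <- ave(seq_along(airquality$Month), airquality$Month,
                 FUN = seq_along)

month_seq[29:34]
# [1] 29 30 31  1  2  3
```

Passing seq_along(...) as the first argument only supplies a vector of the right length and integer type; the values are replaced group by group.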
--
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907