[R] Indexing Grouped Data

Wed Jun 13 17:54:38 CEST 2012

df1$ind <- ave(integer(nrow(df1)), df1$id, FUN=seq_along)

There are faster ways to do this if you know that id is sorted.
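
As a minimal sketch of one such sorted-only approach (an assumption on my part about what a faster way could look like, not necessarily the method meant above): when equal ids are adjacent, rle() returns the length of each run and sequence() expands those lengths into within-group counters. The names ind_ave and ind_rle are only for illustration; df1 is the data from the question quoted below.

```r
# Example data from the original question; rows are already sorted by id.
df1 <- data.frame(id   = c(1, 2, 2, 2, 3, 3, 4),
                  age  = c(4, 5, 10, 12, 4, 6, 2),
                  dose = c(1.8, 1.8, 1.6, 1.2, 1.8, 1.6, 1.8))

# General approach: works whether or not id is sorted.
ind_ave <- ave(integer(nrow(df1)), df1$id, FUN = seq_along)

# Sorted-only approach: rle() gives the length of each run of equal ids,
# and sequence() turns each length n into the counter 1, 2, ..., n.
ind_rle <- sequence(rle(df1$id)$lengths)

# Both approaches agree on this data.
stopifnot(all(ind_ave == ind_rle))

df1$ind <- ind_rle
```

Note that the rle() version counts runs of adjacent equal values, so it gives the per-id index only if all rows for a given id are contiguous; on unsorted data, ave() is the safer choice.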

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Peter Maclean
> Sent: Wednesday, June 13, 2012 7:15 AM
> To: R mailing list
> Subject: Re: [R] Indexing Grouped Data
> 
> I need help indexing grouped data. In this example (the df1 data), the first child had a
> first immunization at age 4. The second child had the first, second, and third immunizations
> at ages 5, 10, and 12; the third child had the first and second immunizations at ages 4 and 6;
> and the fourth child had the first immunization at age 2. I have df1 and need to create df2
> with an "ind" variable that indicates whether the immunization is the first, second, or third.
> Note that the data are not balanced but are sorted so that the first observation (of an
> individual) is the first immunization.
> 
> 
> > df1 <- data.frame(id = c(1,2,2,2,3,3,4), age = c(4,5,10, 12, 4,6, 2), dose =
> c(1.8,1.8,1.6,1.2,1.8,1.6,1.8))
> >
> > df2 <- data.frame(id = c(1,2,2,2,3,3,4), age = c(4,5,10, 12, 4,6, 2), ind=c(1,1,2,3,1,2,1),
> dose = c(1.8,1.8,1.6,1.2,1.8,1.6,1.8))
> >
> > df1
>   id age dose
> 1  1   4  1.8
> 2  2   5  1.8
> 3  2  10  1.6
> 4  2  12  1.2
> 5  3   4  1.8
> 6  3   6  1.6
> 7  4   2  1.8
> > df2
>   id age ind dose
> 1  1   4   1  1.8
> 2  2   5   1  1.8
> 3  2  10   2  1.6
> 4  2  12   3  1.2
> 5  3   4   1  1.8
> 6  3   6   2  1.6
> 7  4   2   1  1.8
> >
> 
> 
> Peter Maclean
> Department of Economics
> UDSM