[R] Two basic data manipulation questions (counting and aggregating)

Greg Snow Greg.Snow at intermountainmail.org
Fri Apr 13 16:38:57 CEST 2007


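For the first question, if every record with the same id is together in a
block (and the blocks are in sorted id order, since tapply returns its
groups in that order), then

> mydf$counter <- unlist( tapply( mydf$type, mydf$id, rank, ties.method='first' ) )

should give you your counter column, provided the rows within each id are
already ordered by type (rank numbers the rows by their type value).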

If the ids are not in blocks, then try:
> mydf$counter <- ave( mydf$type, factor(mydf$id), FUN=function(x) rank(x, ties.method='first') )
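
If what you want is a plain running count of the rows within each id,
whatever their type, a variant that does not rank on type may be closer to
the example output in the question. With rank, the 10136 block (types 7, 2,
2) would be numbered 3, 1, 2 rather than 1, 2, 3. The following is only a
sketch: it rebuilds the sample data from the question and swaps rank for
seq_along, which is my substitution and not something the question asked
for by name.

```r
# Sample data copied from the question.
mydf <- data.frame(
  id   = c(10002, 10061, 10061, 10061, 10065, 10114,
           10114, 10114, 10136, 10136, 10136),
  type = c("7", "1", "1", "4", "7", "1", "1", "4", "7", "2", "2"),
  stringsAsFactors = FALSE
)

# Number the rows 1, 2, 3, ... within each id, in their existing order.
# ave() puts each group's result back in the original row positions,
# so the ids do not need to be sorted or contiguous.
mydf$counter <- ave(seq_along(mydf$id), mydf$id, FUN = seq_along)

print(mydf)
# counter: 1 1 2 3 1 1 2 3 1 2 3
```

Passing an integer vector as the first argument to ave also keeps the
counter numeric; passing the character column type would return the
counter as character strings.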

For the second question, look at the collapse argument to paste.  Using
paste with collapse='' should give you the scalar that you can use with
aggregate.
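
As a rough sketch of that idea on the sample data from the question (the
column name "value" is taken from the desired output there; the object name
agg is just a placeholder):

```r
# Sample data copied from the question.
mydf <- data.frame(
  id   = c(10002, 10061, 10061, 10061, 10065, 10114,
           10114, 10114, 10136, 10136, 10136),
  type = c("7", "1", "1", "4", "7", "1", "1", "4", "7", "2", "2"),
  stringsAsFactors = FALSE
)

# collapse = "" is passed through aggregate() to paste(), so each group
# is reduced to a single string instead of a vector of strings.
agg <- aggregate(list(value = mydf$type),
                 by = list(id = mydf$id),
                 FUN = paste, collapse = "")

print(agg)
# value: "7" "114" "7" "114" "722", one row per id
```

The result is a character column; wrap it in as.numeric() if you need
numbers.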

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at intermountainmail.org
(801) 408-8111
 
 

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch 
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Julien Barnier
> Sent: Friday, April 13, 2007 5:08 AM
> To: r-help at stat.math.ethz.ch
> Subject: [R] Two basic data manipulation questions (counting 
> and aggregating)
> 
> Dear R users,
> 
> I have two basic data manipulation questions that I can't resolve.
> 
> My data is a data frame which looks like the following:
> 
> id     type
> 10002  "7"
> 10061  "1"
> 10061  "1"
> 10061  "4"
> 10065  "7"
> 10114  "1"
> 10114  "1"
> 10114  "4"
> 10136  "7"
> 10136  "2"
> 10136  "2"
> 
> 
> First, I would like to create a "counter" variable which will 
> count the rank of each row inside each "id" level, i.e. something like:
> 
> id     type   counter
> 10002  "7"      1 
> 10061  "1"      1
> 10061  "1"      2
> 10061  "4"      3
> 10065  "7"      1
> 10114  "1"      1 
> 10114  "1"      2
> 10114  "4"      3
> 10136  "7"      1
> 10136  "2"      2
> 10136  "2"      3
> 
> Is there a straightforward way to do that, without using 
> several "for" loops?
> 
> The second thing I would like to do is to aggregate the first 
> data.frame by concatenating the 'type' values for each 'id', 
> i.e. I'd like to obtain something like:
> 
> id     value
> 10002  7
> 10061  114
> 10065  7
> 10114  114
> 10136  722
> 
> I have tried the "aggregate" function, but it doesn't work 
> because the "paste" function doesn't return a scalar value. 
> Using tapply seems to work, but is not straightforward, and I 
> wanted to know if there is a simple way to do this.
> 
> Thanks in advance for any help.
> 
> --
> Julien
> 