[R] grouping

R. Michael Weylandt michael.weylandt at gmail.com
Tue Apr 3 15:13:56 CEST 2012


Ignoring the fact that your desired answers are wrong (the group means
you list don't match the data), I'd break the problem into three steps:

i) quantile() can give you the split points,
ii) findInterval() can assign each y to a group,
iii) ave() or tapply() can then compute the group-wise means.

Something like:

y <- c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297) # You need a "c" here.
ave(y, findInterval(y, quantile(y, c(0.33, 0.66))))
tapply(y, findInterval(y, quantile(y, c(0.33, 0.66))), mean)
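To see what each of the three steps contributes, the same calculation can be unrolled with the intermediate results kept. This is only a sketch; the names breaks, grp, sizes, means and per_obs are made up here for illustration:

```r
y <- c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297)

# i) split points at the 33rd and 66th percentiles of y
breaks <- quantile(y, c(0.33, 0.66))

# ii) group index for each y: 0 below the first break,
#     1 between the two breaks, 2 at or above the second break
grp <- findInterval(y, breaks)

# iii) group sizes, one mean per group, and the group mean
#      repeated for every observation
sizes   <- table(grp)
means   <- tapply(y, grp, mean)
per_obs <- ave(y, grp)

print(sizes)
print(means)
print(per_obs)
```

For these ten values the groups come out with 3, 3 and 4 observations, but that is a property of this data and these cut points, so it is worth checking table(grp) on other inputs.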

You could also use cut2 from the Hmisc package to combine findInterval
and quantile into a single step.
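A minimal sketch of the cut2 route, assuming Hmisc is installed and asking for three quantile groups with g = 3. cut2 picks its own boundaries, so the group sizes may not match the findInterval() version and should be checked:

```r
y <- c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297)

# Assumes the Hmisc package is installed; do nothing if it is not.
if (requireNamespace("Hmisc", quietly = TRUE)) {
  grp <- Hmisc::cut2(y, g = 3)   # factor with one level per quantile group
  print(table(grp))              # check how many observations fell in each group
  print(tapply(y, grp, mean))    # one mean per group
}
```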

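Which of these to use depends on your desired output: ave() returns a
vector as long as y with each value replaced by its group mean, while
tapply() returns one mean per group.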

Hope that helps,
Michael

On Tue, Apr 3, 2012 at 8:47 AM, Val <valkremk at gmail.com> wrote:
> Hi all,
>
> Assume that I have the following 10 data points.
>  x=c(  46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45)
>
> sort x  and get the following
>  y= (36 , 45 , 46,  66, 78,  125,193, 209, 242, 297)
>
> I want to  group the sorted  data point (y)  into  equal number of
> observation per group. In this case there will be three groups.  The first
> two groups  will have three observation  and the third will have four
> observations
>
> group 1  = 34, 45, 46
> group 2  = 66, 78, 125
> group 3  = 193, 209, 242,297
>
> Finally I want to calculate the group mean
>
> group 1  =  42
> group 2  =  87
> group 3  =  234
>
> Can anyone help me out?
>
> In SAS I used to do it using proc rank.
>
> thanks in advance
>
> Val