# [R] grouping

R. Michael Weylandt michael.weylandt at gmail.com
Tue Apr 3 15:32:06 CEST 2012

Use cut2 as I suggested and David demonstrated.
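
With cut2 the number of quantile groups goes in the g argument, so there are no probabilities to type. If Hmisc is not an option, here is a rough base-R sketch of the same idea, built on the findInterval() approach from my earlier message. The names k, probs, grp and grp_means are only for illustration, and this is not meant to reproduce what cut2 does internally:

```r
# Sketch: split y into k quantile groups without typing the probabilities by hand.
y <- c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297)
k <- 3  # number of groups (illustrative); use 10 or 15 on a larger data set

# Interior cut points at 1/k, 2/k, ..., (k-1)/k
probs <- seq_len(k - 1) / k

# findInterval() returns 0..(k-1); add 1 so the labels run 1..k
grp <- findInterval(y, quantile(y, probs)) + 1

split(y, grp)                       # the groups themselves
grp_means <- tapply(y, grp, mean)   # one mean per group
grp_means
```

Use ave(y, grp) instead of tapply() if you want the group mean repeated for each observation. With ties, or when the number of observations does not divide evenly, the groups will not necessarily come out the same size.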

Michael

On Tue, Apr 3, 2012 at 9:31 AM, Val <valkremk at gmail.com> wrote:
> Thank you all (David, Michael, Giovanni)  for your prompt response.
>
> First, there was a typo in the group mean: it should be 89.6, not 87.
>
> For a small data set and a few groups I can use prob=c(0, .333, .66, 1) to
> split into three groups, as in this case. However, if I want to extend the
> number of groups to, say, 10 or 15, do I have to work out the probabilities
> by hand for
>   split(x, cut(x, quantile(x, prob=c(0, .333, .66, 1))))
>
> Is there a shortcut for that?
>
>
> Thanks
>
> On Tue, Apr 3, 2012 at 9:13 AM, R. Michael Weylandt
> <michael.weylandt at gmail.com> wrote:
>>
>> Ignoring the fact that your desired answers are wrong, I'd split the
>> grouping part and the group-means part into three steps:
>>
>> i)   quantile() gives the split points
>> ii)  findInterval() can assign each y to a group
>> iii) then ave() or tapply() will do group-wise means
>>
>> Something like:
>>
>> y <- c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297) # You need a "c" here.
>> ave(y, findInterval(y, quantile(y, c(0.33, 0.66))))
>> tapply(y, findInterval(y, quantile(y, c(0.33, 0.66))), mean)
>>
>> You could also use cut2 from the Hmisc package to combine findInterval
>> and quantile into a single step.
>>
>> Depending on your desired output.
>>
>> Hope that helps,
>> Michael
>>
>> On Tue, Apr 3, 2012 at 8:47 AM, Val <valkremk at gmail.com> wrote:
>> > Hi all,
>> >
>> > Assume that I have the following 10 data points.
>> >  x=c(  46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45)
>> >
>> > Sort x to get the following:
>> >  y= (36 , 45 , 46,  66, 78,  125,193, 209, 242, 297)
>> >
>> > I want to group the sorted data points (y) into groups with an equal
>> > number of observations. In this case there will be three groups. The
>> > first two groups will have three observations each and the third will
>> > have four observations.
>> >
>> > group 1  = 36, 45, 46
>> > group 2  = 66, 78, 125
>> > group 3  = 193, 209, 242,297
>> >
>> > Finally I want to calculate the group mean
>> >
>> > group 1  =  42
>> > group 2  =  87
>> > group 3  =  234
>> >
>> > Can anyone help me out?
>> >
>> > In SAS I used to do it using proc rank.
>> >
>> >
>> > Val
>> >
>> > ______________________________________________
>> > R-help at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help