[R] grouping
R. Michael Weylandt
michael.weylandt at gmail.com
Tue Apr 3 15:32:06 CEST 2012
Use cut2 as I suggested and David demonstrated.
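
For 10 or 15 groups there is no need to type the probabilities by hand:
cut2(x, g = k) from Hmisc takes the number of quantile groups directly.
If you would rather stay in base R, below is a rough sketch of the same
idea, built on the findInterval() suggestion quoted further down. The
names k, probs and grp are only illustrative, and with ties or an
awkward n the group sizes will be approximately, not exactly, equal.

```r
# Sketch: split x into k quantile groups and compute group means (base R only).
x <- c(46, 125, 36, 193, 209, 78, 66, 242, 297, 45)
k <- 3  # number of groups (illustrative; try 10 or 15 on larger data)

# Interior cut points only: 1/k, 2/k, ..., (k-1)/k
probs  <- seq_len(k - 1) / k
breaks <- quantile(x, probs)

# findInterval() labels the groups 0, 1, ..., k - 1
grp <- findInterval(x, breaks)

group_means <- tapply(x, grp, mean)  # one mean per group
members     <- split(x, grp)         # the observations in each group

# With Hmisc installed, cut2(x, g = k) could stand in for the
# quantile() + findInterval() steps above.
```

Use ave(x, grp) instead of tapply() if you want the group mean repeated
for every observation.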
Michael
On Tue, Apr 3, 2012 at 9:31 AM, Val <valkremk at gmail.com> wrote:
> Thank you all (David, Michael, Giovanni) for your prompt response.
>
> First, there was a typo in the group 2 mean: it should be 89.6, not 87.
>
> For a small data set and a few groups I can use prob=c(0, .333, .66, 1) to
> split the data into three groups, as in this case. However, if I want to
> extend the number of groups to, say, 10 or 15, do I have to work out the
> probabilities by hand for
> split(x, cut(x, quantile(x, prob=c(0, .333, .66, 1))))
>
> Is there a short cut for that?
>
>
> Thanks
>
> On Tue, Apr 3, 2012 at 9:13 AM, R. Michael Weylandt
> <michael.weylandt at gmail.com> wrote:
>>
>> Ignoring the fact that your desired answers are wrong, I'd break the
>> grouping and the group means into three steps:
>>
>> i) quantile() can help you get the split points,
>> ii) findInterval() can assign each y to a group
>> iii) then ave() or tapply() will do group-wise means
>>
>> Something like:
>>
>> y <- c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297) # You need a "c" here.
>> ave(y, findInterval(y, quantile(y, c(0.33, 0.66))))
>> tapply(y, findInterval(y, quantile(y, c(0.33, 0.66))), mean)
>>
>> You could also use cut2 from the Hmisc package to combine findInterval
>> and quantile into a single step.
>>
>> Whether you want ave() (one value per observation) or tapply() (one
>> value per group) depends on your desired output.
>>
>> Hope that helps,
>> Michael
>>
>> On Tue, Apr 3, 2012 at 8:47 AM, Val <valkremk at gmail.com> wrote:
>> > Hi all,
>> >
>> > Assume that I have the following 10 data points.
>> > x=c( 46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45)
>> >
>> > sort x and get the following
>> > y= (36 , 45 , 46, 66, 78, 125,193, 209, 242, 297)
>> >
>> > I want to group the sorted data points (y) into groups with an equal
>> > number of observations. In this case there will be three groups. The
>> > first two groups will have three observations each and the third will
>> > have four observations.
>> >
>> > group 1 = 34, 45, 46
>> > group 2 = 66, 78, 125
>> > group 3 = 193, 209, 242,297
>> >
>> > Finally I want to calculate the group mean
>> >
>> > group 1 = 42
>> > group 2 = 87
>> > group 3 = 234
>> >
>> > Can anyone help me out?
>> >
>> > In SAS I used to do it using proc rank.
>> >
>> > thanks in advance
>> >
>> > Val
>> >