[R] grouping

David L Carlson dcarlson at tamu.edu
Tue Apr 3 15:54:37 CEST 2012


Or just replace c(0, .333, .667, 1) with 

n <- 10
split(x, cut(x, quantile(x, probs = c(0, 1:(n-1)/n, 1)), include.lowest=TRUE))

where n is the number of groups you want.
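
As a rough sketch of how that plays out on Val's ten values, the lines below build the break points, assign the groups, and take the group means. The names breaks, grp, groups and group_means are only illustrative, and n = 3 is chosen to match the original question. Note that quantile-based breaks do not guarantee exactly equal counts; the group sizes depend on where the breaks fall and on any ties in the data.

```r
# Val's data from earlier in the thread
x <- c(46, 125, 36, 193, 209, 78, 66, 242, 297, 45)

n <- 3  # number of groups wanted (illustrative choice)

# Break points at the 0, 1/n, ..., 1 quantiles;
# (0:n)/n is the same vector as c(0, 1:(n-1)/n, 1)
breaks <- quantile(x, probs = (0:n) / n)

# include.lowest = TRUE keeps min(x) inside the first interval
grp <- cut(x, breaks, include.lowest = TRUE)

groups <- split(x, grp)              # list of values, one element per group
group_means <- tapply(x, grp, mean)  # one mean per group

print(table(grp))    # how many observations landed in each group
print(group_means)
```

Changing n to 10 or 15 needs no other edits, which is the point of computing the probabilities instead of typing them.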

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352



-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of R. Michael Weylandt
Sent: Tuesday, April 03, 2012 8:32 AM
To: Val
Cc: r-help at r-project.org
Subject: Re: [R] grouping

Use cut2 as I suggested and David demonstrated.
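
For reference, a minimal sketch of the cut2 route, assuming the Hmisc package is installed (the code does nothing if it is not). The g argument asks cut2 for that many quantile groups; the variable names are only illustrative.

```r
# Val's data from earlier in the thread
x <- c(46, 125, 36, 193, 209, 78, 66, 242, 297, 45)
n <- 3  # number of quantile groups wanted (illustrative choice)

# Assumption: Hmisc is available; otherwise this block is skipped quietly
if (requireNamespace("Hmisc", quietly = TRUE)) {
  grp <- Hmisc::cut2(x, g = n)   # factor assigning each value to a quantile group
  print(table(grp))              # group sizes
  print(tapply(x, grp, mean))    # group-wise means
}
```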

Michael

On Tue, Apr 3, 2012 at 9:31 AM, Val <valkremk at gmail.com> wrote:
> Thank you all (David, Michael, Giovanni) for your prompt responses.
>
> First, there was a typo in the group mean: it was 89.6, not 87.
>
> For a small data set and a few groups I can use prob=c(0, .333, .66, 1)
> to split into three groups, as in this case. However, if I want to extend
> the number of groups to, say, 10 or 15, do I have to work out the
> probabilities by hand for
>   split(x, cut(x, quantile(x, prob=c(0, .333, .66, 1))))
>
> Is there a shortcut for that?
>
>
> Thanks
>
> On Tue, Apr 3, 2012 at 9:13 AM, R. Michael Weylandt
> <michael.weylandt at gmail.com> wrote:
>>
>> Ignoring the fact your desired answers are wrong, I'd split the
>> separating part and the group means parts into three steps:
>>
>> i) quantile() can help you get the split points,
>> ii)  findInterval() can assign each y to a group
>> iii) then ave() or tapply() will do group-wise means
>>
>> Something like:
>>
>> y <- c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297) # You need a "c" here.
>> ave(y, findInterval(y, quantile(y, c(0.33, 0.66))))
>> tapply(y, findInterval(y, quantile(y, c(0.33, 0.66))), mean)
>>
>> You could also use cut2 from the Hmisc package to combine findInterval
>> and quantile into a single step.
>>
>> Depending on your desired output.
>>
>> Hope that helps,
>> Michael
>>
>> On Tue, Apr 3, 2012 at 8:47 AM, Val <valkremk at gmail.com> wrote:
>> > Hi all,
>> >
>> > Assume that I have the following 10 data points.
>> >  x=c(  46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45)
>> >
>> > sort x  and get the following
>> >  y= (36 , 45 , 46,  66, 78,  125,193, 209, 242, 297)
>> >
>> > I want to group the sorted data points (y) into groups with an equal
>> > number of observations per group. In this case there will be three
>> > groups. The first two groups will have three observations each and the
>> > third will have four observations.
>> > group 1  = 36, 45, 46
>> > group 2  = 66, 78, 125
>> > group 3  = 193, 209, 242,297
>> >
>> > Finally I want to calculate the group mean
>> >
>> > group 1  =  42
>> > group 2  =  87
>> > group 3  =  234
>> >
>> > Can anyone help me out?
>> >
>> > In SAS I used to do it using proc rank.
>> >
>> > thanks in advance
>> >
>> > Val
>> >
>> > ______________________________________________
>> > R-help at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>
>



