[R] grouping
David Winsemius
dwinsemius at comcast.net
Tue Apr 3 15:10:53 CEST 2012
On Apr 3, 2012, at 8:47 AM, Val wrote:
> Hi all,
>
> Assume that I have the following 10 data points.
> x=c( 46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45)
>
> sort x and get the following
> y= (36 , 45 , 46, 66, 78, 125,193, 209, 242, 297)
The methods below do not require a sorting step.
>
> I want to group the sorted data points (y) into an equal number of
> observations per group. In this case there will be three groups. The first
> two groups will have three observations and the third will have four
> observations.
>
> group 1 = 34, 45, 46
> group 2 = 66, 78, 125
> group 3 = 193, 209, 242,297
>
> Finally I want to calculate the group mean
>
> group 1 = 42
> group 2 = 87
> group 3 = 234
I hope those weren't answers from SAS.
>
> Can anyone help me out?
>
I usually do this with Hmisc::cut2, since its `g = <n>` argument
automatically splits at quantiles, but the following uses only
base R.
> split(x, cut(x, quantile(x, probs = c(0, .333, .66, 1)),
               include.lowest = TRUE))
$`[36,65.9]`
[1] 36 45 46
$`(65.9,189]`
[1] 66 78 125
$`(189,297]`
[1] 193 209 242 297
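The interval labels come from interpolated quantiles, so it may help to look at the breaks directly. The following is only a sketch (the names `brk`, `brk3`, `sizes` and `sizes3` are illustrative, not from the code above). It also shows why slightly inexact probabilities are convenient here: with exact thirds the inner breaks for these data should land on the observed values 66 and 193, and because cut() builds right-closed intervals, those points move into the lower group.

```r
x <- c(46, 125, 36, 193, 209, 78, 66, 242, 297, 45)

# Breaks used above: interpolated values that fall between observations
brk <- quantile(x, probs = c(0, .333, .66, 1))
sizes <- table(cut(x, brk, include.lowest = TRUE))

# Exact thirds: for these data the inner breaks coincide with 66 and 193,
# and right-closed intervals pull those two points into the lower group
brk3 <- quantile(x, probs = c(0, 1/3, 2/3, 1))
sizes3 <- table(cut(x, brk3, include.lowest = TRUE))

print(brk)
print(sizes)   # group sizes 3, 3, 4
print(sizes3)  # group sizes 4, 3, 3
```

So the choice of probabilities, not just the number of groups, decides which group a boundary value falls into.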
> lapply(split(x, cut(x, quantile(x, probs = c(0, .333, .66, 1)),
                      include.lowest = TRUE)), mean)
$`[36,65.9]`
[1] 42.33333
$`(65.9,189]`
[1] 89.66667
$`(189,297]`
[1] 235.25
Or to get a table instead of a list:
> tapply(x, cut(x, quantile(x, probs = c(0, .333, .66, 1)),
                include.lowest = TRUE), mean)
 [36,65.9] (65.9,189]  (189,297]
  42.33333   89.66667  235.25000
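If the group sizes matter more than the cut points, ranks can be used in place of quantile breaks. This is a minimal sketch under stated assumptions: three groups, ties broken by order of appearance, and `k` and `grp` as illustrative names that do not appear in the code above.

```r
x <- c(46, 125, 36, 193, 209, 78, 66, 242, 297, 45)
k <- 3  # number of groups (assumed)

# Rank each value, then scale the ranks onto 1..k.
# For these 10 values the extra observation lands in the last group.
grp <- ceiling(rank(x, ties.method = "first") * k / length(x))

sizes <- table(grp)            # group sizes 3, 3, 4
means <- tapply(x, grp, mean)  # same means as the quantile-based split
print(means)
```

Because the grouping depends only on ranks, no value can sit exactly on a break, which is the case that trips up cut() when a quantile coincides with an observation.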
> In SAS I used to do it using proc rank.
quantile() (see ?quantile) isn't equivalent to PROC RANK, but it provides a
useful basis for the splitting or tabulating functions.
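For comparison, the SAS documentation (as far as it can be recalled here, so treat this as an assumption rather than a checked fact) describes PROC RANK with GROUPS=k as assigning floor(rank * k / (n + 1)), with groups numbered from 0. A base R sketch of that assumed formula, with `k`, `n` and `grp` as illustrative names:

```r
x <- c(46, 125, 36, 193, 209, 78, 66, 242, 297, 45)
k <- 3
n <- length(x)

# Assumed PROC RANK GROUPS= formula: floor(rank * k / (n + 1)), groups 0..k-1
grp <- floor(rank(x) * k / (n + 1))

sizes <- table(grp)
print(sizes)
print(tapply(x, grp, mean))
```

With these data the formula yields group sizes 3, 4, 3 rather than 3, 3, 4, so it need not agree with either the quantile-based split or the grouping described in the question.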
>
> thanks in advance
>
> Val
>
David Winsemius, MD
West Hartford, CT