[R] Binning or grouping data

Allan Engelhardt allane at cybaea.com
Thu Jun 4 10:52:10 CEST 2009


You want cut() to bin the ages and tapply() to summarise the charges per group, perhaps along these lines:

## Your data frame:
a <- data.frame(patient=1:6, charges=c(100,500,200,90,400,500),
                age=c(0,3,5,7,10,16),
                race=c("black","white","hispanic","asian","hispanic","black"))

## Add an age category:
a <- cbind(a, age_category=cut(a$age, breaks=c(-Inf,4,11,17)))

## Calculate average charges per age category and race
with(a, tapply(charges, list(age_category, race), mean))

#           asian black hispanic white
# (-Inf,4]     NA   100       NA   500
# (4,11]       90    NA      300    NA
# (11,17]      NA   500       NA    NA
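
If you would rather see the category names from your question than the interval notation, cut() also takes a labels argument. The following is only a sketch under the same assumptions as above: the label strings are my own choice, and it relies on cut() using right-closed intervals by default, so an age of exactly 4 lands in the first bin and 11 in the second.

```r
## Same example data as above
a <- data.frame(patient = 1:6,
                charges = c(100, 500, 200, 90, 400, 500),
                age     = c(0, 3, 5, 7, 10, 16),
                race    = c("black", "white", "hispanic", "asian", "hispanic", "black"))

## Bin ages into (-Inf,4], (4,11], (11,17] and give each bin a readable label.
## The label strings are illustrative; there must be one label per interval.
a$age_category <- cut(a$age,
                      breaks = c(-Inf, 4, 11, 17),
                      labels = c("0 to 4", "5 to 11", "12 to 17"))

## Mean charges per age category (rows) and race (columns);
## combinations with no patients come out as NA.
res <- with(a, tapply(charges, list(age_category, race), mean))
print(res)
```

Note that with an upper break of 17, any age above 17 would become NA in age_category, so check the breaks against the range of your real data.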


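The tapply() result is a matrix with NA for every empty combination. If a long table with one row per observed combination is easier to work with, aggregate() with a formula is one alternative. This is a sketch of a different technique from the one above, not a replacement for it; the variable name long is just illustrative.

```r
## Same example data and age categories as above
a <- data.frame(patient = 1:6,
                charges = c(100, 500, 200, 90, 400, 500),
                age     = c(0, 3, 5, 7, 10, 16),
                race    = c("black", "white", "hispanic", "asian", "hispanic", "black"))
a$age_category <- cut(a$age, breaks = c(-Inf, 4, 11, 17))

## One row per (age_category, race) combination that actually occurs,
## with the mean of charges in the last column.
long <- aggregate(charges ~ age_category + race, data = a, FUN = mean)
print(long)
```

Swapping FUN = mean for sum, median, or length gives other per-group summaries in the same layout.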

Hope this helps.

Allan.

alamoboy wrote:
> Newbie here.  Many apologies in advance for using the incorrect lingo.  I'm
> new to statistics and VERY new to R.
>
> I'm attempting to "group" or "bin" data together in order to analyze them as
> a combined group rather than as a discrete set.  I'll provide a simple example
> of the data for illustrative purposes.
>
> Patient ID | Charges | Age | Race
> 1          | 100     | 0   | Black
> 2          | 500     | 3   | White
> 3          | 200     | 5   | Hispanic
> 4          | 90      | 7   | Asian
> 5          | 400     | 10  | Hispanic
> 6          | 500     | 16  | Black
>
> I'm trying to create three age categories--"0 to 4", "5 to 11" and "12 to
> 17"--and analyze their "Charges" by their "Race."  How do I go about
> doing this?
>
> Thanks for any assistance!
>
>
> Sam
>
>



