[R] Binning or grouping data
Allan Engelhardt
allane at cybaea.com
Thu Jun 4 10:52:10 CEST 2009
You want cut and tapply, perhaps along these lines:
## Your data frame:
a <- data.frame(patient=1:6, charges=c(100,500,200,90,400,500),
                age=c(0,3,5,7,10,16),
                race=c("black","white","hispanic","asian","hispanic","black"))
## Add an age category:
a <- cbind(a, age_category=cut(a$age, breaks=c(-Inf,4,11,17)))
## Calculate average charges per age category and race
with(a, tapply(charges, list(age_category, race), mean))
#          asian black hispanic white
# (-Inf,4]    NA   100       NA   500
# (4,11]      90    NA      300    NA
# (11,17]     NA   500       NA    NA
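
If you would rather see the category names from your question than the default interval labels, cut() accepts a labels argument. The following is only a sketch under that assumption: the names res and long are placeholders of mine, and the aggregate() call is an optional alternative that returns one row per non-empty combination instead of a matrix.

```r
## Same data as above, rebuilt so this sketch runs on its own
a <- data.frame(patient = 1:6,
                charges = c(100, 500, 200, 90, 400, 500),
                age     = c(0, 3, 5, 7, 10, 16),
                race    = c("black", "white", "hispanic", "asian",
                            "hispanic", "black"))

## cut() intervals are (lower, upper] by default, so breaks at 4, 11 and 17
## correspond to ages 0-4, 5-11 and 12-17 for whole-number ages
a$age_category <- cut(a$age, breaks = c(-Inf, 4, 11, 17),
                      labels = c("0 to 4", "5 to 11", "12 to 17"))

## Mean charges per age category and race, as a matrix (NA = no patients)
res <- with(a, tapply(charges, list(age_category, race), mean))

## Long-format alternative: one row per combination that actually occurs
long <- aggregate(charges ~ age_category + race, data = a, FUN = mean)
```

The matrix form is convenient for reading across races, while the long form is usually easier to pass on to plotting or modelling functions.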
Hope this helps.
Allan.
alamoboy wrote:
> Newbie here. Many apologies in advance for using the incorrect lingo. I'm
> new to statistics and VERY new to R.
>
> I'm attempting to "group" or "bin" data together in order to analyze them as
> a combined group rather than as a discrete set. I'll provide a simple example
> of the data for illustrative purposes.
>
> Patient ID | Charges | Age | Race
> 1          | 100     | 0   | Black
> 2          | 500     | 3   | White
> 3          | 200     | 5   | Hispanic
> 4          | 90      | 7   | Asian
> 5          | 400     | 10  | Hispanic
> 6          | 500     | 16  | Black
>
> I'm trying to create three age categories--"0 to 4", "5 to 11" and "12 to
> 17"--and analyze their "Charges" by their "Race." How do I go about
> doing this?
>
> Thanks for any assistance!
>
>
> Sam
>
>