[R] Percentages for categorical data by group

Michael Conklin michael.conklin at markettools.com
Fri May 23 17:33:25 CEST 2008


tapply(example.data$responseVar, example.data$groupVar,
       function(x) prop.table(table(x)))
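
For anyone who wants the result as a single group-by-response table of percentages rather than a list of per-group proportions, here is a minimal sketch of one alternative. It is not the only way to do this. The data frame `toy` and its values are made up for illustration and are not the poster's data; only the column names groupVar and responseVar come from the question. It relies on base R's table() and prop.table() with margin = 1, and multiplies by 100 because prop.table() returns proportions, not percentages.

```r
# Hypothetical toy data (not the poster's data): two groups, responses 1-3
toy <- data.frame(
  groupVar    = c(101, 101, 101, 101, 202, 202),
  responseVar = c(1, 1, 2, 3, 2, 2)
)

# Cross-tabulate group (rows) by response (columns)
counts <- table(toy$groupVar, toy$responseVar)

# margin = 1 normalises within each row, so every group sums to 1;
# scaling by 100 turns the proportions into percentages
pct <- 100 * prop.table(counts, margin = 1)

print(round(pct, 1))
```

One difference from the tapply() version: the cross-tabulation keeps a column for every response seen anywhere in the data, so a response that a group never chose shows up as 0, whereas table(x) inside tapply() only reports the responses that actually occur in that group.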

Michael Conklin

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Economics Guy
Sent: Friday, May 23, 2008 9:52 AM
To: r-help at stat.math.ethz.ch
Subject: [R] Percentages for categorical data by group

I can think of several brute-force ways to hard-code what I want, but I
imagine there is a command or two that can be easily combined to do
this:

I have a data frame with about 23000 observations. The first variable is
the group to which the observation belongs (about 500 different groups).
The second variable is a response for each observation that is a 1, 2, 3,
4 or 5. I want to be able to calculate the percentage of each group that
chose each response. For example, I want to know what percentage of
group 1 (whose ID may be a value such as 34456) chose response 1, and so
on.

Here is some code I wrote that generates a data frame like the one I have.

pop <- matrix(1:100000)
groupIDs <- sample(pop,500)
groupVar <- sample(groupIDs,23000,replace=TRUE)
responseVar <- sample(1:5,23000,replace=TRUE)

example.data <- data.frame(groupVar,responseVar)

Is there a fast way to calculate these percentages beyond writing loops to
manually count the responses for each of the groups?

Thanks,

EG



