[R] counts and percentage of multiple categorical columns in R

arun smartpink111 at yahoo.com
Sun Dec 29 20:56:54 CET 2013


Hi,
Another way is:
# dat1 and res2 are defined in the earlier message quoted below
vec1 <- unique(unlist(dat1))
res3 <- as.data.frame(t(sapply(dat1, function(x) {
  # count every brand, including brands absent from this column
  counts <- table(factor(x, levels = vec1))
  percentage <- (counts / sum(counts)) * 100
  paste0(counts, "(", percentage, ")")
})))
colnames(res3) <- vec1

identical(res3, as.data.frame(res2))
#[1] TRUE
A.K.
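
A possible variation on the approach above, sketched under the assumption that the percentages will not always be round numbers: keep the counts and percentages numeric and format them with sprintf(), so the number of decimals is explicit. The names brands, counts, pct and res4 are chosen here for illustration and do not appear in the thread; the data is repeated from the message quoted below so that the block runs on its own.

```r
# Same data as in the message quoted below
dat1 <- read.table(text = "fatfreemilk fatmilk halfmilk 2fatmilk
A A A A
A B B A
B A A A
C C C C
D A A A
A E A E
C A B A
A A A A
A B B A
A A B E", header = TRUE, stringsAsFactors = FALSE, check.names = FALSE)

# All brands seen anywhere in the data (sorted here; the thread uses unique() only)
brands <- sort(unique(unlist(dat1)))

# One row per milk type, one column per brand; absent brands count as 0
counts <- t(sapply(dat1, function(x) table(factor(x, levels = brands))))

# Row-wise percentages, kept numeric
pct <- prop.table(counts, margin = 1) * 100

# Format as "count(percent)"; "%.0f" prints no decimals
res4 <- counts
res4[] <- sprintf("%d(%.0f)", counts, pct)
res4
```

Changing "%.0f" to "%.1f" would print one decimal place instead, which may matter when the number of rows is not a multiple of ten.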




On Sunday, December 29, 2013 2:53 PM, arun <smartpink111 at yahoo.com> wrote:
Hi,
Try:
dat1 <- read.table(text="fatfreemilk fatmilk halfmilk 2fatmilk
A A A A
A B B A
B A A A
C C C C
D A A A
A E A E
C A B A
A A A A
A B B A
A A B E",sep="",header=TRUE,stringsAsFactors=FALSE,check.names=FALSE)
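dat2 <- dat1
dat2$id <- 1:nrow(dat2)
library(reshape2)
# long format (variable, value), then count brands per milk type
res <- dcast(melt(dat2, id.vars = "id")[, -1], variable ~ value, length)
row.names(res) <- res[, 1]
res1 <- res[, -1]
res2 <- as.matrix(res1)
# append the row-wise percentage to each count, e.g. "6(60)"
res2[] <- paste0(res2, "(", (res2 / rowSums(res2)) * 100, ")")
as.data.frame(res2)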
#                A     B     C     D     E
#fatfreemilk 6(60) 1(10) 2(20) 1(10)  0(0)
#fatmilk     6(60) 2(20) 1(10)  0(0) 1(10)
#halfmilk    5(50) 4(40) 1(10)  0(0)  0(0)
#2fatmilk    7(70)  0(0) 1(10)  0(0) 2(20)
A.K.
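
For readers without reshape2, the counting step above can be sketched in base R by building the long format by hand and cross-tabulating it. This is only an illustration, not the method posted in the thread; the names long and tab are made up here, and the data is repeated so the block is self-contained.

```r
dat1 <- read.table(text = "fatfreemilk fatmilk halfmilk 2fatmilk
A A A A
A B B A
B A A A
C C C C
D A A A
A E A E
C A B A
A A A A
A B B A
A A B E", header = TRUE, stringsAsFactors = FALSE, check.names = FALSE)

# Long format: one row per (milk type, chosen brand) pair.
# unlist() walks the data frame column by column, so each column name
# is repeated nrow(dat1) times to line up with the values.
long <- data.frame(
  variable = factor(rep(names(dat1), each = nrow(dat1)), levels = names(dat1)),
  value = unlist(dat1, use.names = FALSE)
)

# Cross-tabulate: rows are milk types (original column order), columns are brands
tab <- table(long$variable, long$value)
tab

# Row-wise percentages
prop.table(tab, margin = 1) * 100
```

Setting the factor levels to names(dat1) keeps the rows in the original column order; without it, table() would sort the milk types alphabetically.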




On Sunday, December 29, 2013 1:07 PM, Jingxia Lin <jingxia08 at gmail.com> wrote:
Dear R helpers,

I have a data sheet (“milk”) with four types of milk from five brands (A,
B, C, D, E). Each column shows the brand that each customer chose for that
type of milk. The data sheet is shown below. Note that for some types of
milk, some brands are never chosen.

fatfreemilk fatmilk halfmilk 2fatmilk
A A A A
A B B A
B A A A
C C C C
D A A A
A E A E
C A B A
A A A A
A B B A
A A B E

I want to summarize each column so that, for each type of milk, I know the
counts and percentages of the brands chosen. I tried "summary" in R, but
the result is not displayed nicely. How can I display the result in a
format like the one below?
            A     B     C     D     E
fatfreemilk 6(60) 1(10) 2(20) 1(10) 0(0)
fatmilk     6(60) 2(20) 1(10) 0(0)  1(10)
halfmilk    5(50) 4(40) 1(10) 0(0)  0(0)
2fatmilk    7(70) 0(0)  1(10) 0(0)  2(20)

Thank you!


______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


