[R] Crosstabbing multiple response data
Charles C. Berry
cberry at tajo.ucsd.edu
Thu Feb 22 19:17:44 CET 2007
> res <- crossprod( as.matrix( ratings[ , -1] ) )
> diag(res) <- ""
> print(res, quote=F)
     att1 att2 att3
att1      2    1
att2 2         2
att3 1    2
>
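As a check on why this works: crossprod(X) is t(X) %*% X, so for 0/1 columns entry [i, j] is the number of rows in which attribute i and attribute j are both 1. The sketch below compares entries against by-hand counts. It assumes the ratings values shown in the table in the question (att2 = 1,0,1,1 and att3 = 0,0,1,1); the names X, counts, by_hand and picked_att1 are introduced here for illustration only.

```r
# Ratings as shown in the table in the original question (an assumption;
# these are the values that reproduce the counts printed above).
ratings <- data.frame(id   = c(1, 2, 3, 4),
                      att1 = c(1, 1, 0, 1),
                      att2 = c(1, 0, 1, 1),
                      att3 = c(0, 0, 1, 1))

X <- as.matrix(ratings[, -1])   # drop the id column, keep the 0/1 indicators

# crossprod(X) computes t(X) %*% X
counts <- crossprod(X)

# Off-diagonal entry: respondents with both att1 and att2 set
by_hand <- sum(ratings$att1 == 1 & ratings$att2 == 1)

# Diagonal entry: respondents with att1 set at all
picked_att1 <- sum(ratings$att1)
```

The diagonal therefore holds each attribute's own total, which is why it is blanked out before printing.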
> res2 <- crossprod(as.matrix( ratings[ , -1])) * 100 / nrow( ratings )
> res2[] <- paste( res2, "%", sep="" )
> diag(res2) <- ""
> print(res2, quote=F)
     att1 att2 att3
att1      50%  25%
att2 50%       50%
att3 25%  50%
>
Be sure to bone up on format and sprintf before taking this into
production.
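
As one possible illustration of that advice, here is a sketch that builds the percentage labels with sprintf instead of paste, so the number of decimals is fixed explicitly. It assumes the ratings values from the table in the question; pct and lab are names made up for this example.

```r
# Ratings as shown in the table in the original question (an assumption).
ratings <- data.frame(id   = c(1, 2, 3, 4),
                      att1 = c(1, 1, 0, 1),
                      att2 = c(1, 0, 1, 1),
                      att3 = c(0, 0, 1, 1))

# Co-occurrence counts as a percentage of all respondents
pct <- crossprod(as.matrix(ratings[, -1])) * 100 / nrow(ratings)

# sprintf returns a plain character vector, so copy pct first to keep
# its dimensions and dimnames, then fill the copy in place.
lab <- pct
lab[] <- sprintf("%.0f%%", pct)   # "%%" is a literal percent sign
diag(lab) <- ""                   # blank out the diagonal
print(lab, quote = FALSE)
```

With data that do not divide evenly, a format such as "%.1f%%" would keep one decimal place.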
On Thu, 22 Feb 2007, Michael Wexler wrote:
> Using R version 2.4.1 (2006-12-18) on Windows, I have a dataset which resembles this:
>
> id att1 att2 att3
> 1 1 1 0
> 2 1 0 0
> 3 0 1 1
> 4 1 1 1
>
> ratings <- data.frame(id = c(1,2,3,4), att1 = c(1,1,0,1), att2 = c(1,0,1,1), att3 = c(0,0,1,1))
>
> I would like to get a cross tab of counts of co-occurrence, which might resemble this:
>
>      att1 att2 att3
> att1      2    1
> att2 2         2
> att3 1    2
>
> with the hope of understanding, at least pairwise, what things "hang together". (Yes, there are much, much better ways to do this statistically including clustering and binary corrected correlation, but the audience I am working with asked for this version for a specific reason.)
>
> (Later on, I would also like to convert to percentages of the total unique population, so the final version of the table would be
>
>      att1 att2 att3
> att1      50%  25%
> att2 50%       50%
> att3 25%  50%
>
> But I can do this in Excel if I can get the first table out.)
>
> I have tried the reshape library, but could not get anything resembling this (both on its own, as well as feeding in to table()). (I have also played with transposing and using some comments from this list from 2002 and 2004, but the questioners appear to assume more knowledge than I have in use of R; the example in the posting guide was also more complex than I was ready for, I'm afraid.)
>
> Sample of some of my efforts:
> library(reshape)
> melt(ratings,id=c("id"))
>
> ds1 <- melt(ratings,id=c("id"))
> table(ds1$variable, ds1$variable) # returns only rowcounts, 3 along diagonal
> xtabs(formula = value ~ ds1$variable + ds1$variable , data=ds1) # returns only a single row of collapsed counts, appears to not allow 1 variable in multiple uses
>
> I suspect I am close, so any nudges in the right direction would be helpful.
>
> Thanks much, Michael
>
> PS: www.rseek.org is very impressive, I heartily encourage its use.
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
Charles C. Berry                         (858) 534-2098
                                         Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu         UC San Diego
http://biostat.ucsd.edu/~cberry/         La Jolla, San Diego 92093-0901