[R] Crosstabbing multiple response data

Tue Feb 27 14:33:18 CET 2007

--- John Kane <jrkrideau at yahoo.ca> wrote:

> Thanks to everyone for this.  I was looking at the
> same problem last night and was just going to write
> a posting to R-help when I saw this.
> 
> 
> --- Michael Wexler <wexler at yahoo.com> wrote:
> 
> > 
> > Thanks to Charles, Gabor, and a private message from
> > Frank E Harrell with some good ideas and help.  This
> > crossprod approach was very clever; I would never
> > have thought of it.
> > 
> > Best, Michael
> > 
> > 
> > ----- Original Message ----
> > From: Charles C. Berry <cberry at tajo.ucsd.edu>
> > To: Michael Wexler <wexler at yahoo.com>
> > Cc: r-help at stat.math.ethz.ch
> > Sent: Thursday, February 22, 2007 1:17:44 PM
> > Subject: Re: [R] Crosstabbing multiple response data
> > 
> > 
> > > res <- crossprod( as.matrix( ratings[ , -1] ) )
> > > diag(res) <- ""
> > > print(res, quote=FALSE)
> >       att1 att2 att3
> > att1      2    1
> > att2 2         2
> > att3 1    2
> > > 
> > > res2 <- crossprod(as.matrix( ratings[ , -1])) * 100 / nrow( ratings )
> > > res2[] <- paste( res2, "%", sep="" )
> > > diag(res2) <- ""
> > > print(res2, quote=FALSE)
> >       att1 att2 att3
> > att1      50%  25%
> > att2 50%       50%
> > att3 25%  50%
> > >
> > 
> > Be sure to bone up on format and sprintf before
> > taking this into 
> > production.
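
One possible way to act on the format/sprintf advice, offered as a rough
sketch rather than production code. It assumes the values printed in the
0/1 table of the original question; pct and pct_txt are made-up names.
sprintf() fixes the number of decimals, whereas paste() would print the
full decimal expansion when the percentage is not a whole number:

```r
# Data as printed in the table of the original question (0/1 coded)
ratings <- data.frame(id   = 1:4,
                      att1 = c(1, 1, 0, 1),
                      att2 = c(1, 0, 1, 1),
                      att3 = c(0, 0, 1, 1))

X <- as.matrix(ratings[, -1])
pct <- crossprod(X) * 100 / nrow(ratings)   # numeric percentages

# Format into a character copy; "%%" is a literal percent sign.
# Assigning with [] keeps the dimensions and dimnames of the matrix.
pct_txt <- pct
pct_txt[] <- sprintf("%.0f%%", pct)
diag(pct_txt) <- ""                          # blank out the diagonal

print(pct_txt, quote = FALSE)
```

Changing "%.0f%%" to, say, "%.1f%%" would give one decimal place.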
> > 
> > On Thu, 22 Feb 2007, Michael Wexler wrote:
> > 
> > > Using R version 2.4.1 (2006-12-18) on Windows, I have a dataset
> > > which resembles this:
> > >
> > > id    att1    att2    att3
> > > 1    1        1        0
> > > 2    1        0        0
> > > 3    0        1        1
> > > 4    1        1        1
> > >
> > > ratings <- data.frame(id = c(1,2,3,4), att1 = c(1,1,0,1),
> > >                       att2 = c(1,0,1,1), att3 = c(0,0,1,1))
> > >
> > > I would like to get a cross tab of counts of co-occurrence,
> > > which might resemble this:
> > >
> > >         att1    att2    att3
> > > att1            2       1
> > > att2    2               2
> > > att3    1       2
> > >
> > > with the hope of understanding, at least pairwise, what things
> > > "hang together".   (Yes, there are much, much better ways to do
> > > this statistically including clustering and binary corrected
> > > correlation, but the audience I am working with asked for this
> > > version for a specific reason.)
> > >
> > > (Later on, I would also like to convert to percentages of the
> > > total unique population, so the final version of the table
> > > would be
> > >
> > >         att1    att2    att3
> > > att1            50%     25%
> > > att2    50%             50%
> > > att3    25%     50%
> > >
> > > But I can do this in Excel if I can get the first table out.)
> > >
> > > I have tried the reshape library, but could not get anything
> > > resembling this (both on its own, as well as feeding in to
> > > table()).  (I have also played with transposing and using some
> > > comments from this list from 2002 and 2004, but the questioners
> > > appear to assume more knowledge than I have in use of R; the
> > > example in the posting guide was also more complex than I was
> > > ready for, I'm afraid.)
> > >
> > > Sample of some of my efforts:
> > > library(reshape)
> > > melt(ratings,id=c("id"))
> > >
> > > ds1 <- melt(ratings,id=c("id"))
> > > # returns only rowcounts, 3 along diagonal
> > > table(ds1$variable, ds1$variable)
> > > # returns only a single row of collapsed counts, appears to not
> > > # allow 1 variable in multiple uses
> > > xtabs(formula = value ~ ds1$variable + ds1$variable, data=ds1)
> > >
> > > I suspect I am close, so any nudges in the right direction would
> > > be helpful.
> > >
> > > Thanks much, Michael
> > >
> > > PS: www.rseek.org is very impressive, I heartily encourage its use.
> > >
> > >
> > >     [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > R-help at stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained,
> > reproducible code.
> > >
> > 
> > Charles C. Berry                        (858) 534-2098
> >                                          Dept of Family/Preventive Medicine
> > E mailto:cberry at tajo.ucsd.edu         UC San Diego
> > http://biostat.ucsd.edu/~cberry/         La Jolla, San Diego 92093-0901
> > 