[R] Type of multi-valued variable
Frank E Harrell Jr
fharrell at virginia.edu
Sat Feb 15 20:05:03 CET 2003
On Sat, 15 Feb 2003 14:41:09 +0100
Fan <xiao.gang.fan1 at libertysurf.fr> wrote:
> Thanks to Frank for pointing out that. There're so many "misc" in the
> package hmisc, I haven't yet explored all the functionalities !
>
> The implementation of mChoice / summary() is very interesting, and it could
> be a good starting point for adding more functionalities on the class mChoice.
>
> I'm having a little question on the usage of the function summary.formula() in hmisc:
> how to get the cross tabulation result as an array, as xtabs does?
>
> For example, suppose "titanic" is a dataset as the following:
> > str(titanic)
> `data.frame': 1313 obs. of 11 variables:
> $ pclass : Factor w/ 3 levels "1st","2nd","3rd": 1 1 1 1 1 1 1 1 1 1 ...
> $ survived : int 1 0 0 0 1 1 1 0 1 0 ...
> $ sex : Factor w/ 2 levels "female","male": 1 1 2 1 2 2 1 2 1 2 ...
> $ age : num 29.000 2.000 30.000 25.000 0.917 ...
> ...
>
> > ftable(xtabs( ~ sex + pclass + survived, data=titanic))
>               survived   0   1
> sex    pclass
> female 1st               9 134
>        2nd              13  94
>        3rd             134  79
> male   1st             120  59
>        2nd             148  25
>        3rd             440  58
>
> My question is how to get that with hmisc::summary() ?
> (survived could be a mChoice variable)
>
> Thanks in advance
> --
> Fan
>
> >
> > On Mon, 10 Feb 2003 21:51:50 +0100
> > Fan <xiao.gang.fan1 at libertysurf.fr> wrote:
> >
> > > Hi,
> > >
> > > I've read in the past a thread in the R discussion list
> > > about the multi-valued type variable (what was called checklist).
> > > At the moment Gregory had intention to add some general code
> > > in his gregmisc package.
> > >
> > > I'm wondering if there's some general code / packages available ?
> > >
> > > A general class for taking account this type of variable
> > > would be very useful in the domain of survey processings,
> > > as multi-responses questions are often used.
> > > The simple operations applied to these variables are holecount,
> > > cross tabulations with others variables, transformation to single
> > > coded variables like number of responses, etc.
> > >
> > > Thanks in advance for any help
> > > --
> > > Fan
> > >
> >
> > Fan, Take a look at p. 38-44 of http://hesweb1.med.virginia.edu/biostat/s/doc/summary.pdf where examples of the mChoice (multiple choice) function in Hmisc are given.
Hello Fan,
[This reminds me that I forgot to mail you a paper I promised. I will do that on Monday. Sorry.]

For cross-classification, summarize in Hmisc is favored over summary(..., method='cross'). Also, summary(..., method='cross') will not handle mChoice variables until I make a small change to use the new function described below. If you define
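as.character.mChoice <- function(x) {
  lev <- dimnames(x)[[2]]   # choice labels (column names)
  d <- dim(x)
  w <- rep('', d[1])        # one (initially empty) string per row
  for(j in 1:d[2]) {
    # append label j where chosen, comma-separated after the first
    w <- paste(w, ifelse(w != '' & x[,j], ',', ''),
               ifelse(x[,j], lev[j], ''), sep='')
  }
  w
}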
you can add the line
  if(inherits(xi,'mChoice')) xi <- as.character(xi) else
before
  if(is.matrix(xi) && ncol(xi) > 1)
in summary.formula and obtain (ugly) output with method='cross'.

Defining as.character.mChoice will also fix summarize (here I'm using the titanic3 data frame):
n <- nrow(titanic3)
set.seed(1)
w <- c('good','bad','ugly')
a <- factor(sample(w,n,TRUE))
b <- factor(sample(w,n,TRUE))
m <- mChoice(a,b)
table(as.character(m))
      bad  bad,good  bad,ugly      good good,ugly      ugly
      146       275       284       150       319       135
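As an aside, the comma-separated labels tabulated above can be reproduced without Hmisc. The following is a minimal sketch that assumes an mChoice object behaves like a logical matrix with one column per choice, as the code for as.character.mChoice implies; the matrix x and the helper name collapse_choices are illustrative and not part of Hmisc.

```r
# Hypothetical stand-in for an mChoice object: a logical matrix whose
# columns are the choices and whose rows are the respondents.
x <- matrix(c(TRUE,  FALSE, TRUE,
              FALSE, FALSE, TRUE,
              TRUE,  TRUE,  FALSE,
              FALSE, FALSE, FALSE),
            ncol = 3, byrow = TRUE,
            dimnames = list(NULL, c("bad", "good", "ugly")))

# Same logic as as.character.mChoice, written as a plain function
# (collapse_choices is an illustrative name, not an Hmisc function).
collapse_choices <- function(x) {
  lev <- colnames(x)
  w <- rep("", nrow(x))
  for (j in seq_len(ncol(x))) {
    # add a comma only when the row already has a label and picks choice j
    sep <- ifelse(w != "" & x[, j], ",", "")
    w <- paste0(w, sep, ifelse(x[, j], lev[j], ""))
  }
  w
}

labels <- collapse_choices(x)
print(labels)
print(table(labels))
```

A row with no choices selected collapses to the empty string, which table() then counts as its own category.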
attach(titanic3)
summarize(survived,llist(sex,pclass,m),
          function(y)c(died=sum(y==0),lived=sum(y==1)))
      sex pclass         m survived lived
1  female    1st       bad        0    14
2  female    1st  bad,good        1    28
3  female    1st  bad,ugly        0    34
4  female    1st      good        3    21
5  female    1st good,ugly        1    33
6  female    1st      ugly        0     9
7  female    2nd       bad        2    13
8  female    2nd  bad,good        1    28
9  female    2nd  bad,ugly        4    13
10 female    2nd      good        1     9
11 female    2nd good,ugly        4    19
. . . .
Here m is the multiple choice variable, not survived, but you get the idea.
These changes will be in the next version of Hmisc.
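
Regarding the original question of getting an array as xtabs does: one possible route, sketched here with base R only, is to stack the died and lived columns of a summarize-style data frame and pass the counts to xtabs. The data frame s and its counts below are made up for illustration; they are not the titanic3 results.

```r
# Toy long-format summary shaped like summarize() output;
# the counts are invented for illustration only.
s <- expand.grid(sex = c("female", "male"),
                 pclass = c("1st", "2nd", "3rd"),
                 stringsAsFactors = TRUE)
s$died  <- c(1, 10, 2, 20, 3, 30)
s$lived <- c(9, 5, 8, 4, 7, 3)

# Stack the two count columns so the outcome becomes a factor,
# then let xtabs() sum the counts into a 3-way array.
long <- data.frame(sex = rep(s$sex, 2),
                   pclass = rep(s$pclass, 2),
                   outcome = factor(rep(c("died", "lived"), each = nrow(s))),
                   n = c(s$died, s$lived))
arr <- xtabs(n ~ sex + pclass + outcome, data = long)

print(dim(arr))
print(ftable(arr))
```

The result can be indexed like any array, for example arr["female", "1st", "lived"].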
--
Frank E Harrell Jr Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine http://hesweb1.med.virginia.edu/biostat