[R] the dimname of a table
Gabor Grothendieck
ggrothendieck at gmail.com
Thu Jun 23 23:00:38 CEST 2005
On 6/23/05, Marc Schwartz <MSchwartz at mednetstudy.com> wrote:
> On Thu, 2005-06-23 at 23:12 +0800, ronggui wrote:
> > I have a data frame (dat) with many variables, and I use the
> > following script to get the cross-tables.
> >
> > > danx2 <- c("x1.1","x1.2","x1.3","x1.4","x1.5","x2","x4","x5","x6","x7",
> >              "x8.1","x8.2","x8.3","x8.4","x11","x13","x17","x19","x20","x21")
> > > indep <- c("x23","x24","x25","x26","x27","x28.1","x28.2","x29")
> > > for (k in indep) {
> >     for (i in danx2) {
> >       # keep only pairs with a significant chi-squared test
> >       a <- chisq.test(dat[,i], dat[,k])$p.value <= 0.05
> >       if (a) {
> >         CrossTable(dat[,i], dat[,k], chisq=TRUE, format="SPSS")
> >         cat(rep("=",50), "\n", "\n")
> >       }
> >     }
> >   }
> >
> > It has a small pitfall: the dimnames of the table are dat[,i] and
> > dat[,k], but I want them to be like x2, x23, ...
> > Is there a good way to do this?
> > And in the command CrossTable(dat[,i],dat[,k],chisq=T,format="SPSS")
> > in the loop, is there any other way to refer to the variables than
> > dat[,i] and dat[,k]?
> > Thank you!
>
> Hi,
>
> I am in between meetings here. Sorry for the delay in my reply to your
> query.
>
> The best solution is for me to add two new args to CrossTable() to allow
> you to specify these names explicitly. At present the function simply
> takes the x and y args and does:
>
> RowData <- deparse(substitute(x))
> ColData <- deparse(substitute(y))
>
> The result is that whatever is passed as the x and y arguments will be
> used as the titles for the row and column labels, as you have noted.
>
> In the meantime, I am attaching an update to CrossTable (which I have
> not extensively tested yet), that you can source() into R via the
> console. The update has two new args called "RowData" and "ColData"
> which will default to NULL, so as to not impact current default
> behavior. You can then set these as part of your loop by passing the
> index values.
>
>
> Using one of the examples in ?CrossTable:
>
> > CrossTable(infert$education, infert$induced, RowData = "Education",
> ColData = "Induced")
>
>
>    Cell Contents
> |-------------------------|
> |                       N |
> | Chi-square contribution |
> |           N / Row Total |
> |           N / Col Total |
> |         N / Table Total |
> |-------------------------|
>
>
> Total Observations in Table:  248
>
>
>              | Induced
>    Education |         0 |         1 |         2 | Row Total |
> -------------|-----------|-----------|-----------|-----------|
>       0-5yrs |         4 |         2 |         6 |        12 |
>              |     1.232 |     0.506 |     9.898 |           |
>              |     0.333 |     0.167 |     0.500 |     0.048 |
>              |     0.028 |     0.029 |     0.162 |           |
>              |     0.016 |     0.008 |     0.024 |           |
> -------------|-----------|-----------|-----------|-----------|
>      6-11yrs |        78 |        27 |        15 |       120 |
>              |     1.121 |     1.059 |     0.471 |           |
>              |     0.650 |     0.225 |     0.125 |     0.484 |
>              |     0.545 |     0.397 |     0.405 |           |
>              |     0.315 |     0.109 |     0.060 |           |
> -------------|-----------|-----------|-----------|-----------|
>      12+ yrs |        61 |        39 |        16 |       116 |
>              |     0.518 |     1.627 |     0.099 |           |
>              |     0.526 |     0.336 |     0.138 |     0.468 |
>              |     0.427 |     0.574 |     0.432 |           |
>              |     0.246 |     0.157 |     0.065 |           |
> -------------|-----------|-----------|-----------|-----------|
> Column Total |       143 |        68 |        37 |       248 |
>              |     0.577 |     0.274 |     0.149 |           |
> -------------|-----------|-----------|-----------|-----------|
>
>
> Let me know if this works or you find a problem. I will do further
> testing here as soon as time permits and get an update to Greg and Nitin
> to include into gregmisc.
>
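
As a small illustration of the labelling behaviour Marc describes, the
sketch below uses a hypothetical helper, label_of(), which is not part of
gmodels. It only mimics the deparse(substitute(x)) idiom quoted above, to
show why the loop produces labels such as dat[,i] rather than x2.

```r
# label_of() is a made-up stand-in that captures its argument's
# expression as text, the same way the quoted CrossTable code does.
label_of <- function(x) deparse(substitute(x))

dat <- data.frame(x2 = 1:3)   # toy data, assumed for illustration
i <- "x2"

label_of(dat[, i])   # the text of the expression, not the column name
label_of(dat$x2)     # again the expression as typed
```

The label is whatever expression appears in the call, so any fix has to
either pass the name separately or change the expression itself.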
1. Assuming that the names of the data frame, dat, are
set as desired in the output, then with the RowData= and ColData=
arguments implemented the key portion of the poster's problem
could be written, for numeric column indices i and j, as

   CrossTable(dat[,i], dat[,j],
              RowData = names(dat)[i], ColData = names(dat)[j])

2. However, instead of naming the new args RowData= and
ColData=, they might be named consistently with the table
function, using a single argument named dnn= . In that case the
above could be shortened to:

   CrossTable(dat[,i], dat[,j], dnn = names(dat)[c(i,j)])

3. Even better would be to allow a data frame argument, with
automatic use of the names in that data frame, in which
case the example becomes just:

   CrossTable(dat[,c(i,j)])
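
For comparison, base R's table() already behaves in the way suggestions
2 and 3 propose. The sketch below uses table() only, not CrossTable, and
the toy data frame is made up for illustration.

```r
# Toy data, assumed for illustration; column names mimic the poster's.
dat <- data.frame(x2  = c("a", "b", "a", "b"),
                  x5  = c(1, 1, 2, 2),
                  x23 = c("u", "u", "v", "v"))
i <- 1; j <- 3

# Suggestion 2: pass the dimension names explicitly via dnn=
tab_dnn <- table(dat[, i], dat[, j], dnn = names(dat)[c(i, j)])
names(dimnames(tab_dnn))

# Suggestion 3: pass a data frame and let table() pick up its names
tab_df <- table(dat[, c(i, j)])
names(dimnames(tab_df))
```

In both cases the dimnames are taken from the column names rather than
from the expressions used in the call.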

By the way, here is a solution that can be used even with
the existing version of CrossTable. The first portion just
sets up test data, assuming we want to specify columns 1 and
3. The second portion substitutes the column names into the
expression, giving s, and then evaluates s in the context of dat.
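
   library(gmodels)

   # test data; we want to cross-tabulate columns 1 and 3
   dat <- data.frame(a = 1:2, b = 1:2, c = 1:2)
   i <- 1; j <- 3

   # substitute the column names (as symbols) into the call, giving s
   nm <- lapply(names(dat), as.name)
   s <- substitute(CrossTable(coli, colj, chisq=TRUE, format="SPSS"),
                   list(coli=nm[[i]], colj=nm[[j]]))

   # evaluate s in the context of dat, so the labels read "a" and "c"
   eval(s, dat)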
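
The same idea might be applied to the poster's loop, where i and k are
already column names (character strings) rather than numeric indices.
This is only a sketch: make_call() is a hypothetical helper, the toy data
are made up, and the gmodels:: prefix is used so that the package does
not need to be attached.

```r
# Toy data, assumed for illustration only.
dat <- data.frame(x2  = c(1, 2, 1, 2, 1, 2),
                  x4  = c(1, 1, 2, 2, 1, 2),
                  x23 = c(1, 1, 2, 2, 2, 1))
danx2 <- c("x2", "x4")
indep <- c("x23")

# make_call() (hypothetical) builds the CrossTable call with the column
# names substituted in as symbols, so the labels become x2, x23, ...
make_call <- function(row, col) {
  substitute(gmodels::CrossTable(x, y, chisq = TRUE, format = "SPSS"),
             list(x = as.name(row), y = as.name(col)))
}

for (k in indep) {
  for (i in danx2) {
    s <- make_call(i, k)
    # evaluate in the context of dat, but only if gmodels is installed
    if (requireNamespace("gmodels", quietly = TRUE)) eval(s, dat)
  }
}
```

The chisq.test() filter from the original loop is left out here for
brevity; it could be put back around the eval() call.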