[R] contingency tables in R

Mark Myatt mark at myatt.demon.co.uk
Mon Apr 16 14:14:38 CEST 2001

Kurt Hornik <Kurt.Hornik at ci.tuwien.ac.at> writes:
>>>>>> Patrick Ball writes:
>> Dear List:
>> Most of the analysis I do involves contingency tables.  I am migrating
>> to R from Stata and I have a number of questions about using
>> contingency tables in R.  I suspect that most of the things I want to
>> do are very short R scripts that people on this list probably have.  I
>> wonder if you would be willing to share them.
>> First, the presentation of tables by table() is not analysis-ready.
>> Is there a way to output the table with the marginals, by cell, row or
>> column proportions, with the test statistics (especially the chi^2 and
>> the log-likelihood chi^2), residuals, cross product, and odds ratio?
>Not in one monolithic function, I think, and I am not sure I would like
>to have such a thing, see below.  But the pieces are all there:
>* Use margin.table() and prop.table() to obtain margins and proportions,
>  respectively.
>* Use chisq.test() [in package ctest] for the chisq analysis (test
>  statistic, p-value, chisq residuals)
>* Use loglin() for the LR chisq and residuals.
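
The pieces listed above can be combined roughly as follows. This is only a minimal sketch on a made-up 2 x 2 table (the counts and dimension names are illustrative assumptions, not data from this thread); it uses margin.table(), prop.table(), chisq.test() and loglin() from base R.

```r
## A small 2 x 2 table of made-up counts (illustrative data only)
tab <- matrix(c(10, 20, 30, 40), nrow = 2,
              dimnames = list(exposure = c("yes", "no"),
                              outcome  = c("case", "control")))

row.tot <- margin.table(tab, 1)   # row totals
col.tot <- margin.table(tab, 2)   # column totals
cell.p  <- prop.table(tab)        # cell proportions (sum to 1)
row.p   <- prop.table(tab, 1)     # row proportions (each row sums to 1)
col.p   <- prop.table(tab, 2)     # column proportions (each column sums to 1)

## Pearson chi-square without the continuity correction
pearson <- chisq.test(tab, correct = FALSE)

## Independence model: loglin() returns both the likelihood-ratio (lrt)
## and the Pearson statistic, plus the degrees of freedom
fit  <- loglin(tab, margin = list(1, 2), fit = TRUE, print = FALSE)
lr.p <- 1 - pchisq(fit$lrt, fit$df)   # p-value for the LR chi-square
```

The Pearson statistic from loglin() should agree with the uncorrected chisq.test() value, which is a quick sanity check that both are fitting the same independence model.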

Most of this is in table() and chisq.test(). Here is an example from

twoWay <- function( x=NA, y=NA, userDefined=NA ){

  ## test the cross-tabulation of x and y unless a table was supplied;
  ## identical() avoids calling if() on a whole matrix of is.na() values
  if (identical(userDefined, NA)){
    result <- chisq.test(table(x,y))
  } else {
    result <- chisq.test(userDefined)
  }

  print (result)
  observed <- result$observed
  expected <- result$expected
  ## each cell's contribution to the chi-square statistic
  chi.table <- ((observed - expected)^2)/expected
  row.sum <- apply(observed,1,sum)
  col.sum <- apply(observed,2,sum)
  N <- sum(observed)

  ## put in the marginals and names  ... create fullArray
  fullArray <- cbind(observed,row.sum)
  fullArray <- rbind(fullArray,c(col.sum,N))
  rownames(fullArray) <- c(rownames(observed), "Total")
  colnames(fullArray) <- c(colnames(observed), "Total")

  ## make the tables of proportions
  proportion <- fullArray/N
  row.proportion <- fullArray/c(row.sum,N)
  col.proportion <- t(t(fullArray)/c(col.sum, N))

  return(list(fA=fullArray, e=expected, ct=chi.table, p=proportion,
              rp=row.proportion, cp=col.proportion))
}

This needs a nice print() method.
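
One way such a print() method might look is sketched below. The class name "twoWay", the helper asTwoWay() and the layout are assumptions of mine for illustration; the sketch only relies on the fA and rp elements of the returned list, and it builds a stand-in result by hand so that it runs on its own.

```r
## Hypothetical helper: tag a list shaped like twoWay()'s return value
## with an S3 class so that print() dispatches to print.twoWay().
asTwoWay <- function(res) {
  class(res) <- "twoWay"
  res
}

## Assumed print method: counts with margins, then row percentages.
print.twoWay <- function(x, digits = 1, ...) {
  cat("Observed counts (with totals):\n")
  print(x$fA)
  cat("\nRow percentages:\n")
  print(round(100 * x$rp, digits))
  invisible(x)
}

## Stand-in result built by hand (made-up counts) so the sketch is runnable
obs <- matrix(c(10, 20, 30, 40), nrow = 2,
              dimnames = list(c("a", "b"), c("c", "d")))
fA <- rbind(cbind(obs, Total = rowSums(obs)),
            Total = c(colSums(obs), sum(obs)))
res <- asTwoWay(list(fA = fA, rp = fA / fA[, "Total"]))
print(res)
```

Returning the object invisibly follows the usual convention for print methods, so the result can still be assigned or passed on.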

>* Not sure about which odds ratios you want.  Function mantelhaen.test()
>  in package ctest does exact conditional ones for 2 by 2 tables.

And fisher.test(). Have a look at my intro text
(http://www.myatt.demon.co.uk) for examples of calculating RRs and ORs
in tabulating functions.
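
As a rough illustration of the odds ratio and relative risk for a 2 x 2 table, here is a minimal sketch on made-up counts (the table and its labels are assumptions for the example). Note that fisher.test() reports the conditional maximum likelihood estimate of the odds ratio, which generally differs a little from the simple cross-product ratio.

```r
## Made-up 2 x 2 table: rows are exposure, columns are disease status
tab <- matrix(c(12, 5, 8, 15), nrow = 2,
              dimnames = list(exposed = c("yes", "no"),
                              disease = c("yes", "no")))

## Cross-product (sample) odds ratio: (a * d) / (b * c)
or.cp <- (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])

## Relative risk: risk of disease in the exposed over risk in the unexposed
risk <- tab[, "yes"] / rowSums(tab)
rr   <- risk[["yes"]] / risk[["no"]]

## Exact test, conditional MLE of the odds ratio, and its confidence interval
ft <- fisher.test(tab)
ft$estimate
ft$conf.int
```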

>It really depends on how your data is set up.  If you have the raw
>values in a data frame, I would actually recommend using xtabs() rather
>than table().  Try e.g.
>     data(esoph)
>     x <- xtabs(cbind(ncases, ncontrols) ~ ., data = esoph)
>     x
>     summary(x)
>the last one prints ``useful'' summary information.
>To obtain pretty-printed output from multi-way tables, use ftable().
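
A short sketch of ftable() on the same esoph data follows. The choice of a three-way table of case counts and of which factors go down the rows is mine, for illustration only.

```r
data(esoph)

## Three-way table of case counts by age, alcohol and tobacco group
x3 <- xtabs(ncases ~ agegp + alcgp + tobgp, data = esoph)

## Flatten it: age and alcohol down the rows, tobacco across the columns
ft <- ftable(x3, row.vars = c("agegp", "alcgp"))
ft
```
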
>> I also like to make tables that have summary statistics of a given
>> variable in the columns (mean, s.d., etc.)  with each row being the
>> value for a sub group of the data.  How do you do this in R?
>Use aggregate().

Or by().
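
For example, a table with one row per subgroup and the mean and s.d. in columns could be put together along these lines. This is a minimal sketch using the built-in warpbreaks data, which I picked only because it ships with R.

```r
data(warpbreaks)

## Mean and s.d. of breaks for each wool/tension subgroup
means <- aggregate(breaks ~ wool + tension, data = warpbreaks, FUN = mean)
sds   <- aggregate(breaks ~ wool + tension, data = warpbreaks, FUN = sd)

## One row per subgroup, one column per statistic
summ <- merge(means, sds, by = c("wool", "tension"),
              suffixes = c(".mean", ".sd"))
summ

## by() applies a function to each subgroup and prints the results in turn
by(warpbreaks$breaks, warpbreaks[, c("wool", "tension")], summary)
```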

>> The most complicated piece of this is contingency tables done with
>> sample data.  The sampling involves several strata with different
>> sampling weights.  Calculating the cell (or row or column)
>> probabilities is relatively easy, but the other statistics can be
>> complicated (the design effect, the finite population correction, the
>> various chi^2s, and the standard errors and confidence
>> intervals). Also, I sometimes make these tables with summary
>> statistics in place of counts or population proportions.
>> Is there any way to do this stuff in R without hacking it all myself?
>The pieces are all there, I think, and it should be fairly simple to
>combine them to reflect your personal preferences for displaying
>categorical information etc.

Yes, the key bits are all there. It should not take too long to get a
function that meets your own needs.
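
As a starting point for the weighted case, the cell, row and column proportions can be had by summing the sampling weights with xtabs(). This is a sketch on a made-up data frame (the variable names and weights are assumptions), and it gives point estimates only; the standard errors, design effects and adjusted chi-squares would still need design-based calculations that are not shown here.

```r
## Made-up sample: two strata with different sampling weights
d <- data.frame(
  stratum = c("s1", "s1", "s1", "s2", "s2", "s2"),
  sex     = c("f", "m", "f", "m", "f", "m"),
  outcome = c("yes", "no", "no", "yes", "yes", "no"),
  w       = c(10, 10, 10, 50, 50, 50)
)

## Weighted counts: each cell is the sum of the weights falling in it
wtab <- xtabs(w ~ sex + outcome, data = d)

wcell <- prop.table(wtab)      # estimated population cell proportions
wrow  <- prop.table(wtab, 1)   # estimated row proportions
wcol  <- prop.table(wtab, 2)   # estimated column proportions
```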


Mark Myatt
