[R] Contingency tables as data frames
presnell@stat.ufl.edu
Wed Mar 1 10:25:00 CET 2000
{Again, a message that was sent to owner-r-help (which is currently me).
Why on earth?
Please reply to R-help or to the original sender, Brett Presnell.}
I'm teaching a categorical data analysis course this term, and a minor
"problem" has resurfaced that I have often thought about before. This
applies equally to Splus I suppose, but my undergrads aren't using
Splus.
It seems natural to read/represent a contingency table as a data
frame, with one column representing the cell counts, as in the example
appended below (data taken from Agresti, "An Introduction to
Categorical Data Analysis"). However, functions like ftable,
mantelhaen.test, chisq.test, fisher.test, etc. don't work naturally
with this representation, and instead require the user to first
manipulate the data, say by using tapply to convert the data into an
array. This is not difficult of course, but it's one of those things
that I'd rather not have to explain to students, who usually need to
be focusing on other things.
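For concreteness, the preprocessing step looks roughly like the following
minimal sketch. To keep it short it uses only the first two cities from the
data appended below, and the object names (smoke, tab) are purely
illustrative.

```r
# First two cities of the appended data, kept in data-frame form.
smoke <- data.frame(
  City   = rep(c("Beijing", "Shanghai"), each = 4),
  Smoker = rep(c("Yes", "Yes", "No", "No"), times = 2),
  Cancer = rep(c("Yes", "No"), times = 4),
  Count  = c(126, 100, 35, 61, 908, 688, 497, 807)
)

# Sum the counts within each Smoker x Cancer x City cell,
# which yields a 2 x 2 x 2 array with named dimensions.
tab <- with(smoke,
            tapply(Count,
                   list(Smoker = Smoker, Cancer = Cancer, City = City),
                   sum))

# The array form is what the testing functions accept.
mantelhaen.test(tab)
```

The tapply call is the part that students have to be taught before they can
run any of the tests.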
So, am I missing something obvious (not unlikely), or would it be a
good idea to extend the methods/arguments of these functions to
analyze/manipulate data represented in this way without any
preprocessing by the user? It seems that a "count" (or "weight" or
"freq" or whatever) argument would do it in most cases.
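As a rough illustration of what such an argument would have to do internally,
here is a sketch of a wrapper. The function test_with_counts and its argument
names are hypothetical, not part of R; it simply hides the tapply step and
hands the resulting array to an existing test function.

```r
# Hypothetical helper (not part of R): cross-tabulate the factor columns
# named in `factors`, summing the column named by `count`, then call `test`.
test_with_counts <- function(data, factors, count, test, ...) {
  tab <- tapply(data[[count]], data[factors], sum)
  tab[is.na(tab)] <- 0  # factor combinations absent from the data become empty cells
  test(tab, ...)
}

# One city's 2 x 2 table from the appended data.
beijing <- data.frame(
  Smoker = c("Yes", "Yes", "No", "No"),
  Cancer = c("Yes", "No", "Yes", "No"),
  Count  = c(126, 100, 35, 61)
)

res <- test_with_counts(beijing, c("Smoker", "Cancer"), "Count", fisher.test)
res$estimate  # conditional maximum likelihood estimate of the odds ratio
```

A built-in "count" argument would presumably amount to the same thing, with
the cross-tabulation done inside each function rather than in a wrapper.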
Funny, I can't help but wonder if the answer from those who have thought
about this more deeply than I have might be "it's a can of worms".
--
Brett Presnell
Department of Statistics
University of Florida
(presnell at stat.ufl.edu)
City      Smoker Cancer Count
Beijing   Yes    Yes      126
Beijing   Yes    No       100
Beijing   No     Yes       35
Beijing   No     No        61
Shanghai  Yes    Yes      908
Shanghai  Yes    No       688
Shanghai  No     Yes      497
Shanghai  No     No       807
Shenyang  Yes    Yes      913
Shenyang  Yes    No       747
Shenyang  No     Yes      336
Shenyang  No     No       598
Nanjing   Yes    Yes      235
Nanjing   Yes    No       172
Nanjing   No     Yes       58
Nanjing   No     No       121
Harbin    Yes    Yes      402
Harbin    Yes    No       308
Harbin    No     Yes      121
Harbin    No     No       215
Zhengzhou Yes    Yes      182
Zhengzhou Yes    No       156
Zhengzhou No     Yes       72
Zhengzhou No     No        98
Taiyuan   Yes    Yes       60
Taiyuan   Yes    No        99
Taiyuan   No     Yes       11
Taiyuan   No     No        43
Nanchang  Yes    Yes      104
Nanchang  Yes    No        89
Nanchang  No     Yes       21
Nanchang  No     No        36