[R] creating a contingency table from a data.frame automatically (NOT BY HAND)

arun smartpink111 at yahoo.com
Fri Aug 10 15:20:57 CEST 2012


Hi,

Try this:
n <- 100
# Simulated example data; the counts shown below will differ from run to run
dat1 <- data.frame(
  hunting.prev = sample(c("success", "fail"), n, replace = TRUE),
  groupsize    = sample(c("small", "large"), n, replace = TRUE),
  dogs         = sample(c("yes", "no"), n, replace = TRUE),
  guns         = sample(c("yes", "no"), n, replace = TRUE)
)
mytable <- xtabs(~ hunting.prev + groupsize + dogs + guns, data = dat1)

 ftable(mytable)
                            guns no yes
hunting.prev groupsize dogs            
fail         large     no         5  10
                       yes        3   9
             small     no         8   7
                       yes        6   2
success      large     no        10   3
                       yes        7  10
             small     no         7   6
                       yes        6   1
 summary(mytable)
# Call: xtabs(formula = ~hunting.prev + groupsize + dogs + guns, data = dat1)
# Number of cases in table: 100
# Number of factors: 4
# Test for independence of all factors:
#   Chisq = 16.749, df = 11, p-value = 0.1155
#   Chi-squared approximation may be incorrect
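
If you would rather have the 2 x (number of levels) layout from your post, with one row per outcome and one column per variable level, here is a minimal sketch of one way to get it. The data are simulated again, and the names dat, predictors, tabs, wide and pvals are only illustrative; with your real data you would replace dat with your own data.frame. It builds one table() per predictor and binds them side by side, then runs a separate chisq.test() per predictor as one possible way to look at each variable against the response.

```r
# Simulated stand-in for the real survey data (assumption: all columns categorical)
set.seed(1)
n <- 100
dat <- data.frame(
  hunting.prev = sample(c("success", "fail"), n, replace = TRUE),
  groupsize    = sample(c("small", "large"), n, replace = TRUE),
  dogs         = sample(c("yes", "no"), n, replace = TRUE),
  guns         = sample(c("yes", "no"), n, replace = TRUE)
)

# Every column except the response is treated as a predictor
predictors <- setdiff(names(dat), "hunting.prev")

# One response-by-predictor table per variable, with columns
# renamed to "variable-level" (e.g. "groupsize-small")
tabs <- lapply(predictors, function(v) {
  tab <- table(dat$hunting.prev, dat[[v]])
  colnames(tab) <- paste(v, colnames(tab), sep = "-")
  tab
})

# Bind the tables side by side: rows are outcomes, columns are levels
wide <- do.call(cbind, tabs)
wide

# One chi-squared test of independence per predictor (2 x k table each)
pvals <- sapply(predictors, function(v) {
  chisq.test(table(dat$hunting.prev, dat[[v]]))$p.value
})
pvals
```

Note that the columns of wide are not independent of each other (each variable's columns sum to the same row totals), so I would test each variable's own table rather than the combined one. With 21 variables you may also want to adjust the p-values, e.g. with p.adjust().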

A.K.




----- Original Message -----
From: Sacha Viquerat <dawa.ya.moto at googlemail.com>
To: "r-help at r-project.org" <r-help at r-project.org>
Cc: 
Sent: Friday, August 10, 2012 6:48 AM
Subject: [R] creating a contingency table from a data.frame automatically (NOT BY HAND)

Hello there!
I am still struggling with a binomial response modelled on categorical variables only (some of them with 3 levels, most with 2 levels). After initial struggles with GLMs (the struggle coming from the data, not the actual analysis), I have decided to use contingency tables instead. My data look like this:

response:
hunting.prev=c("success","fail","success","success","success","fail",...)

one of 21 surveyed variables:
groupsize=c("small","large","small","small","small","large"...)
...

Now...
It seems intuitive to me that I will have to split up each variable by its level(s), thus creating 2 new variables for groupsize (as an example) holding the counts of small and large hunting parties when hunting.prev was a success, and so on. I could write a function to do that for me, but I don't want to reinvent the wheel. I would like my data to look like this:

hunting.prev  groupsize-small  groupsize-large  dogs-yes  dogs-no  guns-yes  guns-no  ...
success                    12                2         4       14        23       12  ...
failure                     1                6        34        0        12        3  ...

Of course, hunting.prev would only be needed to create the index via hunting.prev=="success" and is used here to indicate what each row means. My questions would be:

a) How do I count and split each categorical variable by a response variable, how do I create a 2x20-something (contingency) table, and would a prop.test() approach or a chi-squared test be more appropriate to actually analyze the data?

b) How do you guys create R output so that it's formatted in nice columns and rows?

Hope to see help,
Thanks!
