[R] Tests on contingency tables

Jacques VESLOT jacques.veslot at cirad.fr
Tue Feb 15 12:35:36 CET 2005


Dear all,

I have a dataset with qualitative variables (factors) and I want to test the
null hypothesis of independence between the two variables of each pair, using
appropriate tests on contingency tables.

I first applied chisq.test and obtained dependence in almost all cases, with
extremely small p-values and warning messages.

> chisq.test(table(data$ins.f, data$ins.st))$p.val
[1] 4.811263e-100
Warning message:
Chi-squared approximation may be incorrect in: chisq.test(table(data$ins.f,
data$ins.st))
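
For context, chisq.test issues this warning when some expected cell counts
are below 5. A minimal sketch of checking that and of requesting a Monte Carlo
p-value instead, assuming a made-up 10 x 8 table (the objects f, st and tab
are hypothetical stand-ins for the real data, not taken from it):

```r
# Hypothetical stand-in for table(data$ins.f, data$ins.st)
set.seed(1)
f   <- factor(sample(letters[1:10], 300, replace = TRUE))
st  <- factor(sample(LETTERS[1:8], 300, replace = TRUE))
tab <- table(f, st)

# Count the cells whose expected count is below 5
expected <- suppressWarnings(chisq.test(tab))$expected
n_small  <- sum(expected < 5)
n_small

# Monte Carlo p-value: does not rely on the chi-squared approximation
mc <- chisq.test(tab, simulate.p.value = TRUE, B = 10000)
mc$p.value
```

The simulated p-value cannot be smaller than 1/(B + 1), so B bounds the
resolution of the result.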

I then turned to Fisher's Exact Test for Count Data, but I got only error
messages such as:

Error in fisher.test(table(data$ins.f, data$ins.st)) :
        FEXACT error 501.
The hash table key cannot be computed because the largest key
is larger than the largest representable int.
The algorithm cannot proceed.
Reduce the workspace size or use another algorithm.

perhaps because the contingency tables are too large (?).

> dim(table(data$ins.f, data$ins.st))
[1] 10  8
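
One option that avoids the exact network algorithm altogether is the
simulate.p.value argument of fisher.test, which is available for tables
larger than 2 x 2. A minimal sketch, again on a hypothetical 10 x 8 table
rather than the real data (f, st and tab are made up):

```r
# Hypothetical stand-in for table(data$ins.f, data$ins.st)
set.seed(2)
f   <- factor(sample(letters[1:10], 300, replace = TRUE))
st  <- factor(sample(LETTERS[1:8], 300, replace = TRUE))
tab <- table(f, st)

# Monte Carlo version of Fisher's test: B random tables with the
# observed margins are drawn instead of enumerating all of them
ft <- fisher.test(tab, simulate.p.value = TRUE, B = 10000)
ft$p.value
```

Whether this is adequate for the real data is an assumption; it sidesteps the
FEXACT workspace limits but gives an estimated rather than an exact p-value.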

I then tried the likelihood-ratio G-statistic on the contingency table
(g.stats() from the hierfstat package), as follows:

> g.stats(data.frame(as.numeric(data$ins.f),as.numeric(data$ins.s)))$g.stats
[1] 486.1993

and I plugged it into the chi-squared distribution function to get the p-value:

> 1-pchisq(486.199, df=63)
[1] 0
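
The 0 here is a rounding effect: pchisq() returns a value indistinguishable
from 1, so the subtraction gives exactly 0. A minimal sketch of asking for the
upper tail directly, together with the usual textbook form of the G statistic
computed by hand on a hypothetical table (whether g.stats() computes exactly
this quantity is an assumption; f, st and tab are made up):

```r
# Upper tail computed directly, for the statistic and df quoted above
p_post <- pchisq(486.199, df = 63, lower.tail = FALSE)
p_post
# Same quantity on the log scale, useful for very small p-values
pchisq(486.199, df = 63, lower.tail = FALSE, log.p = TRUE)

# Hypothetical 10 x 8 table to illustrate G = 2 * sum(O * log(O / E))
set.seed(3)
f   <- factor(sample(letters[1:10], 300, replace = TRUE))
st  <- factor(sample(LETTERS[1:8], 300, replace = TRUE))
tab <- table(f, st)

expd <- as.vector(outer(rowSums(tab), colSums(tab)) / sum(tab))
obs  <- as.vector(tab)
nz   <- obs > 0                       # empty cells contribute 0 to G
G    <- 2 * sum(obs[nz] * log(obs[nz] / expd[nz]))
df   <- (nrow(tab) - 1) * (ncol(tab) - 1)
p_G  <- pchisq(G, df = df, lower.tail = FALSE)
c(G = G, df = df, p = p_G)
```

The degrees of freedom, (10 - 1) * (8 - 1) = 63, match the value used above.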


Is there a better way to do this, or a more appropriate function dedicated
to tests on large contingency tables?

Thanks in advance,

Jacques VESLOT



