[R] chisq.test, basic question

Jan_Svatos@eurotel.cz Jan_Svatos at eurotel.cz
Wed Jul 31 08:52:54 CEST 2002


Hi,

your first use of chisq.test is correct.
But by multiplying by 100 and dividing by sum(m) (210), you analyze
a different experiment
(one with fewer "observations": 100 instead of 210), and, in general,
this is a _gross_ mistake.
Your example is a (very basic) instance of the well-known problem of
statistical vs. practical "significance".
Just try chisq.test(2*m), chisq.test(3*m), etc.
With a sufficiently large sample it is almost certain (in the practical, not
the mathematical, sense) that you get
a statistically significant difference even when the practical, "real-life"
difference is negligible.

A trivial example:
# rows and cols are "practically" independent
m <- matrix(c(100,101,110,115),2,2)
chisq.test(m)  #X-squared = 0.0065, df = 1, p-value = 0.9357
chisq.test(10*m)  #X-squared = 0.2823, df = 1, p-value = 0.5952
chisq.test(100*m)  #X-squared = 3.1241, df = 1, p-value = 0.07714
chisq.test(1000*m)  #X-squared = 31.551, df = 1, p-value = 1.943e-08
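
To make the "practical" side visible next to the p-values, here is a minimal sketch that also prints the sample odds ratio for each multiple of the table. The helper odds_ratio() is my own name for illustration, not part of base R, and the odds ratio is only one of several possible effect-size measures.

```r
# Sample odds ratio of a 2 x 2 table: (a * d) / (b * c).
# odds_ratio() is a hypothetical helper written for this example.
odds_ratio <- function(tab) (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])

m <- matrix(c(100, 101, 110, 115), 2, 2)
scales <- c(1, 10, 100, 1000)

# Effect size: unchanged when every cell is multiplied by the same factor
or <- sapply(scales, function(k) odds_ratio(k * m))

# p-value of chisq.test (default continuity correction): shrinks as counts grow
p <- sapply(scales, function(k) chisq.test(k * m)$p.value)

print(data.frame(scale = scales, odds_ratio = or, p_value = p))
```

The odds ratio column stays constant while the p-value column falls, which is the gap between practical and statistical significance.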

So your question about m2 comes from a misunderstanding of the
statistical principles behind chisq.test: the test depends on the
actual counts, not only on the proportions.
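
For your own table, the effect of switching to percentages can be checked directly. The sketch below computes the Pearson statistic by hand, without the continuity correction and without rounding m2, so the numbers differ from the chisq.test() output you quoted. pearson_x2() is a name I made up for this example.

```r
# Pearson X-squared without continuity correction, computed by hand.
# pearson_x2() is a hypothetical helper written for this example.
pearson_x2 <- function(tab) {
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  sum((tab - expected)^2 / expected)
}

m  <- matrix(c(15, 28, 32, 135), 2, 2)  # original counts, n = 210
m2 <- 100 * m / sum(m)                  # percentages, "n" = 100

x2_counts  <- pearson_x2(m)
x2_percent <- pearson_x2(m2)

# The statistic scales linearly with the total, so the ratio is 100/210
print(c(counts = x2_counts, percent = x2_percent,
        ratio = x2_percent / x2_counts))
```

The association in the table is the same in both cases; only the amount of evidence for it (the pretended sample size) has changed.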

HTH,
Jan

-------------------------------------------------
designed for _monospaced_ font
-------------------------------------------------
/- Jan Svatos,  PhD         Sokolovska 855/225 -/
/- Data Analyst             Prague 9           -/
/- Eurotel Praha            190 00             -/
/- jan_svatos at eurotel.cz    Czechia            -/
-------------------------------------------------


- - - Original message: - - -
From: owner-r-help at stat.math.ethz.ch
Send: 30.7.2002 18:47:51
To: r-help <r-help at stat.math.ethz.ch>
Subject: [R] chisq.test, basic question

Dear R-users,
I have a question, and I'm not sure whether it is related to my
misunderstanding of basic statistics, or my misunderstanding of R, or
both.
I've got the counts of a 2 x 2 contingency table, and I'd like to test
the association:

m <-  matrix(c(15,28,32,135), 2, 2)
colnames(m) <- c("R-", "R+"); rownames(m) <- c("P-", "P+")
m
#    R-  R+
# P- 15  32
# P+ 28 135

chisq.test(m)  # X-squared = 4.0027, df = 1, p-value = 0.04543

Is this the correct way to test association between P and R? (I haven't
got the original data).
My problem is that if I use percentages, then I get different results:

m2 <- 100*m/sum(m) #
chisq.test(round(m2)) # X-squared = 1.5318, df = 1, p-value = 0.2158

Shouldn't this give about the same (apart from the rounding)? Shouldn't the
degree of association between P and R be the same?  Or, am I using
chisq.test() wrongly?

Thanks in advance,

Juli


-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.
-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._.
_._._
