[R] chisq.test, basic question
Jan_Svatos at eurotel.cz
Wed Jul 31 08:52:54 CEST 2002
Hi,
your first use of chisq.test is correct.
But by multiplying by 100 and dividing by sum(m) (210), you analyze a
different experiment
(one with fewer "observations"), and, in general, this is a _gross_ mistake.
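To make this concrete, here is a minimal sketch using the table m from the original question. It compares the total count the test sees in each case: after conversion to rounded percentages the table describes only 99 "observations" instead of 210, although the cell proportions are nearly the same.

```r
m  <- matrix(c(15, 28, 32, 135), 2, 2)  # counts from the original question
m2 <- round(100 * m / sum(m))           # the same table as rounded percentages

sum(m)    # 210 observations in the real experiment
sum(m2)   # 99 "observations" in the percentage table

# The cell proportions are almost identical ...
round(prop.table(m), 2)
round(prop.table(m2), 2)

# ... but the test works on counts, so the evidence differs.
# (For m2, R warns that the chi-squared approximation may be incorrect,
# because one expected count falls below 5.)
chisq.test(m)$p.value    # about 0.045
chisq.test(m2)$p.value   # about 0.216
```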
More generally, your example is a (very basic) instance of the well-known
problem of statistical vs. practical "significance".
Just try chisq.test(2*m), chisq.test(3*m), etc.
With a sufficiently large sample it is almost certain (in the practical, not
the mathematical, sense) that you will get
a statistically significant difference even when the practical, "real-life"
difference is negligible.
A trivial example:
m <- matrix(c(100,101,110,115),2,2) #rows and cols are "practically" independent
chisq.test(m) #X-squared = 0.0065, df = 1, p-value = 0.9357
chisq.test(10*m) #X-squared = 0.2823, df = 1, p-value = 0.5952
chisq.test(100*m) #X-squared = 3.1241, df = 1, p-value = 0.07714
chisq.test(1000*m) #X-squared = 31.551, df = 1, p-value = 1.943e-08
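The reason is visible in the Pearson statistic itself: multiplying every cell by k leaves the proportions unchanged but multiplies the statistic by k. The sketch below assumes correct = FALSE is passed, so that Yates' continuity correction does not blur the exact proportionality. chisq.test applies that correction to 2 x 2 tables by default, and it was used for the numbers above. The helper name x2 is only illustrative.

```r
m <- matrix(c(100, 101, 110, 115), 2, 2)

# Pearson statistic without Yates' continuity correction (illustrative helper)
x2 <- function(tab) unname(chisq.test(tab, correct = FALSE)$statistic)

k    <- c(1, 10, 100, 1000)
stat <- sapply(k, function(f) x2(f * m))

# X2_per_k is constant: the statistic grows linearly with the sample size,
# while the degrees of freedom (and hence the critical value) stay fixed.
data.frame(k = k, X2 = stat, X2_per_k = stat / k)
```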
Therefore, your question about m2 stems from a misunderstanding of the
statistical principles behind chisq.test.
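If what you are after is the degree of association rather than a test, one common option is the sample odds ratio, which does not change when the table is rescaled. This is offered only as a sketch: chisq.test does not report it, and odds_ratio is an illustrative helper, not a base R function.

```r
m <- matrix(c(15, 28, 32, 135), 2, 2)   # counts from the original question

# Sample odds ratio of a 2 x 2 table (illustrative helper)
odds_ratio <- function(tab) (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])

odds_ratio(m)        # about 2.26
odds_ratio(10 * m)   # same value: rescaling does not change the association
```

The p-value answers "is there evidence of any association at this sample size?", whereas the odds ratio describes how strong the association is.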
HTH,
Jan
-------------------------------------------------
designed for _monospaced_ font
-------------------------------------------------
/- Jan Svatos, PhD Sokolovska 855/225 -/
/- Data Analyst Prague 9 -/
/- Eurotel Praha 190 00 -/
/- jan_svatos at eurotel.cz Czechia -/
-------------------------------------------------
- - - Original message: - - -
From: owner-r-help at stat.math.ethz.ch
Sent: 30.7.2002 18:47:51
To: r-help <r-help at stat.math.ethz.ch>
Subject: [R] chisq.test, basic question
Dear R-users,
I have a question, which I'm not sure if it is related to my
misunderstanding of basic statistics, or my misunderstanding of R, or
both.
I've got the counts of a 2 x 2 contingency table, and I'd like to test
the association:
m <- matrix(c(15,28,32,135), 2, 2)
colnames(m) <- c("R-", "R+"); rownames(m) <- c("P-", "P+")
m
#     R-  R+
# P-  15  32
# P+  28 135
chisq.test(m) # X-squared = 4.0027, df = 1, p-value = 0.04543
Is this the correct way to test association between P and R? (I haven't
got the original data).
My problem is that if I use percentage, then I get different results:
m2 <- 100*m/sum(m) #
chisq.test(round(m2)) # X-squared = 1.5318, df = 1, p-value = 0.2158
Shouldn't this give about the same result (apart from the rounding)? Should the
degree of association between P and R be the same? Or am I using
chisq.test() wrongly?
Thanks in advance,
Juli
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.
-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._.
_._._