[R] Chi-squared test
(Ted Harding)
Ted.Harding at nessie.mcc.ac.uk
Thu Nov 24 22:55:17 CET 2005
On 24-Nov-05 P Ehlers wrote:
> Bianca Vieru-Dimulescu wrote:
>> Hello,
>> I'm trying to calculate a chi-squared test to see if my data are
>> different from the theoretical distribution or not:
>>
>> chisq.test(rbind(c(79,52,69,71,82,87,95,74,55,78,49,60),
>>                  c(80,80,80,80,80,80,80,80,80,80,80,80)))
>>
>> Pearson's Chi-squared test
>>
>> data: rbind(c(79, 52, 69, 71, 82, 87, 95, 74, 55, 78, 49, 60),
>> c(80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80))
>> X-squared = 17.6, df = 11, p-value = 0.09142
>>
>> Is this correct? When I do the same thing in Excel I obtain
>> a different value of p (1.65778E-14).
>>
>> Thanks a lot,
>> Bianca
>
> It would be unusual to have 12 observed frequencies all equal to 80.
> So I'm guessing that you have a 12-category variable and want to
> test its fit to a discrete uniform distribution. I assume that your
> frequencies are
>
> x <- c(79, 52, 69, 71, 82, 87, 95, 74, 55, 78, 49, 60)
>
> Then just use
>
> chisq.test(x)
>
> (see the help page).
>
> (If those 80's are expected cell frequencies, they should sum to
> sum(x) = 851.)
>
> I don't know what Excel does.
>
> Peter
>
> Peter Ehlers
> University of Calgary
I'm rather with Peter on this question! I've tried to infer what
you're really trying to do.
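
For reference, the p-value of 0.09142 in the original output can be
reproduced as sketched below. This is only one reading of that call:
when chisq.test() is given a matrix with more than one row, it treats
the matrix as a contingency table, so the row of 80s is handled as a
second set of observed counts. The name `tab` is illustrative only.

```r
# Observed counts and the row of 80s, as in the original call
x <- c(79,52,69,71,82,87,95,74,55,78,49,60)
tab <- rbind(x, rep(80, 12))

# A matrix argument is analysed as a 2 x 12 contingency table:
# a test of homogeneity of the two rows, on (2-1)*(12-1) = 11 df
res <- chisq.test(tab)
res$statistic   # X-squared, about 17.6 as in the quoted output
res$parameter   # df = 11
res$p.value     # about 0.0914
```

So that result answers a different question (do the two rows share
the same profile?) and is not a goodness-of-fit test against fixed
expected values.
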
My a-priori plausible hypothesis was that you have

k <- 12

independent observations which have equal expected values

m <- rep(80, k)

and are observed as

x <- c(79,52,69,71,82,87,95,74,55,78,49,60)

On this basis, a chi-squared test Sum((O-E)^2/E) gives

C2 <- sum(((x-m)^2)/m)

so C2 = 41.1375. Since the expected values are fixed at 80 in advance
(nothing is estimated from the data, and the total is not constrained
to equal sum(x) = 851), on this hypothesis the chi-squared would have
k = 12 degrees of freedom. Then:

1-pchisq(C2,k)
## [1] 4.647553e-05

which is nowhere near the 1.65778E-14 you report from Excel.
Also, the result from Peter's chisq.test(x) is p = 0.0006468,
even further away.
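
To make the comparison concrete, here is a small sketch of what
chisq.test(x) computes, done by hand. It assumes Peter's reading (a
discrete uniform distribution over the 12 categories), so the expected
counts are sum(x)/12 and not 80, and one degree of freedom is lost
because the expected counts are forced to add up to sum(x). The names
E, X2 and p.gof are illustrative only.

```r
x <- c(79,52,69,71,82,87,95,74,55,78,49,60)
k <- length(x)

# Expected counts under equal probabilities: sum(x)/k = 851/12 each
E <- rep(sum(x)/k, k)

# Pearson statistic and its p-value on k - 1 = 11 degrees of freedom
X2 <- sum((x - E)^2 / E)
p.gof <- pchisq(X2, df = k - 1, lower.tail = FALSE)

# Should agree with the built-in goodness-of-fit test
res <- chisq.test(x)
all.equal(unname(res$statistic), X2)
all.equal(res$p.value, p.gof)
```

The two calculations differ in both the expected values (80 versus
851/12) and the degrees of freedom (12 versus 11), which is why their
p-values do not match.
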
So this makes me really wonder what you are doing.

The nearest I can get to your Excel result 1.65778E-14 is

ix <- x < m
prod(2*ppois(x[ix],m[ix]))*prod(2*(1-ppois(x[!ix],m[!ix])))
## [1] 2.831963e-14

which is based on the guess that independent 2-sided Poisson
tests of agreement between O and E have been carried out on each
component, and the final P-value is the product of these P-values.
But this doesn't make a lot of sense from a statistical point
of view, so it's time to stop guessing!
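
Spelled out step by step, that guess looks like the sketch below. It
is only a reconstruction of what Excel might have done, not something
that can be confirmed from the information given; the name p.each is
illustrative. Each component gets a doubled one-tail Poisson
probability (lower tail when x < m, upper tail otherwise), and the
twelve values are then multiplied together.

```r
x <- c(79,52,69,71,82,87,95,74,55,78,49,60)
m <- rep(80, length(x))

# Doubled one-tail Poisson probability for each component:
#   lower tail P(X <= x) when x < m, upper tail P(X > x) otherwise
p.each <- ifelse(x < m,
                 2 * ppois(x, m),
                 2 * ppois(x, m, lower.tail = FALSE))

round(p.each, 4)  # the twelve separate "P-values"
prod(p.each)      # their product, the 2.83e-14 figure above
```

A product of P-values like this tends to shrink simply because more
components are multiplied in, so it cannot itself be read as a
P-value.
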
Please tell us what hypothesis you are testing, what sort of
distribution the x-values are supposed to have, what the
repeated "80" values represent, and also please tell us
in detail what you asked Excel to do!
Then, perhaps, a useful reply can be made.
Best wishes,
Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 24-Nov-05 Time: 21:55:14
------------------------------ XFMail ------------------------------