[Rd] Degree of freedom issue on chi-square test of goodness of fit

Tue Sep 17 04:58:25 CEST 2024

I am running a chi-square goodness-of-fit test. The observations are
already grouped into observed frequencies. I assume the data follow a normal
distribution N(μ, σ^2), where μ and σ^2 are unknown, so I use their
estimates. From these I calculated the class probabilities and stored them
in the vector p.
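
For illustration, class probabilities of this kind can be obtained by
differencing the fitted normal CDF at the class boundaries. This is only a
sketch: mu_hat, sigma_hat and breaks are hypothetical placeholders (the
actual boundaries and estimates are not given in this post), chosen on a
standardized scale so that the result comes out close to the p used here.

```r
# Hypothetical estimates and class boundaries (assumptions, not from the data)
mu_hat    <- 0
sigma_hat <- 1
breaks    <- c(-1.64, -1.09, -0.55, 0, 0.55, 1.09, 1.64)  # 7 interior boundaries

# Fitted normal CDF at the boundaries; the two outer classes are open-ended
cdf <- pnorm(c(-Inf, breaks, Inf), mean = mu_hat, sd = sigma_hat)

# Probability of each of the 8 classes
p_sketch <- diff(cdf)
round(p_sketch, 4)
```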

> x <- c(6,13,14,27,25,19,10,6)  # observed frequencies
> # probabilities from the normal distribution
> p <- c(0.0505, 0.0874, 0.1533, 0.2088, 0.2088, 0.1533, 0.0874, 0.0505)
> test_result <- chisq.test(x=x, p=p)
> test_result

    Chi-squared test for given probabilities

data:  x
X-squared = 1.8468, df = 7, p-value = 0.9678

From the output, the degrees of freedom are 7 (df = 7), because the data
are in 8 groups, so df = 8 - 1 = 7.

That is correct if the parameters (μ and σ^2) of the normal distribution are
known exactly. However, in this case they are unknown and I used their
estimates.

If the number of groups is n and the number of unknown (estimated)
parameters of the distribution is r, then the degrees of freedom of the
chi-square test should be n-r-1. But chisq.test() always reports n-1.

In this case, the degrees of freedom should be 8-2-1 = 5 (there are two
unknown parameters). I have to call pchisq() manually to finish the
calculation.
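
The manual step can be sketched as follows. It reuses the statistic
computed by chisq.test() and only recomputes the p-value with the reduced
degrees of freedom; the names r, stat, df_adj and p_adj are my own and are
not part of any R API.

```r
x <- c(6, 13, 14, 27, 25, 19, 10, 6)   # observed frequencies
p <- c(0.0505, 0.0874, 0.1533, 0.2088, 0.2088, 0.1533, 0.0874, 0.0505)
r <- 2                                 # estimated parameters: mu and sigma^2

test_result <- chisq.test(x = x, p = p)

# Keep the X-squared statistic, drop its name for cleaner printing
stat <- unname(test_result$statistic)

# Degrees of freedom reduced by the number of estimated parameters
df_adj <- length(x) - r - 1

# Upper-tail probability of the chi-square distribution with df_adj
p_adj <- pchisq(stat, df = df_adj, lower.tail = FALSE)

c(statistic = stat, df = df_adj, p.value = p_adj)
```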

I propose enhancing chisq.test() so that the degrees of freedom can account
for estimated parameters. In practice, the distribution often has unknown
parameters.
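
As a rough idea of what such an enhancement could look like from the user
side, here is a small wrapper. The function chisq_gof_est() and its argument
n_est are hypothetical names of mine, not an existing or planned R interface;
the wrapper only adjusts the df and p-value of the object returned by
chisq.test().

```r
# Hypothetical wrapper (not part of R): goodness-of-fit test whose degrees
# of freedom are reduced by n_est, the number of estimated parameters.
chisq_gof_est <- function(x, p, n_est = 0L) {
  res <- chisq.test(x = x, p = p)
  df  <- length(x) - 1L - n_est
  if (df < 1L) stop("too few groups for the number of estimated parameters")
  res$parameter <- c(df = df)
  res$p.value   <- pchisq(unname(res$statistic), df = df, lower.tail = FALSE)
  res
}

x <- c(6, 13, 14, 27, 25, 19, 10, 6)
p <- c(0.0505, 0.0874, 0.1533, 0.2088, 0.2088, 0.1533, 0.0874, 0.0505)
res <- chisq_gof_est(x, p, n_est = 2L)   # mu and sigma^2 were estimated
res
```

With n_est = 0 the wrapper reproduces the current chisq.test() result, so
existing usage would not change.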
