[R] goodness of fit between two samples of size N (discrete variable)

Mon Apr 13 01:45:07 CEST 2009

On Apr 12, 2009, at 3:09 PM, jose romero wrote:

>
> Hello list:
>
> I generate by simulation (using different procedures) two sample
> vectors of size N, each corresponding to a discrete variable, and I
> want to test whether these samples can be considered as having the
> same probability distribution (which is unknown). What is the best
> test for that?
> I've read that the Kolmogorov-Smirnov and Anderson-Darling tests are
> restricted to continuous data
> (http://cran.r-project.org/doc/contrib/Ricci-distributions-en.pdf),
> while chi-square can handle discrete data, but how do I test (in R)
> equivalence of distribution in 2 samples using it? Are there better
> tests than those I mentioned?

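The question of whether two discrete samples come from the same
distribution is generally handled with a chi-square test on the table
of counts (sample by observed value), that is, a test of whether the
value is independent of sample membership, conditional on the table's
margins. The distribution of the test statistic is only approximately
chi-square, but it seems close enough that most people will accept it.
This is not a test of "equivalence": failing to reject does not
establish that the two distributions are the same. Ricci deals with
the cases where one sample is fitted to a theoretical distribution.
You do not seem to have that situation.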

?chisq.test
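
A minimal sketch of how that might look, assuming the two samples are
stored in vectors x and y (hypothetical names; the Poisson data below
are made up purely for illustration and are not your simulation):

```r
# Hypothetical data: two simulated samples of a discrete variable, each of size N.
set.seed(1)
N <- 500
x <- rpois(N, lambda = 3)
y <- rpois(N, lambda = 3)

# Cross-tabulate sample membership against observed value.
# Rows are the two samples, columns are the distinct values seen in either one.
sample_id <- rep(c("x", "y"), each = N)
tab <- table(sample_id, c(x, y))

# Pearson chi-square test on the 2 x K table of counts.
# R may warn that the approximation is poor when some expected counts are small.
res <- chisq.test(tab)
print(res)

# Alternative when cells are sparse: Monte Carlo p-value with fixed margins.
res_mc <- chisq.test(tab, simulate.p.value = TRUE, B = 2000)
print(res_mc$p.value)
```

If many values occur only a handful of times, pooling the rare values
into one category before tabulating is another option, at the cost of
some resolution in the tails.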

I find myself wondering to what purpose you are seeking these answers.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT