[R] Is it safe? Cochran etc

Sun Oct 10 13:03:13 CEST 2004

Hey

On Sat, 9 Oct 2004, Frederico Zanqueta Poleto wrote:

>Dan,
>
>I don't know the theory behind this "hybrid" option, or what the 
>Cochran conditions consist of.
>
>However, even if you suppose the asymptotic distribution is not too 
>accurate because one of your cells has a count of 1, I think the 
>association between A and B is too strong to dismiss. This can be seen 
>from conservative methods such as the Yates continuity correction or 
>Wald/Neyman tests of the log odds (which usually fail to reject the null 
>hypothesis of no interaction more often than the Pearson/score test and 
>the likelihood ratio test, in this order).

So I read from this that Wald/Neyman tests of the log odds are
conservative, but not by much?

>Both procedures inflate the p-values, but not enough to change your 
>conclusion, as you can see from:
>
>> chisq.test(dat,correct=FALSE)
>
>        Pearson's Chi-squared test
>
>data:  dat 
>
>X-squared = 6.0115, df = 1, p-value = 0.01421
>
>> chisq.test(dat)
>
>        Pearson's Chi-squared test with Yates' continuity correction
>
>data:  dat 
>
>X-squared = 5.1584, df = 1, p-value = 0.02313

OK, I see the p-value was inflated, making Yates more conservative. 
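
For reference, a minimal sketch of where those two statistics come from. It assumes dat holds the cell counts 1, 506, 13714 and 878702 (inferred from the Wald computation quoted below; the table itself is not shown in this message). The names observed, expected, x2 and x2_yates are only illustrative.

```r
# Assumed cell counts, inferred from the quoted Wald computation
observed <- matrix(c(1, 13714, 506, 878702), nrow = 2)

# Expected counts under independence: row total * column total / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson statistic, and the Yates version that shrinks each |O - E| by 0.5
# (every |O - E| here is well above 0.5, so no capping of the correction is needed)
x2       <- sum((observed - expected)^2 / expected)
x2_yates <- sum((abs(observed - expected) - 0.5)^2 / expected)

# Upper-tail p-values on 1 degree of freedom
p_plain <- pchisq(x2, df = 1, lower.tail = FALSE)
p_yates <- pchisq(x2_yates, df = 1, lower.tail = FALSE)
```

The correction can only make the statistic smaller, so the corrected p-value is never below the uncorrected one.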

>> 1-pchisq( (log(878702/(13714*506))^2)/(1+1/878702+1/13714+1/506) ,1)
># Wald test of null log odds
>
>[1] 0.03898049

OK... 

logOD <- log((1*878702)/(13714*506))

same as

logOD <- log( (    1/   506) /
              (13714/878702)
            )

The log odds ratio has an approximately normal distribution 

Then you divide the logOD^2 by the variance (same as
logOD/StDev?)... Nope...

Is the above just the nature of the test?
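
A sketch of how the two forms relate, using the same four counts as the quoted one-liner. Dividing logOD by its standard error gives a z statistic; dividing logOD^2 by the variance gives z^2, which is referred to a chi-squared distribution on 1 degree of freedom. The variable names (n11, se, z and so on) are my own, and the standard error is the usual large-sample one for a 2x2 log odds ratio.

```r
# Cell counts from the quoted Wald computation
n11 <- 1; n12 <- 506; n21 <- 13714; n22 <- 878702

logOD <- log((n11 * n22) / (n21 * n12))

# Large-sample variance of the log odds ratio: sum of reciprocal cell counts
v  <- 1/n11 + 1/n12 + 1/n21 + 1/n22
se <- sqrt(v)

z  <- logOD / se        # Wald z statistic
w  <- logOD^2 / v       # Wald chi-squared statistic, equal to z^2

p_chisq <- pchisq(w, df = 1, lower.tail = FALSE)   # chi-squared form
p_norm  <- 2 * pnorm(-abs(z))                      # two-sided normal form
```

Both p_chisq and p_norm should agree with the 0.03898049 quoted above, so squaring is just the two-sided version of the z test.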

Using data from the paper on the description of the log odds I have...

dat <- matrix(c(141,928,420,13525),nrow=2)

dat

1-pchisq((log((141*13525)/(928*420))^2)/(1/141+1/13525+1/928+1/420),1)

[1] 0

a very cautious ... ok....

The data does not have a logOD of 0 (an odds ratio of 1), very strongly,
so the p-value is 0 for that test. 

How does the above differ from just saying ...

chisq.test(dat)

Other than the latter appears to never go below 2.2e-16 and the former
happily says 0.
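
A sketch of what I think is going on with the 0, assuming the same dat as above. Computing 1 - pchisq(...) subtracts a number that has already rounded to exactly 1, whereas asking pchisq for the upper tail directly keeps the tiny value. The names wald, p_naive, p_tail and p_pearson are illustrative only.

```r
dat <- matrix(c(141, 928, 420, 13525), nrow = 2)

# Wald statistic for the log odds ratio (matrix is filled column by column)
logOD <- log((dat[1, 1] * dat[2, 2]) / (dat[1, 2] * dat[2, 1]))
wald  <- logOD^2 / sum(1 / dat)

# pchisq(wald, 1) is so close to 1 that it is stored as exactly 1
p_naive <- 1 - pchisq(wald, df = 1)

# Requesting the upper tail avoids the subtraction and stays positive
p_tail <- pchisq(wald, df = 1, lower.tail = FALSE)

# chisq.test stores its p-value unrounded; the printed "< 2.2e-16"
# is a display threshold applied when the result is printed
p_pearson <- chisq.test(dat)$p.value
```

So the printed floor is a formatting matter. The two tests also use different statistics: the one-liner is a Wald test on the log odds ratio, while chisq.test(dat) is Pearson's test with the Yates correction.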

>The book "Categorical Data Analysis" by Agresti (2002) has an ample 
>discussion of tests like this in chapters 1 (basics and one sample) 
>and 3 (two variables). You may look there if you still have doubts about 
>these tests.

I should have got that book last week when I was still enthusiastic about
this problem.

Thanks very much for your help,
Dan.

>Sincerely,
>
>