[R] OT: A test with dependent samples.

Tue Feb 10 22:33:13 CET 2009

I am appealing to the general collective wisdom of this list in respect
of a statistics (rather than R) question.  This question comes to me
from a friend who is a veterinary oncologist.  In a study that she is
writing up there were 73 cats who were treated with a drug called
piroxicam.  None of the cats were observed to be subject to vomiting
prior to treatment; 12 of the cats were subject to vomiting after
treatment commenced.  She wants to be able to say that the treatment had
a ``significant'' impact with respect to this unwanted side-effect.

Initially she did a chi-squared test.  (Presumably on the matrix
matrix(c(73,0,61,12),2,2); she didn't give details and I didn't pursue
this.)  I pointed out to her that because of the dependence (the same 73
cats pre- and post-treatment) the chi-squared test is inappropriate.
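
For concreteness, here is a rough sketch of what that presumed test would
look like.  The row and column labels are my own assumption about how
the matrix was laid out (she gave no details); the point is only that
the table contains 146 entries although there were just 73 cats.

```r
# Presumed 2 x 2 table: columns are the pre- and post-treatment periods,
# rows are "no vomiting" and "vomiting".  The labels are assumed.
tab <- matrix(c(73, 0, 61, 12), 2, 2,
              dimnames = list(outcome = c("no vomiting", "vomiting"),
                              period  = c("pre", "post")))

# chisq.test() treats all 146 cell entries as independent observations,
# which they are not: each cat contributes one entry to each column.
res <- chisq.test(tab)
res
```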

So what *is* appropriate?  There is a dependence structure of some sort,
but it seems to me to be impossible to estimate.

After mulling it over for a long while (I'm slow!) I decided that a
non-parametric approach, along the following lines, makes sense:

We have 73 independent pairs of outcomes (a,b), where a is the
pre-treatment outcome and b is the post-treatment outcome; each is 0
if the cat didn't barf, and 1 if it did barf.

We actually observe 61 (0,0) pairs and 12 (0,1) pairs.

If there is no effect from the piroxicam, then (0,1) and (1,0) are
equally likely.  So given that the outcome is in {(0,1),(1,0)} the
probability of each is 1/2.

Thus we have a sequence of 12 (0,1)-s where (under the null hypothesis)
the probability of each entry is 1/2.  Hence the probability of this
sequence is (1/2)^12 = 0.00024.  So the p-value of the (one-sided) test
is 0.00024.  Hence the result is ``significant'' at the usual levels,
and my vet friend is happy.
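
A minimal sketch of this calculation in R, assuming the conditional
argument above is valid: given the 12 discordant pairs, the number of
(0,1) pairs is treated as binomial with success probability 1/2.  The
variable names are just for illustration.

```r
n_discordant <- 12   # pairs falling in {(0,1), (1,0)}
n_01         <- 12   # of those, the pairs that were (0,1)

# Direct calculation: probability that all 12 discordant pairs are (0,1)
p_direct <- 0.5^n_discordant

# The same one-sided p-value from an exact binomial test
bt <- binom.test(n_01, n_discordant, p = 0.5, alternative = "greater")

signif(c(direct = p_direct, binom.test = bt$p.value), 3)
```

The 61 (0,0) pairs play no role here; only the discordant pairs enter
the calculation.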

I would very much appreciate comments on my reasoning.  Have I made any
goof-ups, missed any obvious pit-falls?  Gone down a wrong garden path?

Is there a better approach?

Most importantly (!!!): Is there any literature in which this approach
is spelled out?  (The journal in which she wishes to publish will almost
surely demand a citation.  They *won't* want to see the reasoning
spelled out in the paper.)

I would conjecture that this sort of scenario must arise reasonably
often in medical statistics and the suggested approach (if it is indeed
valid and sensible) would be ``standard''.  It might even have a name!
But I have no idea where to start looking, so I thought I'd ask this
wonderfully learned list.

Thanks for any input.

	cheers,

		Rolf Turner
