# [R] Proportion test in three-choices experiment

Spencer Graves spencer.graves at pdf.com
Sun Jul 17 21:47:16 CEST 2005

```
Hi, Rafael:

At this point, it might help if you try the posting guide:
construct a toy example with real numbers, try some of the R
facilities discussed, and explain why you aren't sure they will solve
your problem.  You may well find an answer yourself while working
through that guide, and if you don't, the exercise could make it easier
for someone else to suggest something you actually find useful.

spencer graves

Jonathan Baron wrote:

> On 07/17/05 20:12, Rafael Laboissiere wrote:
>
>
>>using the BTm function.  I realize that my description of both the
>>experiment and the involved issue was not clear.  Let me try again:
>>
>>My subjects do a recognition task where I present stimuli belonging to
>>three different classes (let us say A, B, and C).  There are many of
>>them.  Subjects are asked to recognize each stimulus as belonging to one
>>of the three classes (forced-choice design).  This is done under two
>>different conditions (say conditions 1 and 2).  I end up with matrices of
>>counts like this (in R notation):
>>
>># under condition 1
>>c1 <- t (matrix (c (c1AA, c1AB, c1AC,
>>                    c1BA, c1BB, c1BC,
>>                    c1CA, c1CB, c1CC), nc = 3))
>># under condition 2
>>c2 <- t (matrix (c (c2AA, c2AB, c2AC,
>>                    c2BA, c2BB, c2BC,
>>                    c2CA, c2CB, c2CC), nc = 3))
>>
>>where "cijk" is the number of times the subject gave answer k when
>>presented with a stimulus of class j, under condition i.
>>
>>The issue is to test whether subjects perform better (in the sense of a
>>higher recognition score) in condition 1 compared with condition 2.  My
>>first idea was to test the global recognition rate, which could be
>>computed as:
>>
>># under condition 1
>>r1 <- sum (diag (c1)) / sum (c1)
>># under condition 2
>>r2 <- sum (diag (c2)) / sum (c2)
>>
>>The null hypothesis is that r1 is not different from r2. I guess that I
>>could test it with the chisq.test function, like this:
>>
>>p1 <- sum (diag (c1))
>>q1 <- sum (c1) - p1
>>p2 <- sum (diag (c2))
>>q2 <- sum (c2) - p2
>>chisq.test (matrix (c(p1, q1, p2, q2), nc = 2))
>>
>>What do you think?
>>
>>I also thought about testing the triples like [c1AA, c1AB, c1AC] against
>>[c2AA, c2AB, c2AC], hence my original question.
>
>
> You still aren't saying whether you are doing this for each
> subject or for the entire data set summed over subjects.  If the
> latter, are you worried about subject variance?  Do you think it
> possible that some subjects might show better performance in
> condition 2?  Would you be happy if you tested a single subject
> and got the result?  If subject variance is an issue, then you
> need to test "across subjects."  One way to do that is to
> compute some performance measure for each subject and each
> condition and then do a matched-pairs t test across subjects.
>
> The method you suggest requires several assumptions, and I don't
> know if these are reasonable.  The problem is in using a sum of
> the diagonal (p1) and off-diagonal entries (q1) in the table.
> This may work if you have no reason to think that c2 is better,
> ever.  In that case, all you need is a measure that varies
> monotonically with the true measure, whatever it is.  You need
> also to assume that c1 and c2 do not differ in response biases,
> and that it could not be the case that one of the diagonal cells
> is better in c1 and another is better in c2.
>
> I have not studied these issues much since my PhD thesis (1970!),
> but then the usual approach was to develop a sensible model of
> the task and then use some parameter of the model as the
> measure.  Perhaps this is over-kill for what you are doing, but I
> don't know.  For example, one model says that the subject either
> knows the answer or guesses, and the guesses are distributed
> across the three categories according to biases that are specific
> to the condition, but knowing the answer is independent of the
> category.  (You can test the assumptions of this model.)  Another
> model (popular in 1970) is Luce's choice theory, which is similar
> to the first but uses multiplication.  If I remember correctly
> (which I probably don't) you would do exactly what you propose but
> after taking the logs of the frequencies.
>
> It is possible to get different, even opposite, results using
> logs than you would get with your proposal.  Likewise, it is
> possible to get opposite results if you ignore response bias, and
> if the conditions differ in response bias.
>
> The suggestion I made based on the idea of inter-rater agreement
> implies a rough-and-ready model similar to the first.  It does
> take response bias into account.
>
> Jon

```
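
The pooled comparison proposed in the quoted message can be tried on a toy example. The sketch below uses invented counts (the values in `c1` and `c2` are hypothetical, not data from the thread) and builds the 2 x 2 table of correct versus incorrect responses before passing it to `chisq.test`. As the reply points out, this pools over subjects and ignores response bias, so it only illustrates the mechanics.

```r
# Hypothetical confusion matrices: rows = stimulus class, columns = response.
# All counts are invented for illustration.
classes <- c("A", "B", "C")
c1 <- matrix(c(40,  5,  5,
                6, 38,  6,
                4,  7, 39),
             ncol = 3, byrow = TRUE,
             dimnames = list(stimulus = classes, response = classes))
c2 <- matrix(c(30, 10, 10,
                9, 31, 10,
               11,  8, 31),
             ncol = 3, byrow = TRUE,
             dimnames = list(stimulus = classes, response = classes))

# Global recognition rate: correct answers sit on the diagonal.
r1 <- sum(diag(c1)) / sum(c1)
r2 <- sum(diag(c2)) / sum(c2)

# 2 x 2 table of correct vs. incorrect counts, one row per condition.
tab <- rbind(cond1 = c(correct = sum(diag(c1)),
                       incorrect = sum(c1) - sum(diag(c1))),
             cond2 = c(correct = sum(diag(c2)),
                       incorrect = sum(c2) - sum(diag(c2))))

# Test of equal proportions of correct answers in the two conditions.
res <- chisq.test(tab)
print(c(r1 = r1, r2 = r2))
print(res)
```

Filling the table by row rather than by column, as here, gives the same test as the `matrix (c(p1, q1, p2, q2), nc = 2)` call in the quoted message; only the orientation of the table changes.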
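
The "across subjects" alternative mentioned in the reply (one performance measure per subject and condition, then a matched-pairs t test) might look like the sketch below. The helper name `prop_correct` and the per-subject scores are assumptions made up for this example; in practice the scores would come from each subject's own confusion matrices.

```r
# Helper (hypothetical name): proportion of correct responses in one
# confusion matrix (rows = stimulus class, columns = response).
prop_correct <- function(m) sum(diag(m)) / sum(m)

# Invented per-subject scores, as they might result from applying
# prop_correct() to each subject's matrix in each condition.
score1 <- c(0.78, 0.81, 0.70, 0.85, 0.74, 0.79, 0.83, 0.76)  # condition 1
score2 <- c(0.70, 0.75, 0.72, 0.77, 0.69, 0.71, 0.80, 0.70)  # condition 2

# Matched-pairs t test: each subject contributes one difference,
# so between-subject variance is handled by the pairing.
res <- t.test(score1, score2, paired = TRUE)
print(res)
```

Any per-subject measure could replace the raw proportion here, for example a parameter estimated from a model of the task, as the reply suggests.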
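
The inter-rater-agreement suggestion is not spelled out in this message. Assuming it refers to a chance-corrected agreement measure such as Cohen's kappa (an assumption, not something the thread confirms), a score that adjusts the raw hit rate for the response marginals could be sketched as follows. The function name `cohen_kappa` and the counts are invented for illustration.

```r
# Cohen's kappa for a square confusion matrix
# (rows = stimulus class, columns = response).
cohen_kappa <- function(m) {
  p  <- m / sum(m)                     # cell proportions
  po <- sum(diag(p))                   # observed agreement
  pe <- sum(rowSums(p) * colSums(p))   # agreement expected by chance
  (po - pe) / (1 - pe)
}

# Invented counts with a response bias toward class "A".
classes <- c("A", "B", "C")
m <- matrix(c(50,  6,  4,
              12, 33,  5,
              10,  5, 25),
            ncol = 3, byrow = TRUE,
            dimnames = list(stimulus = classes, response = classes))

raw_rate <- sum(diag(m)) / sum(m)
kappa    <- cohen_kappa(m)
print(c(raw_rate = raw_rate, kappa = kappa))
```

Computed once per subject and condition, such a score could serve as the performance measure in a matched-pairs comparison across subjects.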