# [R] comparing matched proportions using glm

Charles C. Berry cberry at tajo.ucsd.edu
Mon Oct 8 18:55:43 CEST 2007

On Mon, 8 Oct 2007, Corry Gellatly wrote:

>
> question. You mention putting the data into a 2x2x3 for log-linear
> model, however my lists have many more than 3 strata, actually
> thousands. I am trying to work out whether the proportions in list 1
> tend to be equal to the proportions in list 2, in a kind of matched
> pairs proportional test.

Sounds like you are not sure what test to use. A sit-down with a
statistician might be in order...

> Is the log-linear approach possible with a
> 2x2x1000 table, for example?

Yes, but there are details that depend on what null hypothesis you wish to
test and for which alternatives you need to have good power.

And depending on those details some other approach might be better.
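
A minimal sketch of the log-linear route, assuming the null hypothesis is that outcome and list are independent within every stratum (that is, the proportions agree stratum by stratum). It uses the counts from the small example quoted further down. The dimension names (outcome, which_list, stratum) are illustrative choices, not anything fixed by loglin() or glm(). The same model is fitted as a surrogate Poisson glm so the two likelihood-ratio statistics can be compared.

```r
# Counts from the quoted example, one stratum per row: a1, b1, a2, b2.
counts <- c(3, 4, 7, 9,
            6, 7, 5, 1,
            9, 1, 3, 1)

# 2 x 2 x 3 array: outcome (a/b) x list (1/2) x stratum.
tab <- array(counts, dim = c(2, 2, 3),
             dimnames = list(outcome    = c("a", "b"),
                             which_list = c("1", "2"),
                             stratum    = c("s1", "s2", "s3")))

# Conditional independence of outcome and list given stratum:
# fit the outcome-by-stratum and list-by-stratum margins only.
fit_ll <- loglin(tab, margin = list(c(1, 3), c(2, 3)), print = FALSE)

# The same model as a surrogate Poisson glm on the cell counts.
dat <- as.data.frame(as.table(tab))   # columns: outcome, which_list, stratum, Freq
fit_glm <- glm(Freq ~ outcome * stratum + which_list * stratum,
               family = poisson, data = dat)

# Likelihood-ratio statistic and degrees of freedom from each fit.
c(loglin = fit_ll$lrt, glm = deviance(fit_glm))
c(loglin = fit_ll$df,  glm = df.residual(fit_glm))

# Chi-square p-value; the approximation can be poor when strata are sparse.
pchisq(fit_ll$lrt, fit_ll$df, lower.tail = FALSE)
```

With k strata this model has k residual degrees of freedom, so the same code carries over to a 2 x 2 x 1000 table, although the glm() design matrix grows with the number of strata.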

To use loglin(), you will need to specify a 'start' argument to test

If all you end up doing is computing the collection of McNemar Chi-Square
Statistics and operating on that, you would be better off coding the
computation directly or using

apply( tab, 3 , mcnemar.test )

if 'tab' is your 2 x 2 x k table.
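
A sketch of both options, assuming the quoted counts are arranged as a 2 x 2 x 3 array (that arrangement is an assumption made for illustration). McNemar's test uses only the two off-diagonal cells of each 2 x 2 slice, so whether it suits a given design depends on how the table is laid out. correct = FALSE is set so that the built-in statistic matches the uncorrected formula (n12 - n21)^2 / (n12 + n21).

```r
# Quoted counts as a 2 x 2 x 3 array, one 2 x 2 slice per stratum.
tab <- array(c(3, 4, 7, 9,
               6, 7, 5, 1,
               9, 1, 3, 1),
             dim = c(2, 2, 3))

# Option 1: call mcnemar.test() on each slice and keep only the statistic.
stat_apply <- apply(tab, 3,
                    function(m) mcnemar.test(m, correct = FALSE)$statistic)

# Option 2: direct, vectorised computation from the off-diagonal cells.
n12 <- tab[1, 2, ]
n21 <- tab[2, 1, ]
stat_direct <- (n12 - n21)^2 / (n12 + n21)   # NaN when n12 + n21 == 0

# The two routes should agree.
all.equal(as.numeric(stat_apply), stat_direct)
```

With thousands of strata the direct version avoids building one test object per slice, which is the main reason to prefer it.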

HTH,

Chuck

> Or would it be better to pursue the glm
> route, using the surrogate Poisson model, as you suggested?
>
> Best regards,
>
> Corry
>
>
>
>> -----Original Message-----
>> From: Charles C. Berry [mailto:cberry at tajo.ucsd.edu]
>> Sent: 04 October 2007 21:47
>> To: Corry Gellatly
>> Cc: r-help at r-project.org
>> Subject: Re: [R] comparing matched proportions using glm
>>
>> On Thu, 4 Oct 2007, Corry Gellatly wrote:
>>
>>>
>>> Dear R users,
>>>
>>> Is it possible to use a generalized linear model to do a binomial
>>> comparison of one list of proportions with a matched list of
>>> proportions to test for a difference?
>>>
>>> So, for example:
>>>
>>> list 1  		list 2
>>>
>>> a1  |  b1        	a2 |  b2
>>>
>>> 3   |  4          7  |  9
>>> 6   |  7          5  |  1
>>> 9   |  1          3  |  1
>>>
>>>
>>> I want to compare list 1 with list 2 and the samples are matched.
>>
>>
>> Meaning that
>>
>> 	 3     4          7    9
>>
>> are the _counts_ in one stratum of three in all?
>>
>> And you have an hypothesis that claims the proportions are
>> equal in each stratum??
>>
>> The obvious candidate for that setup is a log-linear model for
>> the counts in a 2 by 2 by 3 table.
>>
>> See
>>
>> 	?loglin
>>
>> and
>>
>> 	?loglm (in MASS)
>>
>> and the references therein.
>>
>> You can do this type of work in glm() if you understand
>> surrogate Poisson models as outlined in
>>
>> McCullagh, P. and Nelder, J. A. (1989) Generalized Linear Models.
>> London: Chapman and Hall.
>>
>> HTH,
>>
>> Chuck
>>
>>> Obviously, I could add the columns and do a binomial test, i.e.
>>> prop.test(c(18,15),c(30,26)), however, I have a large dataset so this
>>> would reduce the power of my analysis. I could compare the ratios, i.e.
>>> a1/(a1+b1) compared to a2/(a2+b2) for the samples in each list,
>>> however, this does not account for the difference in sample sizes
>>> between samples in each list.
>>>
>>> I have tried a glm where I bind a2 and b2 as the y variable, i.e.
>>> y<-cbind(a2,b2) and also bind a1 and b1 as the x variable, i.e.
>>> x<-cbind(a1,b1) and run <-glm(y~x,binomial)
>>>
>>> I get this type of output:
>>>
>>> 	Call:
>>> 	glm(formula = y ~ x, family = binomial)
>>>
>>> 	Deviance Residuals:
>>> 	     Min        1Q    Median        3Q       Max
>>> 	-3.20426  -0.72686  -0.01822   0.68320   4.05035
>>>
>>> 	Coefficients:
>>> 	             Estimate Std. Error z value Pr(>|z|)
>>> 	(Intercept)  0.178369   0.186421   0.957    0.339
>>> 	xa1          0.008109   0.017430   0.465    0.642
>>> 	xb1         -0.026666   0.018153  -1.469    0.142
>>>
>>> 	(Dispersion parameter for binomial family taken to be 1)
>>>
>>> 	    Null deviance: 565.14  on 467  degrees of freedom
>>> 	Residual deviance: 559.69  on 465  degrees of freedom
>>> 	AIC: 1883.3
>>>
>>> 	Number of Fisher Scoring iterations: 3
>>>
>>>
>>> Is this output meaningful? It seems that y is not compared directly
>>> with x, but rather compared with a1 and b1, which is not intended?
>>>
>>> I wonder if this is a suitable approach to the problem? I'll be very
>>> grateful for any advice or suggestions.
>>>
>>
>> Charles C. Berry                            (858) 534-2098
>>                                             Dept of Family/Preventive Medicine
>> E mailto:cberry at tajo.ucsd.edu	            UC San Diego
>> http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
>>
>>
>>
>

Charles C. Berry                            (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901