Thanks for your reply.
As to g2 and g3:
g2: a + d + c + a:d + a:c + d:c
g3: a + d + c + a:d + a:c + d:c + a:d:c
The only difference between g2 and g3 is "a:d:c",which refers to "case depends on drug but in the same way for different levels of age". And anova tests whether this "only difference"is significant.
But as to g1 and g3:
g1: a + d + c + a:d + a:c
g3: a + d + c + a:d + a:c + d:c + a:d:c
The only difference between g1 and g3 is "d:c + a:d:c".
What's "d:c + a:d:c" refers to?
Thanks.
At 2013-04-29 14:33:25,"Achim Zeileis" wrote:
>On Mon, 29 Apr 2013, meng wrote:
>
>> Hello Achim:
>> Sorry for another question about the model g1 in the last mail.
>>
>> As to model g2 and g3:
>> g2 <- glm(Freq ~ (age + drug + case)^2, data = df, family = poisson)
>> g3 <- glm(Freq ~ age * drug * case, data = df, family = poisson)
>> anova(g2, g3, test = "Chisq")
>>
>> I know clearly that the only difference between g2 and g3 is that g2 has
>> no 3-way interaction while g3 has,and anova tests whether this "only
>> difference(i.e. 3-way interaction)" is significant or not.
>>
>> But as to g1 and g3:
>> g1 <- glm(Freq ~ age/(drug + case), data = df, family = poisson)
>>
>> I can't find out the "only difference" between g1 and g3,so I don't know
>> what "anova(g1, g3, test = "Chisq")" tests for. Also, what "/" sign
>> following age in g1 refers to?
>
>The "/" could be replaced by a "*" here and the fitted values and
>corresponding log-likelihood would not change. Only the coefficients
>change: "/" induces a nested coding while "*" employs the interaction
>coding.
>
>Breaking everything down to main and interaction effects (and ignoring the
>particular coding of the coefficients), the three models are
>
>g1: a + d + c + a:d + a:c
>g2: a + d + c + a:d + a:c + d:c
>g3: a + d + c + a:d + a:c + d:c + a:d:c
>
>with interpretations:
>
>g1: conditional independence of drug and case given age
>g2: no three-way interaction (case depends on drug but in the same way for
>different levels of age)
>g3: saturated model
>
>>
>> Many thanks and sorry for many quesions.
>>
>>
>> Best
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> At 2013-04-24 22:22:55,"Achim Zeileis" wrote:
>>> On Wed, 24 Apr 2013, meng wrote:
>>>
>>>> Hi,Achim:
>>>> Can all the analysis you mentioned via loglm be performed via
>>>> glm(...,family=poisson)?
>>>
>>> Yes.
>>>
>>> ## transform table back to data.frame
>>> df <- as.data.frame(tab)
>>>
>>> ## fit models: conditional independence, no-three way interaction,
>>> ## and saturated
>>> g1 <- glm(Freq ~ age/(drug + case), data = df, family = poisson)
>>> g2 <- glm(Freq ~ (age + drug + case)^2, data = df, family = poisson)
>>> g3 <- glm(Freq ~ age * drug * case, data = df, family = poisson)
>>>
>>> ## likelihood-ratio tests (against saturated)
>>> anova(g1, g3, test = "Chisq")
>>> anova(g2, g3, test = "Chisq")
>>>
>>> ## compare fitted frequencies (which are essentially equal)
>>> all.equal(as.numeric(fitted(g1)),
>>> as.data.frame(as.table(fitted(m1)))$Freq)
>>> all.equal(as.numeric(fitted(g2)),
>>> as.data.frame(as.table(fitted(m2)))$Freq)
>>>
>>> The difference is mainly that loglm() has a specialized user interface and
>>> that it uses a different optimizer (iterative proportional fitting rather
>>> than iterative reweighted least squares).
>>>
>>> Best,
>>> Z
>>>
>>>> Many thanks.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> At 2013-04-24 19:37:10,"Achim Zeileis" wrote:
>>>>> On Wed, 24 Apr 2013, meng wrote:
>>>>>
>>>>>> Hi all:
>>>>>> For stratified count data,how to perform regression analysis?
>>>>>>
>>>>>> My data:
>>>>>> age case oc count
>>>>>> 1 1 1 21
>>>>>> 1 1 2 26
>>>>>> 1 2 1 17
>>>>>> 1 2 2 59
>>>>>> 2 1 1 18
>>>>>> 2 1 2 88
>>>>>> 2 2 1 7
>>>>>> 2 2 2 95
>>>>>>
>>>>>> age:
>>>>>> 1:<40y
>>>>>> 2:>40y
>>>>>>
>>>>>> case:
>>>>>> 1:patient
>>>>>> 2:health
>>>>>>
>>>>>> oc:
>>>>>> 1:use drug
>>>>>> 2:not use drug
>>>>>>
>>>>>> My purpose:
>>>>>> Anaysis whether case and oc are correlated, and age is a stratified varia
>>>> ble.
>>>>>>
>>>>>> My solution:
>>>>>> 1,Mantel-Haenszel test by using function "mantelhaen.test"
>>>>>> 2,loglinear regression by using function glm(count~case*oc,family=poisson
>>>> ).But I don't know how to handle variable "age",which is the stratified vari
>>>> able.
>>>>>
>>>>> Instead of using glm(family = poisson) it is also convenient to use
>>>>> loglm() from package MASS for the associated convenience table.
>>>>>
>>>>> The code below shows how to set up the contingency table, visualize it
>>>>> using package vcd, and then fit two models using loglm. The models
>>>>> considered are conditional independence of case and drug given age and the
>>>>> "no three-way interaction" already suggested by Peter. Both models are
>>>>> also accompanied by visualizations of the residuals.
>>>>>
>>>>> ## contingency table with nice labels
>>>>> tab <- expand.grid(drug = 1:2, case = 1:2, age = 1:2)
>>>>> tab$count <- c(21, 26, 17, 59, 18, 88, 7, 95)
>>>>> tab$age <- factor(tab$age, levels = 1:2, labels = c("<40", ">40"))
>>>>> tab$case <- factor(tab$case, levels = 1:2, labels = c("patient",
>>>>> "healthy"))
>>>>> tab$drug <- factor(tab$drug, levels = 1:2, labels = c("yes", "no"))
>>>>> tab <- xtabs(count ~ age + drug + case, data = tab)
>>>>>
>>>>> ## visualize case explained by drug given age
>>>>> library("vcd")
>>>>> mosaic(case ~ drug | age, data = tab,
>>>>> split_vertical = c(TRUE, TRUE, FALSE))
>>>>>
>>>>> ## test wheter drug and case are independent given age
>>>>> m1 <- loglm(~ age/(drug + case), data = tab)
>>>>> m1
>>>>>
>>>>> ## visualize corresponding residuals from independence model
>>>>> mosaic(case ~ drug | age, data = tab,
>>>>> split_vertical = c(TRUE, TRUE, FALSE),
>>>>> residuals_type = "deviance",
>>>>> gp = shading_hcl, gp_args = list(interpolate = 1.2))
>>>>> mosaic(case ~ drug | age, data = tab,
>>>>> split_vertical = c(TRUE, TRUE, FALSE),
>>>>> residuals_type = "pearson",
>>>>> gp = shading_hcl, gp_args = list(interpolate = 1.2))
>>>>>
>>>>> ## test whether there is no three-way interaction
>>>>> ## (i.e., dependence of case on drug is the same for both age groups)
>>>>> m2 <- loglm(~ (age + drug + case)^2, data = tab)
>>>>> m2
>>>>>
>>>>> ## visualization (with default pearson residuals)
>>>>> mosaic(case ~ drug | age, data = tab,
>>>>> expected = ~ (age + drug + case)^2,
>>>>> split_vertical = c(TRUE, TRUE, FALSE),
>>>>> gp = shading_hcl, gp_args = list(interpolate = 1.2))
>>>>>
>>>>>
>>>>>> Many thanks for your help.
>>>>>>
>>>>>> My best.
>>>>>> [[alternative HTML version deleted]]
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-help@r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.h
>>>> tml
>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>>
>>>>
>>>>
>>>>
>>>>
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
[[alternative HTML version deleted]]