[R] APC Modelling and the GLM function

Peter Dalgaard BSA p.dalgaard at biostat.ku.dk
Mon Mar 24 08:56:04 CET 2003

Sue_Paul at moh.govt.nz writes:

> Hi all
> Apologies for any cross posting.

[I've redirected my reply to r-help (not r-announce) and deleted
allstat since they have a different followup policy than we do.]

> I have encountered a rather bizarre "problem" in Splus and R. I am
> using Age-Period-Cohort models to model cervical cancer and have run
> the same data on both R (v.1.4.1 & v1.6.2) and Splus (version 6.0).
> I used the same command line in both Splus and R:
> glm(cases~-1+as.factor(age)
> +as.factor(period)+as.factor(cohort)+offset(log(person.years)),
> +family="poisson",data=mydata)
> While Splus and R fit APC models using different constraints, the
> fitted values should be identical. However, I have found the
> following: Both Splus and R models give you the same value for the
> residual deviance; If I use the function fitted.values on the glm
> object then both Splus AND R models returns the same number of cases
> (and hence the same incidence rate once you have divided by person
> years at risk); However, if I try to derive the fitted values
> "manually": i.e. fitted incidence rate = exp{
> age.effect+period.effect+cohort.effect} then I get a completely
> different set of fitted incidence rates. To do a quick check I also
> looked at second differences to see if these were identifiable, and
> found that the second differences for the age effects are consistent
> in both R and Splus. The period and cohort effects however, yield
> completely different second differences (in R & Splus). I guess this
> kind of narrows down the problem to the age and period effects,
> although I still cannot understand why glm would return the same
> deviance and fitted number of cases, if all the second differences
> and fitted rates were not identifiable. I am quite puzzled by this
> and can't seem to figure out what is going wrong. I would really
> appreciate any help that anyone can give me. Thanking you in advance

My guess is that you're being done in by contrast settings (removing
the intercept only affects the coding of the first term, AFAIR). Try
redoing the S-PLUS analysis using treatment contrasts, or the R
analysis using Helmert contrasts (are you sure you know how to
interpret the estimated coefficients when Helmert contrasts are being

   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907

More information about the R-help mailing list