[R] Different deviance residuals in a (similar?!?) glm example
Prof Brian Ripley
ripley at stats.ox.ac.uk
Tue Feb 28 14:41:36 CET 2006
This is a Poisson regression. You cannot just multiply counts by 10 and
have a valid sample from a Poisson distribution with 10x the mean. So the
example (and the calculations below) make zero statistical sense.
For a poisson() family,
$dev.resids
function (y, mu, wt)
2 * wt * (y * log(ifelse(y == 0, 1, y/mu)) - (y - mu))
So if you multiply the counts by 10 and change the offset to multiply the
mu by 10, (y - log(y/mu) - (y - mu)) is multiplied by 10. Hence the
deviance residuals are scaled by sqrt(10), and res1/res is sqrt(10).
> plot(const.a, const.b, pch=16)
> lines(const.a, 1/sqrt(const.a))
indicates a perfect fit.
On Mon, 27 Feb 2006, Camarda, Carlo Giovanni wrote:
> Dear R-users,
> I would like to show you a simple example that gives an overview of one
> of my current issue.
> Although my working setting implies a different parametric model (which
> cannot be framed in the glm), I guess that what I'll get from the
> following example it would help for the next steps.
> Anyway here it is.
> Firstly I simulated from a series of exposures, a series of deaths
> (given a model, Gompertz, and a probability distribution, Poisson). Then
> a multiply both deaths and exposures by a constant. Finally I fitted
> with glm the simulated data sets and I compared the deviance residuals.
> They're different by a certain constant, but I could not get a
> meaningful relationship between the used constants (a scatter plot of
> them is given at the end).
> The following example tried to be as completed as possible and I hope
> that it will be clearer than my previous words.
> Thanks,
> Carlo Giovanni Camarda
>
> #### simulation procedure
> age <- 50:100 # time range
> # parameters
> alpha.sim <- 0.00005
> beta.sim <- 0.1
> # simulated hazard from a Gompertz
> hazard.sim <- alpha.sim*(exp(beta.sim*age))
> # first dataset
> exposures <- c(4748, 4461, 4473, 4326, 4259, 4219, 4049, 3965, 3801,
> 3818, 3670, 3521, 3537,
> 3482, 3347, 3180, 3100, 2890, 2755, 2622, 2530, 2502,
> 2293, 2216, 1986, 1875,
> 1693, 1550, 1395, 1295, 1104, 952, 792, 755, 726, 608,
> 523, 419, 338,
> 246, 205, 148, 112, 75, 52, 32, 23, 15, 7, 6, 3) # just
> as example
> deaths.sim <- rpois(length(age), exposures*hazard.sim) # simulating
> deaths from a poisson distribution (Brillinger, 1986)
> my.offset <- log(exposures) # offset for the poisson regression
> # new dataset: decupleing the sample size
> deaths.sim1 <- deaths.sim * 10
> exposures1 <- exposures * 10
> my.offset1 <- log(exposures1)
> # fitting the first dataset
> fit <- glm(formula = deaths.sim ~ age + offset(my.offset),
> family = poisson(link = "log"))
> res <- residuals(fit, type="deviance")
> # fitting the new dataset
> fit1 <- glm(formula = deaths.sim1 ~ age + offset(my.offset1),
> family = poisson(link = "log"))
> res1 <- residuals(fit1, type="deviance")
> # estimating hazard
> hazard.act <- deaths.sim/exposures # == deaths.sim1/exposures1
> hazard.fit <- predict(fit, type="response") / exposures
> hazard.fit1 <- predict(fit1, type="response") / exposures1
> # plotting log-hazard
> plot(age, log(hazard.sim), cex=1.2)
> lines(age, log(hazard.act), col="red", lwd=2)
> lines(age, log(hazard.fit), col="blue", lwd=2, lty=2)
> lines(age, log(hazard.fit1), col="green", lwd=2, lty=3)
> all(round(hazard.fit,5)==round(hazard.fit1,5)) ## TRUE
> # plotting residuals
> plot(age, res, cex=1.2, lwd=2, ylim=range(res, res1))
> points(age, res1, col="red", cex=1.2, lwd=2, pch=2)
> all(round(res,5)==round(res1,5)) ## FALSE
> # looking at the ratio
> ratio <- (res/res1)[1]
> ratio
> all(round(res,10)==round(res1*ratio,10))
> # here the scatterplot resid-ratios vs multiplier-constant
> const.a <- c(1:16) # constants which I tried to multiply by
> const.b <-
> c(1,0.7071068,0.5773503,0.5,0.4472136,0.4082483,0.3779645,0.3535534, #
> ratio between the deviance residuals
>
> 0.3333333,0.3162278,0.3015113,0.2886751,0.2773501,0.2672612,0.2581989,0.
> 25)
> plot(const.a, const.b, pch=16)
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
More information about the R-help
mailing list