[R] exponential distribution

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Jun 9 07:42:55 CEST 2008


What do you mean by 'correct' here?  If you do manually what glm() does, 
you would expect to get the same answer, but that is not independent 
verification.

As far as I recall either glm nor survreg are calculation the exact 
variance in this case -- they are both making use of asymptotic theory. 
glm makes use of the expected (Fisher) information evaluated at the MLE, 
survreg of the observed information -- those are asymptotically equivalent 
but there is some evidence preferring the observed information.

As I hinted, I've not checked this for several years and I will leave it 
to you to study the documentation and code.

On Sun, 8 Jun 2008, Antonio Gasparrini wrote:

> Dear all,
>
> I've tried to solve the Es. 12, cap 4 of "Introduction to GLM" by 
> Annette Dobson. It's about the relationship between survival time of 
> leukemia patients and blood cell count. I tried to fit a model with 
> exponential distribution, first by glm (family gamma and then dispersion 
> parameter fixed to 1) and then with survreg. They gave me the same point 
> estimates but the standard errors are slightly different. I checked the 
> results building the routine manually, and it seems that the glm results 
> are the correct one.
>
> #######################################################
> data <- data.frame(y=c(65,156,100,134,16,108,121,4,39,143,56,26,22,1,1,5,65),
> x=c(3.36,2.88,3.63,3.41,3.78,4.02,4.00,4.23,3.73,3.85,3.97,4.51,
> 4.54,5.00,5.00,4.72,5.00))
>
> model1 <- glm(y~x,family=Gamma(link="log"),data)
> summary(model1,dispersion=1)
> model2 <- survreg(Surv(y) ~ x, data, dist="exponential")
> summary(model2)
>
> X <- model.matrix(model1)
> y <- as.vector(data$y)
> b <- as.matrix(c(1,1))  # STARTING VALUES
> iter <- matrix(0,7,2)
> iter[1,] <- b
> for (i in 2:7) {
> W <- diag(rep(1,length(y)))
> z <- X%*%b + (y-exp(X%*%b))*1/exp(X%*%b)
> b <- solve(t(X)%*%W%*%X) %*% (t(X)%*%W%*%z)
> iter[i,] <- b
> }
> summary(model1,dispersion=1)$coef
> summary(model2)
> iter[nrow(temp),]
> sqrt(diag(solve(t(X)%*%W%*%X)))
> #######################################################
>
> Can you explain if this difference is due to an error?
> Thanks in advance
>
>
> Antonio Gasparrini
> Public and Environmental Health Research Unit (PEHRU)
> London School of Hygiene & Tropical Medicine
> Keppel Street, London WC1E 7HT, UK
> Office: 0044 (0)20 79272406 - Mobile: 0044 (0)79 64925523
> http://www.lshtm.ac.uk/people/gasparrini.antonio ( http://www.lshtm.ac.uk/pehru/ )

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



More information about the R-help mailing list