[R] inverse prediction and Poisson regression
Spencer Graves
spencer.graves at pdf.com
Fri Jul 25 21:56:32 CEST 2003
Dear Peter:
Thanks very much. I said I knew there were better ways, but none
that I could develop in the time available this morning. You've helped
introduce me to other options.
Best Wishes,
Spencer Graves
Peter Dalgaard BSA wrote:
> Spencer Graves <spencer.graves at pdf.com> writes:
>
>
>>The problem with nls is that it is NOT maximum likelihood for the
>>Poisson distribution. For the Poisson, the standard deviation is the
>>square root of the mean, while nls assumes constant standard
>>deviation. That's why I stayed with glm. The answers should be
>>comparable, but I would prefer the glm answers. spencer graves
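
Spencer's glm() fit itself isn't shown in the thread, but for this particular curve Poisson maximum likelihood via glm() is available through the inverse link: the model mean Ymax/(1 + x/x50) satisfies 1/mu = 1/Ymax + x/(Ymax*x50), which is linear in x. A sketch, using hypothetical data since the real Phytopath data frame was not posted:

```r
# HYPOTHETICAL data -- the real Phytopath data frame was not posted.
Phyto <- data.frame(x = c(0.01, 0.1, 1), y = c(24, 12, 2))

# Inverse link: 1/mu = b0 + b1*x with b0 = 1/Ymax and b1 = 1/(Ymax*x50),
# so this glm() call is exact Poisson maximum likelihood for the model.
fit <- glm(y ~ x, family = poisson(link = "inverse"), data = Phyto)
b <- coef(fit)
Ymax <- 1 / b[[1]]       # back out the original parameters
x50  <- b[[1]] / b[[2]]
```

Whether this is what was meant by "I stayed with glm" is a guess; it only works because this mean function happens to be linear in x on the inverse scale.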
>
>
> That's why I was suggesting either a variance stabilizing
> transformation or using gnls() with a weights function. In both cases
> you need to watch out for scaling of SE's stemming from the fact that
> the variance is supposed to be known in the Poisson distribution, but
> the fitting routines estimate a sigma from the residuals anyway.
>
> I.e. for instance:
>
>
>>Phyto.nls2 <- nls(sqrt(y+.5) ~ sqrt(Ymax/(1 +
>>x/x50)+.5),data=Phytopath,start=list(Ymax=20.0,x50=0.01),trace=T)
>
> 9.400602 : 20.00 0.01
> 0.9064723 : 27.21511020 0.03593643
> 0.06338235 : 28.21654101 0.05995647
> 0.02616589 : 28.50221759 0.06815302
> 0.02608587 : 28.54871243 0.06835046
> 0.02608586 : 28.54960242 0.06834083
> 0.02608586 : 28.5495605 0.0683413
>
>>summary(Phyto.nls2)
>
>
> Formula: sqrt(y + 0.5) ~ sqrt(Ymax/(1 + x/x50) + 0.5)
>
> Parameters:
> Estimate Std. Error t value Pr(>|t|)
> Ymax 28.54956 1.65113 17.291 0.0368 *
> x50 0.06834 0.01264 5.407 0.1164
> ---
> Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
>
> Residual standard error: 0.1615 on 1 degrees of freedom
>
> Correlation of Parameter Estimates:
> Ymax
> x50 -0.6833
>
> but here you need to know that the theoretical sd is 0.5, so the Std.
> errors all need multiplication by 0.5/0.1615.
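
Worked out with the numbers printed above, the rescaling gives:

```r
# nls() summary SEs, rescaled by (theoretical sd)/(residual se):
# sqrt(Y + 0.5) of a Poisson count has sd approximately 0.5, while the
# nls() fit estimated a residual standard error of 0.1615.
se_nls <- c(Ymax = 1.65113, x50 = 0.01264)
se_adj <- se_nls * 0.5 / 0.1615
# gives roughly Ymax 5.11, x50 0.0391
```

These adjusted values are close to the standard errors from the direct MLE reported further down in the thread.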
>
> Using gnls() we'd get (if I got the calling sequence right...)
>
>
>>Phyto.gnls <- gnls(y ~ Ymax/(1 + x/x50),
>
> + data=Phytopath,weights=varPower(fixed=.5),start=list(Ymax=20.0,x50=0.01))
>
>>summary(Phyto.gnls)
>
> Generalized nonlinear least squares fit
> Model: y ~ Ymax/(1 + x/x50)
> Data: Phytopath
> AIC BIC logLik
> 13.71292 11.00875 -3.856458
>
> Variance function:
> Structure: Power of variance covariate
> Formula: ~fitted(.)
> Parameter estimates:
> power
> 0.5
>
> Coefficients:
> Value Std.Error t-value p-value
> Ymax 29.393796 2.4828723 11.838626 0.0536
> x50 0.063028 0.0125244 5.032376 0.1249
>
> Correlation:
> Ymax
> x50 -0.84
>
> Standardized residuals:
> [1] -0.4894266 0.7621749 -0.4237346
> attr(,"std")
> [1] 2.8478142 1.4239071 0.8586483
> attr(,"label")
> [1] "Standardized residuals"
>
> Residual standard error: 0.6367906
> Degrees of freedom: 3 total; 1 residual
>
> and again, it is necessary to adjust the SE's by multiplying by
> 1/0.6368.
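
Spelled out, that rescaling is:

```r
# gnls() SEs, divided by the estimated residual sigma (0.6367906),
# since the Poisson variance function has no free scale parameter.
se_gnls <- c(Ymax = 2.4828723, x50 = 0.0125244)
se_adj  <- se_gnls / 0.6367906
# gives roughly Ymax 3.90, x50 0.0197
```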
>
> It is in fact rather easy to do direct MLE for this kind of model:
>
>
>>with(Phytopath,
>>optim(c(28,.7),fn=function(p){Ymax<-p[1];x50<-p[2];
>
> + -sum(dpois(y,lambda=Ymax/(1
> + + x/x50),log=TRUE))}, method="BFGS", hessian=T))
> $par
> [1] 28.55963487 0.06833524
>
> $value
> [1] 7.21251
>
> $counts
> function gradient
> 42 13
>
> $convergence
> [1] 0
>
> $message
> NULL
>
> $hessian
> [,1] [,2]
> [1,] 0.07356072 6.631868
> [2,] 6.63186764 1294.792539
>
> Warning message:
> NaNs produced in: dpois(x, lambda, log)
>
> and we can get SE's from the inverse hessian with
>
>
>>sqrt(diag(solve(with(Phytopath,
>
> + optim(c(28,.7),fn=function(p){Ymax<-p[1];x50<-p[2];
> + -sum(dpois(y,lambda=Ymax/(1
> + + x/x50),log=TRUE))}, method="BFGS", hessian=T))$hessian)))
> [1] 5.02565893 0.03788052
>
> Notice that the variance stabilizing transform seems to do rather
> better than gnls() compared to true MLE. I'm slightly puzzled by this,
> but presumably it has to do with the fact that MLE for a Gaussian
> model with a mean/variance relationship is *not* identical to the
> iteratively reweighted least squares that glm() et al. are doing. (I
> wouldn't quite rule out that I've simply made a mistake somewhere,
> though...)
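
The direct-MLE recipe can be made self-contained. Since Phytopath was not posted, the sketch below uses hypothetical data; it also reparameterizes on the log scale, which keeps lambda positive and so avoids the NaN warning from dpois() during the BFGS line searches.

```r
# Self-contained sketch of the direct Poisson MLE above.  The Phytopath
# data were not posted, so this uses a small HYPOTHETICAL data set; the
# estimates will not match the thread's numbers.
Phyto <- data.frame(x = c(0.01, 0.1, 1), y = c(24, 12, 2))

negll <- function(p) {
  # log-parameterization keeps Ymax and x50 positive, avoiding the
  # "NaNs produced in dpois" warning from negative lambda values
  Ymax <- exp(p[1]); x50 <- exp(p[2])
  -sum(dpois(Phyto$y, lambda = Ymax / (1 + Phyto$x / x50), log = TRUE))
}

fit <- optim(log(c(28, 0.07)), negll, method = "BFGS", hessian = TRUE)
est    <- exp(fit$par)                     # back-transform to original scale
se_log <- sqrt(diag(solve(fit$hessian)))   # SEs on the log scale
se     <- est * se_log                     # delta-method SEs, original scale
```

The delta-method step at the end is what the log reparameterization costs: the inverse Hessian gives standard errors for log(Ymax) and log(x50), which are mapped back by multiplying by the estimates.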
>