[R] What PRECISELY is the dfbetas() or lm.influence()$coef ?
John Fox
jfox at mcmaster.ca
Thu Jun 12 22:58:54 CEST 2003
Dear Hormuzd,
At 01:24 PM 6/12/2003 -0400, Katki, Hormuzd (NIH/NCI) wrote:
> Hello. I want to get the proper influence function for the glm
>coefficients in R. This is supposed to be inv(information)*(y-yhat)*x. So
>I am wondering what is the exact mathematical formula for the output that
>the functions:
>
>dfbeta() OR lm.influence()$coefficients
>
>return for a glm model. I am confused because:
>
>1. Their columns don't sum to zero as influences should.
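Even in a linear model, where the computation is exact, the columns don't
sum to zero if influence is defined as the change in the coefficients upon
deleting each observation in turn (i.e., as dfbeta). The quantity you
describe, inv(information)*(y-yhat)*x, does sum to zero because of the
estimating equations, but the deletion formula divides row i by 1 - h_i
(h_i being the hat value), and that rescaling destroys the property.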
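
A minimal sketch in R of the linear-model case, to make the distinction
concrete. The data set (mtcars) and the model formula are illustrative
assumptions only, and the comparisons are made in absolute value so that
nothing here depends on the sign convention dfbeta() uses.

```r
# Illustrative linear model; mtcars and the formula are assumptions.
fit <- lm(mpg ~ wt + hp, data = mtcars)
X <- model.matrix(fit)
e <- residuals(fit)
h <- hatvalues(fit)

# Empirical influence function: row i is the transpose of (X'X)^{-1} x_i e_i.
# Its columns sum to zero because the normal equations give X'e = 0.
infl_fn <- (X * e) %*% solve(crossprod(X))

# Exact leave-one-out change: full-sample coefficients minus the
# coefficients refitted without observation i.
loo <- t(sapply(seq_len(nrow(mtcars)), function(i) {
  coef(fit) - coef(lm(mpg ~ wt + hp, data = mtcars[-i, ]))
}))

db <- dfbeta(fit)

colSums(infl_fn)  # zero up to rounding error
colSums(db)       # generally not zero

# dfbeta() reproduces the exact refits, and equals the influence
# function with row i divided by (1 - h_i).
all.equal(abs(unname(db)), abs(unname(loo)))
all.equal(abs(unname(db)), abs(unname(infl_fn / (1 - h))))
```
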
>2. They return different "influences", so the 2 functions are doing
>something different.
That's odd. I believe that dfbeta() for a GLM simply uses influence.glm,
which has the same $coefficients component as lm.influence. As such, for a
GLM, both are based on the last step of the IRLS fit -- i.e., a
linearization of the model.
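
A quick way to check this for a particular fit is sketched below. It
assumes the behaviour of a current version of the stats package (the 2003
code may have differed), and the Poisson model on warpbreaks is only an
illustrative choice.

```r
# Illustrative Poisson GLM; data set and formula are assumptions.
fit <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)

db <- dfbeta(fit)
li <- lm.influence(fit)$coefficients

# The two functions should return the same numbers for a GLM.
all.equal(unname(db), unname(li))

# Exact deletion, refitting the GLM n times, for comparison.
loo <- t(sapply(seq_len(nrow(warpbreaks)), function(i) {
  coef(fit) - coef(glm(breaks ~ wool + tension, family = poisson,
                       data = warpbreaks[-i, ]))
}))

# Because dfbeta() works from the linearized (weighted least-squares) fit,
# it approximates the exact refits rather than reproducing them.
# Absolute values are used so the sign convention does not matter.
max(abs(abs(unname(db)) - abs(unname(loo))))
```
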
>3. I think they divide each element by the standard error of the
>corresponding coefficient, but that's not enough to resolve any
>discrepancies
Perhaps you meant that dfbetas() [not dfbeta()] returns different values
from lm.influence()$coef (as in your subject line)? dfbetas() standardizes
the coefficient changes by coefficient standard errors, using a deleted
estimate of the dispersion parameter.
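
The standardization can be reproduced by hand, as in the sketch below. It
assumes a current version of the stats package, in which
lm.influence()$sigma holds the deleted scale estimates and
summary()$cov.unscaled holds the unscaled coefficient covariance matrix;
the model itself is again only illustrative.

```r
# Illustrative Poisson GLM; data set and formula are assumptions.
fit <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)

infl <- lm.influence(fit)

# Unscaled standard errors: square roots of the diagonal of the
# unscaled covariance matrix from the final weighted least-squares step.
se_unscaled <- sqrt(diag(summary(fit)$cov.unscaled))

# Row i of dfbeta() is divided by sigma_(i) * se_unscaled, where
# sigma_(i) is the scale estimate with observation i deleted.
manual <- dfbeta(fit) / outer(infl$sigma, se_unscaled)

all.equal(unname(manual), unname(dfbetas(fit)))
```
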
>The documentation doesn't provide any details. Any help would be greatly
>appreciated.
I hope that this helps,
John
-----------------------------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: jfox at mcmaster.ca
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox