[Rd] proposal: allowing alternative variance estimators in glm/lm
tlumley at u.washington.edu
Tue Jan 2 19:03:02 CET 2007
On Tue, 2 Jan 2007, ivo welch wrote:
> As a non-statistician user of R, maybe a hook functionality at strategic
> places could provide some flexibility without too much pain. I think
> replacing the standard output from summary.lm would be a bad idea (it
> could easily create errors downstream, when idiots like myself ask "why
> don't I get the s.e. that stata produces? duh---you loaded
> heteroskedasticity adjustment, but forgot about it). But I think some
> flexibility to add more information would be a very good thing.
> Hooks that can be set by functions (perhaps cascades) would allow third
> parties to create additional statistics, that could survive future
> changes to the functions themselves, without requiring a full object
> paradigm. For example, summary.lm could provide two hooks that allow
> programmers to chain my own objects to either the ans$coefficients and
> the ans object. (I guess even one hook would do.) Well-thought-out
> hooks could also add to print methods, etc., without requiring complete
> function rewrites, and would survive future changes in the real R code
In some sense this is what I was proposing -- I think that summary.lm and
predict.lm should get coefficients from coef() and variance matrices from
vcov() to provide hooks.
This should ideally happen automatically by inheritance, but for
historical reasons all the basic code was written for the most specialized
case (classical linear model) rather than the most general case.
>> From the perspective of a first-time amateurish end-user, an invokation
> of "library(lm.addnormalized)" could then magically always add a
> normalized coefficient to the coefficient output. An invokation of
> "library(lm.addheteroskedasticity)" could magically always add
> heteroskedasticity se's and T-stats. And so on.
I really think you want these properties to be part of the object, not
global options (and particularly not global options set by the library()
function). It should be possible to simultaneously have objects that
use the fisher-information standard errors and other objects that use
I would say that "using sandwich standard errors" is a property that
should be specified when the object is created (as it is for
survival::coxph() and gee::gee()).
"Displaying normalized coefficients" is a bit different. I don't think
you want to override what the coef() method produces, since that will mess
up fitted values and predictions. You would also need to override the
vcov() method, and that would cause other problems.
So in order to do what you want, the objects would need a printedcoef()
and printedSE() method in addition to the actual coef() and vcov(). These
would have to be called by a summary-like function. But even this isn't
the level of generality that you would want, since if you are doing this
work it would be nice to allow transformed coefficients (eg odds ratios
rather than log odds ratios for logistic regression), and you probably
then need a printedCI() method for confidence intervals. This would give
quite a useful modelsummary() function (provided it did something
sensible when the additional methods weren't present). The Design
package does things along these lines already, but its display functions
don't work on vanilla model objects.
In any case, I don't think this function should be called summary.lm(). My
proposal was just to allow predict.lm and summary.lm to be used in cases
where they already give the right sort of results and in addition to tidy
up the duplication of variance-matrix code.
More information about the R-devel