[R] esthetics --- extending the lm command to fixed effects?

Thomas Lumley tlumley at u.washington.edu
Thu May 20 17:30:52 CEST 2010


On Thu, 20 May 2010, ivo welch wrote:

> dear R wizards:
>
> not important.  more a curiosity or esthetics question.
>
> is there a way to extend the standard lm command, so that it takes a new
> argument that handles fixed effects?   right now, I have (provided to me
> from an expert---I would have never figured this one out):
>
>   diffid <- function(h,id) {
>       id <- as.factor(id)[, drop=TRUE]
>       apply(as.matrix(h), 2, function(x) x - tapply(x,id,mean)[id])
>   }

Simpler would be

    diffid <- function(h, id) { h - ave(h, id) }
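
To make the behaviour concrete, here is a minimal sketch on made-up data (the vectors y and firmid below are illustrative assumptions, not taken from the original question). Only the vector case is shown.

```r
# Illustrative data: two firms, three observations each (made up for this sketch)
firmid <- c("a", "a", "a", "b", "b", "b")
y      <- c(1, 2, 3, 10, 20, 30)

# ave() returns, for every observation, the mean of its own group,
# so subtracting it removes the firm-level mean
diffid <- function(h, id) { h - ave(h, id) }

y_demeaned <- diffid(y, firmid)
print(y_demeaned)   # -1 0 1 -10 0 10
```

Firm "a" has mean 2 and firm "b" has mean 20, so each observation ends up expressed as a deviation from its own firm's mean.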

> which is used as
>
>     r= lm( diffid(y, firmid) ~ diffid(x, firmid ) )
>
> it works, but it would be much nicer if I could just write
>
>    r= lm( y ~ x + z, fixed.effects=firmid )
>
> does this already exists as a package?  or has someone figured out how to
> program this?

I would just have used lm(y~x+z+factor(firmid)).  Admittedly, you get a whole bunch of uninteresting coefficients in the output, but it's not that hard to subset them out.
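
A rough sketch of that subsetting, on simulated data (the data-generating code, the helper name demean, and the regular expression are assumptions for illustration only). It also checks that the slopes on x and z agree with the ones from the demeaned regression.

```r
set.seed(1)
# Simulated panel: 5 firms, 20 observations each (illustrative only)
firmid <- rep(1:5, each = 20)
x <- rnorm(100) + firmid
z <- rnorm(100)
y <- 2 * x - z + 3 * firmid + rnorm(100)

# Fixed effects entered as dummy variables
fit <- lm(y ~ x + z + factor(firmid))

# Drop the firm dummies from the coefficient vector
b    <- coef(fit)
keep <- !grepl("^factor\\(firmid\\)", names(b))
print(b[keep])   # (Intercept), x, z

# Same slopes from the within (demeaned) regression
demean <- function(h, id) h - ave(h, id)
fit_within <- lm(demean(y, firmid) ~ demean(x, firmid) + demean(z, firmid))
same_slopes <- all.equal(unname(coef(fit)[c("x", "z")]),
                         unname(coef(fit_within)[2:3]))
print(same_slopes)
```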

There are two implementations of this in Bill Venables' course notes on advanced programming. I think they are also in 'S Programming', but I can't find my copy right now.  These were motivated by computational problems: the full design matrix for the linear model was too large for memory at the time (last century).


As a final note, I would strongly discourage
    r= lm( y ~ x + z, fixed.effects=firmid )
as a specification, and would argue for
    r= lm( y ~ x + z, fixed.effects=~firmid )

I think the ability to have some subset of the arguments in a modelling call silently treated as formulas was a bad decision, although it must have looked user-friendly at the time.
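
For illustration, a wrapper with that interface might look roughly like the sketch below. The function lm_fe is hypothetical (it is not an existing function in R or in any package I know of), and it simply falls back on the factor() approach rather than doing anything clever about memory.

```r
# Hypothetical wrapper: fixed.effects must be a one-sided formula such as ~firmid
lm_fe <- function(formula, fixed.effects, data) {
  # Refuse anything that is not a one-sided formula (e.g. a bare vector)
  stopifnot(inherits(fixed.effects, "formula"), length(fixed.effects) == 2L)

  # Turn each fixed-effect variable into a factor() term and append it
  fe_terms <- sprintf("factor(%s)", all.vars(fixed.effects))
  extra    <- as.formula(paste("~ . +", paste(fe_terms, collapse = " + ")))
  full     <- update(formula, extra)

  fit <- lm(full, data = data)

  # Return only the coefficients that are not fixed-effect dummies
  b     <- coef(fit)
  is_fe <- rep(FALSE, length(b))
  for (p in fe_terms) is_fe <- is_fe | startsWith(names(b), p)
  b[!is_fe]
}

# Usage on simulated data (illustrative only)
set.seed(42)
d <- data.frame(firmid = rep(letters[1:4], each = 25),
                x = rnorm(100), z = rnorm(100))
d$y <- 1.5 * d$x + 0.5 * d$z + as.integer(factor(d$firmid)) + rnorm(100)

est <- lm_fe(y ~ x + z, fixed.effects = ~firmid, data = d)
print(est)
```

With the formula version, firmid is looked up in data just like y, x and z, and the function can check that it really was given a formula.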

          -thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle


