# [R] fixed effects

Liaw, Andy andy_liaw at merck.com
Tue Mar 28 04:46:17 CEST 2006

```I guess you meant X has 20,000 levels (and 40 observations each)?  In that
case lm() will attempt to create a design matrix that's 8e5 by 2e4, and
that's unlikely to fit in the RAM.

It is very easy to compute by hand.  I'm using a smaller data size first to
check the result against summary(lm()), then the size you specified
(hopefully with more correct arithmetics...) to show that it's quite doable
even on a modest laptop:

> set.seed(1)
> y <- rnorm(80)
> x <- factor(rep(paste("x", 1:20, sep=""), 4))
> summary(lm(y ~ x))\$r.squared
[1] 0.2555885
> rsq <- function(x, y) {
+     sse <- sum(ave(y, x, FUN=function(x) x - mean(x))^2)
+     sstotal <- var(y) * (length(y) - 1)
+     1 - sse / sstotal
+ }
> rsq(x, y)
[1] 0.2555885
> set.seed(1)
> y <- rnorm(8e5)
> x <- factor(rep(paste("x", 1:2e4, sep=""), 40))
> system.time(rsq(x, y))
[1] 1.99 0.03 2.06   NA   NA

Andy

From: ivo welch
>
> dear R wizards:
>
> X is factor with 20,000*20=800,000 observations of 20,000 factors.
> I.e., each factor has 20 observations.  y is 800,000 normally
> distributed data points.  I want to see how much R^2 the X
> factors can provide.  Easy, right?
>
> > lm ( y ~ X)
> and
> >  aov( y ~ X)
> Error: cannot allocate vector of size 3125000 Kb
>
> is this computationally infeasible?  (I am not an expert, but
> off-hand, I thought this can work as long as the X's are just
> factors = fitted means.)
>
> > help.search("fixed.effects");
>
> fixed.effects(nlme)                 Extract Fixed Effects
> fixed.effects.lmList(nlme)          Extract lmList Fixed Effects
> lme(nlme)                           Linear Mixed-Effects Models
> lmeStruct(nlme)                     Linear Mixed-Effects Structure
> nlme(nlme)                          Nonlinear Mixed-Effects Models
> nlmeStruct(nlme)                    Nonlinear Mixed-Effects Structure
>
> Type 'help(FOO, package = PKG)' to inspect entry 'FOO(PKG) TITLE'.
>
> ok---I want to read the fixed.effects help.  could the help
> system tell me how to inspect these entries?
>
> > help("fixed.effects")
> wrong
> > help.search("fixed.effects")
> two entries, as above.  nothing new.
> > ?fixed.effects
> wrong
>
> eventually, it dawned on me that nlme in parens was not a
> function argument, but the name of the package within which
> fixed.effects lives.  Suggestion:  maybe a different notation
> to name packages would be more intuitive in the help system.
> yes, I know it now, but other novices may not.  even a colon
> instead of a () may be more intuitive in this context.
>
> > library(nlme); ?lme
> and then
> > lme(y ~ X)
> Error in getGroups.data.frame(dataMix, groups) :
>         Invalid formula for groups
>
>
> now I have to beg for some help.  ok, blatant free-riding.
> the lme docs tells me it does the Laird and Ware model, but I
> do not know this model.  the only two examples given at the
> end of the lme help file seem to be similar to what I just
> specified.  so, how do I execute a simple fixed-effects
> model?  (later, I may want to add a variable Z that is a
> continuous random variable.)  could someone please give me
> one quick example?  help is, as always, highly appreciated.
>
> sincerely,
>
> /ivo welch
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help