# [R] fixed effects

Liaw, Andy andy_liaw at merck.com
Tue Mar 28 05:01:45 CEST 2006

```Hmm... didn't read the whole post before I hit `send'...

I think you basically will have to fit the model `by hand', which is not
that hard, given the simple structure of the model(s).  The formulae for the
quantities of interests are quite straightforward and easy to code in R
(similar to the rsq function I showed).  Adding a continuous variable to the
model simply requires regressing the residuals; i.e., ave(y, x, function(x)
x - mean(x)), on the new variable z (easy for lm()), and take one more
degree of freedom from SS(total).

Andy

From: Liaw, Andy
>
> I guess you meant X has 20,000 levels (and 40 observations
> each)?  In that case lm() will attempt to create a design
> matrix that's 8e5 by 2e4, and that's unlikely to fit in the RAM.
>
> It is very easy to compute by hand.  I'm using a smaller data
> size first to check the result against summary(lm()), then
> the size you specified (hopefully with more correct
> arithmetics...) to show that it's quite doable even on a
> modest laptop:
>
> > set.seed(1)
> > y <- rnorm(80)
> > x <- factor(rep(paste("x", 1:20, sep=""), 4))
> > summary(lm(y ~ x))\$r.squared
> [1] 0.2555885
> > rsq <- function(x, y) {
> +     sse <- sum(ave(y, x, FUN=function(x) x - mean(x))^2)
> +     sstotal <- var(y) * (length(y) - 1)
> +     1 - sse / sstotal
> + }
> > rsq(x, y)
> [1] 0.2555885
> > set.seed(1)
> > y <- rnorm(8e5)
> > x <- factor(rep(paste("x", 1:2e4, sep=""), 40))
> system.time(rsq(x, y))
> [1] 1.99 0.03 2.06   NA   NA
>
> Andy
>
>
> From: ivo welch
> >
> > dear R wizards:
> >
> > X is factor with 20,000*20=800,000 observations of 20,000 factors.
> > I.e., each factor has 20 observations.  y is 800,000 normally
> > distributed data points.  I want to see how much R^2 the X
> > factors can provide.  Easy, right?
> >
> > > lm ( y ~ X)
> > and
> > >  aov( y ~ X)
> > Error: cannot allocate vector of size 3125000 Kb
> >
> > is this computationally infeasible?  (I am not an expert, but
> > off-hand, I thought this can work as long as the X's are just
> > factors = fitted means.)
> >
> > > help.search("fixed.effects");
> >
> > fixed.effects(nlme)                 Extract Fixed Effects
> > fixed.effects.lmList(nlme)          Extract lmList Fixed Effects
> > lme(nlme)                           Linear Mixed-Effects Models
> > lmeStruct(nlme)                     Linear Mixed-Effects Structure
> > nlme(nlme)                          Nonlinear Mixed-Effects Models
> > nlmeStruct(nlme)                    Nonlinear Mixed-Effects
> Structure
> >
> > Type 'help(FOO, package = PKG)' to inspect entry 'FOO(PKG) TITLE'.
> >
> > ok---I want to read the fixed.effects help.  could the help
> > system tell me how to inspect these entries?
> >
> > > help("fixed.effects")
> > wrong
> > > help.search("fixed.effects")
> > two entries, as above.  nothing new.
> > > ?fixed.effects
> > wrong
> >
> > eventually, it dawned on me that nlme in parens was not a
> > function argument, but the name of the package within which
> > fixed.effects lives.  Suggestion:  maybe a different notation
> > to name packages would be more intuitive in the help system.
> > yes, I know it now, but other novices may not.  even a colon
> > instead of a () may be more intuitive in this context.
> >
> > > library(nlme); ?lme
> > and then
> > > lme(y ~ X)
> > Error in getGroups.data.frame(dataMix, groups) :
> >         Invalid formula for groups
> >
> >
> > now I have to beg for some help.  ok, blatant free-riding.
> > the lme docs tells me it does the Laird and Ware model, but I
> > do not know this model.  the only two examples given at the
> > end of the lme help file seem to be similar to what I just
> > specified.  so, how do I execute a simple fixed-effects
> > model?  (later, I may want to add a variable Z that is a
> > continuous random variable.)  could someone please give me
> > one quick example?  help is, as always, highly appreciated.
> >
> > sincerely,
> >
> > /ivo welch
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > http://www.R-project.org/posting-guide.html
> >
> >
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> http://www.R-project.org/posting-guide.html
>
> --------------------------------------------------------------
> ----------------
> Notice:  This e-mail message, together with any attachments,
> contains information of Merck & Co., Inc. (One Merck Drive,
> Whitehouse Station, New Jersey, USA 08889), and/or its
> affiliates (which may be known outside the United States as
> Merck Frosst, Merck Sharp & Dohme or MSD and in Japan, as
> Banyu) that may be confidential, proprietary copyrighted
> and/or legally privileged. It is intended solely for the use
> of the individual or entity named on this message.  If you
> are not the intended recipient, and have received this