[R] fixed effects

Liaw, Andy andy_liaw at merck.com
Tue Mar 28 05:01:45 CEST 2006


Hmm... didn't read the whole post before I hit `send'...

I think you basically will have to fit the model `by hand', which is not
that hard, given the simple structure of the model(s).  The formulae for the
quantities of interests are quite straightforward and easy to code in R
(similar to the rsq function I showed).  Adding a continuous variable to the
model simply requires regressing the residuals; i.e., ave(y, x, function(x)
x - mean(x)), on the new variable z (easy for lm()), and take one more
degree of freedom from SS(total).

Andy

From: Liaw, Andy
> 
> I guess you meant X has 20,000 levels (and 40 observations 
> each)?  In that case lm() will attempt to create a design 
> matrix that's 8e5 by 2e4, and that's unlikely to fit in the RAM.
> 
> It is very easy to compute by hand.  I'm using a smaller data 
> size first to check the result against summary(lm()), then 
> the size you specified (hopefully with more correct 
> arithmetics...) to show that it's quite doable even on a 
> modest laptop:
> 
> > set.seed(1)
> > y <- rnorm(80)
> > x <- factor(rep(paste("x", 1:20, sep=""), 4))
> > summary(lm(y ~ x))$r.squared
> [1] 0.2555885
> > rsq <- function(x, y) {
> +     sse <- sum(ave(y, x, FUN=function(x) x - mean(x))^2)
> +     sstotal <- var(y) * (length(y) - 1)
> +     1 - sse / sstotal
> + }
> > rsq(x, y)
> [1] 0.2555885
> > set.seed(1)
> > y <- rnorm(8e5)
> > x <- factor(rep(paste("x", 1:2e4, sep=""), 40)) 
> system.time(rsq(x, y))
> [1] 1.99 0.03 2.06   NA   NA
> 
> Andy
>  
> 
> From: ivo welch
> > 
> > dear R wizards:
> > 
> > X is factor with 20,000*20=800,000 observations of 20,000 factors.
> > I.e., each factor has 20 observations.  y is 800,000 normally 
> > distributed data points.  I want to see how much R^2 the X 
> > factors can provide.  Easy, right?
> > 
> > > lm ( y ~ X)
> > and
> > >  aov( y ~ X)
> > Error: cannot allocate vector of size 3125000 Kb
> > 
> > is this computationally infeasible?  (I am not an expert, but
> > off-hand, I thought this can work as long as the X's are just 
> > factors = fitted means.)
> > 
> > > help.search("fixed.effects");
> > 
> > fixed.effects(nlme)                 Extract Fixed Effects
> > fixed.effects.lmList(nlme)          Extract lmList Fixed Effects
> > lme(nlme)                           Linear Mixed-Effects Models
> > lmeStruct(nlme)                     Linear Mixed-Effects Structure
> > nlme(nlme)                          Nonlinear Mixed-Effects Models
> > nlmeStruct(nlme)                    Nonlinear Mixed-Effects 
> Structure
> > 
> > Type 'help(FOO, package = PKG)' to inspect entry 'FOO(PKG) TITLE'.
> > 
> > ok---I want to read the fixed.effects help.  could the help
> > system tell me how to inspect these entries?
> > 
> > > help("fixed.effects")
> > wrong
> > > help.search("fixed.effects")
> > two entries, as above.  nothing new.
> > > ?fixed.effects
> > wrong
> > 
> > eventually, it dawned on me that nlme in parens was not a
> > function argument, but the name of the package within which 
> > fixed.effects lives.  Suggestion:  maybe a different notation 
> > to name packages would be more intuitive in the help system.  
> > yes, I know it now, but other novices may not.  even a colon 
> > instead of a () may be more intuitive in this context.
> > 
> > > library(nlme); ?lme
> > and then
> > > lme(y ~ X)
> > Error in getGroups.data.frame(dataMix, groups) :
> >         Invalid formula for groups
> > 
> > 
> > now I have to beg for some help.  ok, blatant free-riding.
> > the lme docs tells me it does the Laird and Ware model, but I 
> > do not know this model.  the only two examples given at the 
> > end of the lme help file seem to be similar to what I just 
> > specified.  so, how do I execute a simple fixed-effects 
> > model?  (later, I may want to add a variable Z that is a 
> > continuous random variable.)  could someone please give me 
> > one quick example?  help is, as always, highly appreciated.
> > 
> > sincerely,
> > 
> > /ivo welch
> > 
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! 
> > http://www.R-project.org/posting-guide.html
> > 
> >
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list 
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> 
> --------------------------------------------------------------
> ----------------
> Notice:  This e-mail message, together with any attachments, 
> contains information of Merck & Co., Inc. (One Merck Drive, 
> Whitehouse Station, New Jersey, USA 08889), and/or its 
> affiliates (which may be known outside the United States as 
> Merck Frosst, Merck Sharp & Dohme or MSD and in Japan, as 
> Banyu) that may be confidential, proprietary copyrighted 
> and/or legally privileged. It is intended solely for the use 
> of the individual or entity named on this message.  If you 
> are not the intended recipient, and have received this 
> message in error, please notify us immediately by reply 
> e-mail and then delete it from your system.
> --------------------------------------------------------------
> ----------------
>




More information about the R-help mailing list