[R] fixed effects
Liaw, Andy
andy_liaw at merck.com
Tue Mar 28 04:46:17 CEST 2006
I guess you meant X has 20,000 levels (and 40 observations each)? In that
case lm() will attempt to create a design matrix that's 8e5 by 2e4, and
that's unlikely to fit in the RAM.
It is very easy to compute by hand. I'm using a smaller data size first to
check the result against summary(lm()), then the size you specified
(hopefully with more correct arithmetics...) to show that it's quite doable
even on a modest laptop:
> set.seed(1)
> y <- rnorm(80)
> x <- factor(rep(paste("x", 1:20, sep=""), 4))
> summary(lm(y ~ x))$r.squared
[1] 0.2555885
> rsq <- function(x, y) {
+ sse <- sum(ave(y, x, FUN=function(x) x - mean(x))^2)
+ sstotal <- var(y) * (length(y) - 1)
+ 1 - sse / sstotal
+ }
> rsq(x, y)
[1] 0.2555885
> set.seed(1)
> y <- rnorm(8e5)
> x <- factor(rep(paste("x", 1:2e4, sep=""), 40))
> system.time(rsq(x, y))
[1] 1.99 0.03 2.06 NA NA
Andy
From: ivo welch
>
> dear R wizards:
>
> X is factor with 20,000*20=800,000 observations of 20,000 factors.
> I.e., each factor has 20 observations. y is 800,000 normally
> distributed data points. I want to see how much R^2 the X
> factors can provide. Easy, right?
>
> > lm ( y ~ X)
> and
> > aov( y ~ X)
> Error: cannot allocate vector of size 3125000 Kb
>
> is this computationally infeasible? (I am not an expert, but
> off-hand, I thought this can work as long as the X's are just
> factors = fitted means.)
>
> > help.search("fixed.effects");
>
> fixed.effects(nlme) Extract Fixed Effects
> fixed.effects.lmList(nlme) Extract lmList Fixed Effects
> lme(nlme) Linear Mixed-Effects Models
> lmeStruct(nlme) Linear Mixed-Effects Structure
> nlme(nlme) Nonlinear Mixed-Effects Models
> nlmeStruct(nlme) Nonlinear Mixed-Effects Structure
>
> Type 'help(FOO, package = PKG)' to inspect entry 'FOO(PKG) TITLE'.
>
> ok---I want to read the fixed.effects help. could the help
> system tell me how to inspect these entries?
>
> > help("fixed.effects")
> wrong
> > help.search("fixed.effects")
> two entries, as above. nothing new.
> > ?fixed.effects
> wrong
>
> eventually, it dawned on me that nlme in parens was not a
> function argument, but the name of the package within which
> fixed.effects lives. Suggestion: maybe a different notation
> to name packages would be more intuitive in the help system.
> yes, I know it now, but other novices may not. even a colon
> instead of a () may be more intuitive in this context.
>
> > library(nlme); ?lme
> and then
> > lme(y ~ X)
> Error in getGroups.data.frame(dataMix, groups) :
> Invalid formula for groups
>
>
> now I have to beg for some help. ok, blatant free-riding.
> the lme docs tells me it does the Laird and Ware model, but I
> do not know this model. the only two examples given at the
> end of the lme help file seem to be similar to what I just
> specified. so, how do I execute a simple fixed-effects
> model? (later, I may want to add a variable Z that is a
> continuous random variable.) could someone please give me
> one quick example? help is, as always, highly appreciated.
>
> sincerely,
>
> /ivo welch
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
>
>
More information about the R-help
mailing list