[R] lm() with many responses

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Apr 13 08:22:06 CEST 2005


On Tue, 12 Apr 2005, John Pitney wrote:

> I have one array of predictors, one observation per row, and one array of 
> responses, also arranged one observation per row.  I arrange these into a 
> data.frame and call lm() with a pasted-together formula.
>
> I would like to call lm() with a number of responses in excess of 100, but 
> for some reason, 39 seems to be a limit.  Why do I get an "invalid variable 
> names" error from model.frame() when supplying 40 or more responses?

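Your response expression is too long.  Create the response matrix first 
and use that as the left-hand side of the formula, rather than pasting a 
long cbind(...) expression into it.

There is a 500-char internal limit on variable names in 
model.frame.default, and the whole cbind(...) call counts as a single 
variable name.  That should be enough ....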
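
A minimal sketch of the suggestion above, reusing the names from the 
posted example (the sizes and the use of `.` on the right-hand side are 
illustrative assumptions, not part of the original post).  The response 
matrix is built once and referred to by name, so the name of the response 
variable stays short however many columns it has:

```r
set.seed(1)
n.obs <- 10; n.resp <- 100; n.pred <- 2

# Response matrix: one observation per row, one response per column.
my.resp <- matrix(runif(n.resp * n.obs), nrow = n.obs)
colnames(my.resp) <- paste("Response", 1:n.resp, sep = ".")

# Predictors go into the data frame as before.
d.tmp <- as.data.frame(matrix(runif(n.pred * n.obs), nrow = n.obs))
names(d.tmp) <- paste("Predictor", 1:n.pred, sep = ".")

# The left-hand side is just the matrix name; "." expands to every
# column of d.tmp, i.e. all the predictors.
my.lm <- lm(my.resp ~ ., data = d.tmp)

dim(coef(my.lm))   # one row per coefficient, one column per response
```

The matrix is looked up in the environment of the formula, so this also 
works inside a function as long as my.resp is defined there.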

> As a workaround, I can loop through groups of 39 responses in separate 
> calls to lm(), but that seems inefficient and possibly version- or 
> platform-dependent.
>
> Here is my best effort at a minimal example showing the problem.

It's not easy to cut-and-paste, though.

> --- begin pasted R session ---
>> test.this <- function(n.resp, n.obs, n.pred) {
> + my.resp <- matrix(runif(n.resp * n.obs), nrow=n.obs)
> + my.resp.names <- paste("Response", 1:n.resp, sep=".")
> + my.pred <- matrix(runif(n.pred * n.obs), nrow=n.obs)
> + my.pred.names <- paste("Predictor", 1:n.pred, sep=".")
> + my.formula <- as.formula(paste("cbind(",
> +   paste(my.resp.names, collapse=", "), ") ~ ",
> +   paste(my.pred.names, collapse=" + ")))
> + d.tmp <- cbind(my.pred, my.resp)
> + d.tmp <- as.data.frame(d.tmp)
> + names(d.tmp) <- c(my.pred.names, my.resp.names)
> + my.lm <- lm(my.formula, data=d.tmp, model=F, qr=F, x=F, y=F,
> +   na.action=na.exclude)
> + my.lm
> + }
>> # Now, try it.  39 response vectors is OK, but 40 causes an error:
>> m1 <- test.this(40, 10, 2)
> Error in model.frame(formula, rownames, variables, varnames, extras, 
> extranames,  :
>        invalid variable names
>> m1 <- test.this(39, 10, 2)
>> # No error for n.resp == 39.
>> # Also, shouldn't "qr=F" in the call to lm() turn off output of m1$qr?

Only if it were implemented.
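
If memory is the concern, one possible workaround (a sketch on made-up 
data, not something the argument does for you in this version) is to drop 
the component by hand after fitting.  The variable names below are 
hypothetical.  Note that functions which rely on the QR decomposition, 
such as summary(), are likely to fail on the stripped object; more recent 
versions of R may honour qr = FALSE directly.

```r
set.seed(1)
d <- data.frame(x1 = runif(10), x2 = runif(10))
y <- matrix(runif(10 * 5), nrow = 10)     # 5 responses, 10 observations

fit <- lm(y ~ x1 + x2, data = d, model = FALSE)
size.before <- object.size(fit)

# Remove the stored QR decomposition; the coefficients, residuals and
# fitted values are separate components and remain available.
fit$qr <- NULL
size.after <- object.size(fit)

coef(fit)                                 # still works without fit$qr
```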

>> # m1$qr exists.  I'd like to save memory and omit it if possible.
>> str(m1$qr)
> List of 5
> $ qr   : num [1:10, 1:3] -3.162  0.316  0.316  0.316  0.316 ...
>  ..- attr(*, "dimnames")=List of 2
>  .. ..$ : chr [1:10] "1" "2" "3" "4" ...
>  .. ..$ : chr [1:3] "(Intercept)" "Predictor.1" "Predictor.2"
>  ..- attr(*, "assign")= int [1:3] 0 1 2
> $ qraux: num [1:3] 1.32 1.34 1.42
> $ pivot: int [1:3] 1 2 3
> $ tol  : num 1e-07
> $ rank : int 3
> - attr(*, "class")= chr "qr"
>> # Here's my version:
>> version
>         _
> platform i386-pc-mingw32
> arch     i386
> os       mingw32
> system   i386, mingw32
> status
> major    2
> minor    0.1
> year     2004
> month    11
> day      15
> language R
> --- end pasted R session ---
>
> Best regards,
> John

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



