[R] Improving function that estimates regressions for all variables specified

Jorge Cimentada cimentadaj at gmail.com
Fri Aug 26 20:11:15 CEST 2016

Hi, I'd like some feedback on how to make this function quicker.

I normally run several regressions like this:
y ~ x1
y ~ x1 + x2
y ~ x1 + x2 + xn

Instead, I created a function in which I specify y, x1 and x2 and the
function automatically generates:
y ~ x1
y ~ x1 + x2
y ~ x1 + x2 + xn

This is the function:

models <- function(dv, covariates, data) {
    # Start from an intercept-only model, then add the covariates cumulatively
    dv <- paste(dv, "~ 1")
    combinations <- lapply(seq_along(covariates), function(i) seq_len(i))
    formulas <- lapply(combinations, function(p)
        as.formula(paste(c(dv, covariates[p]), collapse = " + ")))
    # Fit one lm per cumulative formula and return the list of fits
    lapply(formulas, function(o) lm(o, data = data))
}

And an example:
models("mpg",c("cyl","disp","hp","am"), mtcars)
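For comparison, the same cumulative formulas could probably also be built with reformulate() instead of pasting strings. This is only a sketch: cumulative_formulas is a made-up helper name, not part of my function, and it assumes the covariates are given as a character vector as above.

```r
# Hypothetical helper: builds y ~ x1, y ~ x1 + x2, ... without fitting anything
cumulative_formulas <- function(dv, covariates) {
    lapply(seq_along(covariates), function(i) {
        reformulate(covariates[seq_len(i)], response = dv)
    })
}

fs <- cumulative_formulas("mpg", c("cyl", "disp", "hp", "am"))

# Fit one lm per formula, as models() does
fits <- lapply(fs, function(f) lm(f, data = mtcars))
```

Keeping the formula step separate from the fitting step makes it easy to inspect the formulas before any model is run.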

I'm concerned about the time it takes when using other regression
models, such as those from the survey package (I know these models are heavy
and take time), but I'm sure the function has room for improvement.
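To make the "other regression models" case concrete, here is a rough sketch in which the fitting function is passed in as an argument. The names models_with and fit are invented for illustration. The example uses glm() from base R, and I assume the same pattern would carry over to something like survey::svyglm with a design argument, but I have not tested that here. My guess is that nearly all the time goes into the fits themselves and not into building the formulas.

```r
# Hypothetical variant of models(): `fit` is any function(formula, ...) returning a model
models_with <- function(dv, covariates, fit, ...) {
    formulas <- lapply(seq_along(covariates), function(i) {
        reformulate(covariates[seq_len(i)], response = dv)
    })
    # Extra arguments in ... (data, family, etc.) are forwarded to every fit
    lapply(formulas, function(f) fit(f, ...))
}

# Same cumulative models, fitted with glm() instead of lm()
res <- models_with("carb", c("hp", "wt"), fit = glm,
                   data = mtcars, family = poisson)
```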

I'd also like to specify the variables as a formula. I managed to do it but
I get different results when using things like scale() for predictors.

Formula version of the function:
models2 <- function(formula, data) {
    # First variable is the response; the rest are taken as covariates
    dv <- paste(all.vars(formula)[1], "~ 1")
    covariates <- all.vars(formula)[-1]
    combinations <- lapply(seq_along(covariates), function(i) seq_len(i))
    lfo <- lapply(combinations, function(p)
        as.formula(paste(c(dv, covariates[p]), collapse = " + ")))
    # Fit one lm per cumulative formula and return the list of fits
    lapply(lfo, function(o) lm(o, data = data))
}

models("mpg",c("cyl","scale(disp)"), mtcars)

models2(mpg ~ cyl + scale(disp), mtcars)

See the difference between the disp variables?
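A likely cause is that all.vars() returns only variable names, so the scale() wrapper is dropped before the formulas are rebuilt. The sketch below takes the term labels from terms() instead. models3 is a hypothetical name, and it assumes a plain additive formula with a single response; I have not thought through interactions or a "." on the right-hand side.

```r
# all.vars() keeps only variable names, so the scale() call is lost
vars <- all.vars(mpg ~ cyl + scale(disp))

# Term labels keep each right-hand-side term as it was written
labs <- attr(terms(mpg ~ cyl + scale(disp)), "term.labels")

# Hypothetical formula version built on term labels instead of all.vars()
models3 <- function(formula, data) {
    dv <- deparse(formula[[2]])
    covariates <- attr(terms(formula), "term.labels")
    lapply(seq_along(covariates), function(i) {
        lm(reformulate(covariates[seq_len(i)], response = dv), data = data)
    })
}

out <- models3(mpg ~ cyl + scale(disp), mtcars)
```

With this version the second model should contain a scale(disp) coefficient, matching what models("mpg", c("cyl", "scale(disp)"), mtcars) gives.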

Any feedback is appreciated!

*Jorge Cimentada*
*Ph.D. Candidate*
Dpt. Ciències Polítiques i Socials
Ramon Trias Fargas, 25-27 | 08005 Barcelona

Office 24.331
[Tel.] 697 382 009
www.jorgecimentada.com
