[R] loops over regression models

Xu Jun junxu.r at gmail.com
Fri Jan 13 04:01:34 CET 2012


Dear R help listers,

I am trying to replicate the results in Chapter 3 of Gelman and Hill's
book on regression and multilevel models. Below I estimate two models
(chp3.1 and chp3.3 in the R code) with the same data and dependent
variable but different independent variables. I have been using Stata
for quite a while, and I know I can use foreach to build a loop that
condenses the code (especially if I have a large number of models to
run).

In Stata, it would be something like:

****************************************************
// read in data
use kidiq, clear

// run two regressions
reg kid_score mom_hs
reg kid_score mom_iq

// the next three lines are equivalent to the previous two lines
foreach var in mom_hs mom_iq {
 reg kid_score `var'
}
***************************************************


So I want to figure out how to do this in R. Below is my code:

####################################################
library(foreign)
# read in stata data file
kidiq <- data.frame(read.dta('kidiq.dta', convert.factors = FALSE))

# bivariate regressions
chp3.1 <- lm(kid_score ~ mom_hs, data=kidiq)
summary(chp3.1)

chp3.3 <- lm(kid_score ~ mom_iq, data=kidiq)
summary(chp3.3)

clist <- c("mom_iq", "mom_hs")

for (x in clist) {
  lm(kid_score ~ x, data = kidiq)

}
Error in model.frame.default(formula = kid_score ~ x, data = kidiq,
drop.unused.levels = TRUE) :
  variable lengths differ (found for 'x')
##################################################

But I got an error message saying that the variable lengths differ. I
tried various ways to work around this; for example, I tried:

clist <- c("mom_iq", "mom_hs")

for (x in 1:length(clist)) {
  lm(kid_score ~ clist[x], data = kidiq)

}



But none of these attempts worked, so I am wondering if anyone could
give me a hint. Thanks a lot.
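
A note on the likely cause, offered as a sketch rather than a definitive
answer: in kid_score ~ x, R looks for a variable called x. It finds the
length-one character string from the loop (for example "mom_iq"), not the
column of that name, hence "variable lengths differ". One way around this
is to build the formula from the string, for instance with reformulate()
from the stats package. The simulated data frame below is a hypothetical
stand-in for kidiq.dta so that the example runs on its own; only the
column names are taken from the code above.

```r
# Hypothetical stand-in for kidiq.dta (assumption: simulated values,
# only the column names match the real data set).
set.seed(1)
kidiq <- data.frame(
  kid_score = rnorm(50, mean = 87, sd = 20),
  mom_hs    = rbinom(50, size = 1, prob = 0.8),
  mom_iq    = rnorm(50, mean = 100, sd = 15)
)

clist <- c("mom_hs", "mom_iq")

# Build each formula from the variable name, then fit and store the model.
fits <- list()
for (x in clist) {
  f <- reformulate(x, response = "kid_score")  # e.g. kid_score ~ mom_hs
  fits[[x]] <- lm(f, data = kidiq)
}

# Results are not auto-printed inside a for loop, so the fitted models are
# stored in a list and summarised explicitly afterwards.
lapply(fits, summary)
```

Storing the models in a named list keeps each fit available afterwards,
e.g. fits[["mom_iq"]], much like chp3.1 and chp3.3 above.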

Jun Xu, PhD
Assistant Professor
Department of Sociology
Ball State University
Muncie, IN


