[R] Generating all possible models from full model

Daniel Malter daniel at umd.edu
Wed May 19 22:29:13 CEST 2010


Hi, one approach is documented below. The function should work with any
regression function that follows the syntax of lm (others will need
adjustments). Note that you would have to create the interaction terms by
hand (which is no big deal if there are just a few). Note also that this
approach can be highly problematic if you are scrounging for significant
relationships (how problematic depends on the field and on the specific
intention with which these analyses are performed).
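
As a minimal sketch of what "by hand" could look like: multiply the two
columns and store the product as one more column, so that it is treated like
any other candidate regressor. The data frame name dat, the column name d_e
and the seed are illustrative assumptions, not part of the script in this post.

```r
# Hypothetical example: add a hand-built interaction column to a data frame
set.seed(1)  # arbitrary seed, only for reproducibility
dat <- data.frame(d = rnorm(100), e = rnorm(100), f = rnorm(100))

# The element-wise product of two regressors serves as their interaction term;
# the column name "d_e" is an illustrative choice
dat$d_e <- dat$d * dat$e

# The new column is now one more candidate regressor
names(dat)
```

Keep in mind that an all-subsets enumeration treats this column like any
other, so it will also produce models that contain the interaction without
its main effects.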

#simulate data
  #predictor variables
data=data.frame(d=rnorm(100),e=rnorm(100),f=rnorm(100))
  #error term
u=rnorm(100)
  #dependent variable
y=data$d-data$e+2*data$f+u

#create a present/absent list for the regressors
grits=list()
for(i in 1:length(data)){
  grits[[i]]=c(0,1)
  }

#expand the above list to a grid that contains all combinations of
#regressors
selection=expand.grid(grits)

#given the above grid, which regressor should I pick (get the indices for
#which variable(s) should be included)
one=function(x){which(x==1)}
selection.id=apply(selection,1,one)

#what are the names of the included variables
vnames=function(x){names(data)[x]}
var.names=lapply(selection.id,vnames)

#Dependent variable (unnecessary step if y is a vector or matrix anyway)
y=as.matrix(y)

#Select the data for each regression and store them in a list
select.data=function(x){as.matrix(data[,x],row.names=T)}
Xs=lapply(selection.id,select.data)

#get the column names for each element of Xs right (workaround)
#this is necessary because R does not get the column names right if there is
#only one column in the list element (data[,x] then returns a plain vector,
#so the column name is lost)
for(i in 1:length(Xs)){dimnames(Xs[[i]])=list(NULL,var.names[[i]])}

#remove the first element because it's empty (otherwise the regression
#function returns an error when it tries to run the first regression)
Xs[[1]]=NULL


#Define a function that regresses y on x and shows us the summary
regress=function(x){summary(lm(y~x))}

#Apply regress over all elements of Xs, i.e.,
#regress y on all possible subsets of regressors
lapply(Xs,regress)
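
For comparison, the same enumeration could also be sketched with formulas
instead of matrices, which avoids the column-name workaround. This is only a
sketch under assumptions: the names dat, predictors, subsets, formulas, fits
and comparison are made up for illustration, the data are simulated again with
an arbitrary seed, and AIC is swapped in as one possible statistic for lining
the models up (the script above prints full summaries instead).

```r
# Sketch: enumerate all non-empty subsets of regressors via formulas
set.seed(1)  # arbitrary seed, only for reproducibility
dat <- data.frame(d = rnorm(100), e = rnorm(100), f = rnorm(100))
dat$y <- dat$d - dat$e + 2 * dat$f + rnorm(100)

predictors <- setdiff(names(dat), "y")

# combn() lists all subsets of a given size; loop over sizes 1..k and
# flatten the result into one list of character vectors
subsets <- unlist(
  lapply(seq_along(predictors),
         function(m) combn(predictors, m, simplify = FALSE)),
  recursive = FALSE
)

# reformulate() builds a formula such as y ~ d + e from a character vector
formulas <- lapply(subsets, reformulate, response = "y")

# Fit every model and collect one comparison statistic per model
fits <- lapply(formulas, lm, data = dat)
comparison <- data.frame(
  model = sapply(formulas, function(f) paste(deparse(f), collapse = " ")),
  aic   = sapply(fits, AIC)
)

# Show the models ordered by AIC (smaller is better)
comparison[order(comparison$aic), ]
```

With k regressors this fits 2^k - 1 models (7 here), so the number of fits
grows quickly as regressors are added.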


HTH,
Daniel






