[R] simple question on glm

David Winsemius dwinsemius at comcast.net
Fri Jun 19 00:18:31 CEST 2009


On Jun 18, 2009, at 5:50 PM, Marc Schwartz wrote:

>
> On Jun 18, 2009, at 4:36 PM, Jack Luo wrote:
>
>> Hi,
>>
>> I am trying to use glm to fit my data and am wondering if there is an
>> easy way to fit a glm without typing all the explanatory variable
>> names. For example, if I have 100 explanatory variables x1, x2, ...,
>> x100 and the response variable is y, I don't want to do something like
>> glm1 <- glm(y ~ x1 + x2 + ... + x100, family = gaussian, data = dataA)
>> since it would be a lot of typing.
>>
>> Many thanks,
>>
>> -Jack
>
> If y and x1 through x100 are the only variables in dataA, you can use:
>
>  glm(y ~ ., data = dataA)
>
> The '.' in the formula stands for all columns of the data argument
> that are not already in the formula.
>
> See ?formula for more information.
>
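
A minimal sketch of the '.' shorthand, assuming a small made-up data
frame in place of dataA (three predictors rather than 100; the values
are random and purely illustrative). It also shows the documented
'. - term' form for leaving one column out of the expansion:

```r
# Hypothetical stand-in for dataA: response y plus three predictors.
set.seed(7)
dataA <- data.frame(y  = rnorm(25),
                    x1 = rnorm(25),
                    x2 = rnorm(25),
                    x3 = rnorm(25))

# '.' expands to every column of dataA that is not already in the formula.
fit_all <- glm(y ~ ., family = gaussian, data = dataA)
coef_all <- names(coef(fit_all))

# Subtracting a term removes it from the expansion.
fit_drop <- glm(y ~ . - x3, family = gaussian, data = dataA)
coef_drop <- names(coef(fit_drop))

print(coef_all)
print(coef_drop)
```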

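It's also possible to use an indexed matrix in the formula, so that if
you have an indexable object you can do regressions thusly (a data
frame subset on the right-hand side would first need as.matrix()
around it, since a formula variable cannot be a list):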

glm(datm[,10] ~ datm[,1:9])
# or using just part of that matrix
glm(datm[,10] ~ datm[ , 4:9])
# or non-adjacent selection of columns
glm(datm[,"Y"] ~ datm[,c(2:3,5:7)])

This puts the onus on you to remember which columns are which when
making up the glm call. You don't get the handy labeling in the
formula, but the coefficient labels in the output will include the
column names if they have been assigned. If the object is a matrix, as
this one was, then you cannot pass it as the data argument: data =
expects a data frame (or a list or environment), not a matrix.
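
A rough sketch of both points, assuming a made-up 20 x 10 numeric
matrix with column names X1..X9 and Y (random values, not the datm
used for the output in this message):

```r
# Hypothetical matrix: nine predictor columns and a response column "Y".
set.seed(1)
datm <- matrix(rnorm(200), nrow = 20, ncol = 10,
               dimnames = list(NULL, c(paste0("X", 1:9), "Y")))

# A matrix on the right-hand side enters the model as one matrix term.
fit <- glm(datm[, "Y"] ~ datm[, c(2:3, 5:7)])

# Coefficient labels are the indexing expression followed by the column name.
coef_names <- names(coef(fit))
print(coef_names)

# Passing the matrix itself as 'data' fails; try() captures the error.
res <- try(glm(Y ~ ., data = datm), silent = TRUE)
data_arg_failed <- inherits(res, "try-error")
print(data_arg_failed)
```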

 > glm(datm[,"Y"] ~ datm[,c(2:3,5:7)])

Call:  glm(formula = datm[, "Y"] ~ datm[, c(2:3, 5:7)])

Coefficients:
          (Intercept)  datm[, c(2:3, 5:7)]X2  datm[, c(2:3, 5:7)]X3  datm[, c(2:3, 5:7)]X5
                21.70                   0.00                   0.00                   0.00
datm[, c(2:3, 5:7)]X6  datm[, c(2:3, 5:7)]X7
                 0.00                   0.00

Degrees of Freedom: 9 Total (i.e. Null);  4 Residual
Null Deviance:	    0
Residual Deviance: 1.262e-28 	AIC: -623

You could wrap the subsetted matrix in as.data.frame(.):

 > glm(Y ~ ., data=as.data.frame(datm[ ,c(2:3,5:7,10)] ) )  # don't forget (as I first did) to include "Y"

Call:  glm(formula = Y ~ ., data = as.data.frame(datm[, c(2:3, 5:7, 10)]))

Coefficients:
(Intercept)           X2           X3           X5           X6           X7
      21.70         0.00         0.00         0.00         0.00         0.00

Degrees of Freedom: 9 Total (i.e. Null);  4 Residual
Null Deviance:	    0
Residual Deviance: 1.262e-28 	AIC: -623
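
As a quick check that the two routes fit the same model, here is a
sketch assuming a made-up 30 x 10 matrix with columns X1..X9 and Y
(random data; the name cols is just a convenience for this example):

```r
# Hypothetical matrix with named columns; "Y" is column 10.
set.seed(42)
datm <- matrix(rnorm(300), nrow = 30, ncol = 10,
               dimnames = list(NULL, c(paste0("X", 1:9), "Y")))
cols <- c(2:3, 5:7)

# Route 1: index the matrix directly inside the formula.
fit_mat <- glm(datm[, "Y"] ~ datm[, cols])

# Route 2: convert the subset (predictors plus Y) to a data frame and use '.'.
fit_df <- glm(Y ~ ., data = as.data.frame(datm[, c(cols, 10)]))

# Same estimates; only the coefficient labels differ.
same_fit <- all.equal(unname(coef(fit_mat)), unname(coef(fit_df)))
print(same_fit)
print(names(coef(fit_df)))
```

The data frame route gives the shorter labels and also lets you use
predict() with newdata later on, which is awkward with the
matrix-indexed formula.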

-- 

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



