[R] Handle lot of variables - Regression
Dieter Menne
dieter.menne at menne-biomed.de
Wed Oct 14 16:23:04 CEST 2009
anna0102 wrote:
>
> I've got a data set (e.g. named Data) which contains a lot of variables,
> for example: s1, s2, ..., s50
>
> My first question is:
> It is possible to do this: Data$s1
> But is it also possible to do something like this: Data$s1:s50 (I've tried
> a lot of versions of those without a
> result)
>
>
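Use the [] notation. For example
Data[,c("s1","s2","s3")]
or, even better, select by name pattern (anchored, so that only s1, s2, ...
match and not every name containing an "s"):
Data[,grep("^s[0-9]+$",names(Data),value=TRUE)]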
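A minimal sketch of both selection styles, assuming a made-up data frame (the columns y, s1..s3 and size are hypothetical and only there to show what the pattern does and does not match):

```r
# Hypothetical stand-in for the poster's Data
Data <- data.frame(y = rep(0:1, 5), s1 = 1:10, s2 = 10:1,
                   s3 = seq(0.1, 1, by = 0.1), size = 1:10)

# Anchored pattern: picks s1, s2, s3 but leaves "size" alone
svars <- grep("^s[0-9]+$", names(Data), value = TRUE)
sub1 <- Data[, svars]

# Same columns, built from the known numbering instead of a pattern
sub2 <- Data[, paste0("s", 1:3)]
```

The paste0() form is handy when the numbering is known in advance; the grep() form adapts when the number of s-columns changes between runs of a loop.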
anna0102 wrote:
>
> I want to do a stepwise logistic regression. For this purpose I use the
> following procedures:
> result<-glm(...)
> step(result, direction="forward")
>
> Now the problem I have, is, that I have to include all my 50 variables
> (s1-s50), but I don't want to write them all down like y~s1+s2+s3+s4...
> (furthermore it has to be implemented in a loop, so I really need it).
>
Construct the formula dynamically. But please, start with only 3 or 4
variables and check that it works. Sometimes things go wrong with this
method deep inside functions, requiring Ripley's game-like workarounds. See
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/16599.html
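a=data.frame(z=rep(0:1,5),s1=1:10,s2=1:10,s4=1:10)
# collapse the predictor names first, then prepend the response
form = as.formula(paste("z~",paste(grep("^s[0-9]+$",names(a),value=TRUE),collapse="+")))
glm(form,data=a,....)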
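As a rough sketch of how the dynamically built formula could feed forward selection: the data below are simulated, and the names z and s1..s4 are assumptions for illustration only. reformulate() is used here in place of paste(); it builds the same kind of formula from a character vector.

```r
set.seed(1)
# Simulated data: binary response z and predictors s1..s4 (all hypothetical)
a <- data.frame(z = rbinom(100, 1, 0.5),
                s1 = rnorm(100), s2 = rnorm(100),
                s3 = rnorm(100), s4 = rnorm(100))

vars <- grep("^s[0-9]+$", names(a), value = TRUE)

# Formula with every s-variable on the right-hand side
full <- reformulate(vars, response = "z")

# Forward selection starts from the intercept-only model;
# the upper scope is the formula built above
null_fit <- glm(z ~ 1, family = binomial, data = a)
sel <- step(null_fit, scope = list(lower = ~ 1, upper = full),
            direction = "forward", trace = 0)
```

Note that step(..., direction="forward") on a model that already contains all terms has nothing to add, which is why the starting model here is the intercept-only fit and the candidates are passed through scope.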
And be aware of the nonsense you can (read: certainly will) get with
stepwise regression and so many parameters. If I were to be treated with a
cure created by stepwise regression, I would prefer voodoo.
Search for "Harrell stepwise" and read Frank's well-justified soapboxes.
Dieter