[R] ols function in rms package
Frank E Harrell Jr
f.harrell at Vanderbilt.Edu
Mon Jun 7 15:22:37 CEST 2010
On 06/06/2010 10:49 PM, Mark Seeto wrote:
> Hello,
>
> I have a couple of questions about the ols function in Frank Harrell's rms
> package.
>
> Is there any way to specify variables by their column number in the data
> frame rather than by the variable name?
>
> For example,
>
> library(rms)
> x1<- rnorm(100, 0, 1)
> x2<- rnorm(100, 0, 1)
> x3<- rnorm(100, 0, 1)
> y<- x2 + x3 + rnorm(100, 0, 5)
> d<- data.frame(x1, x2, x3, y)
> rm(x1, x2, x3, y)
> lm(y ~ d[,2] + d[,3], data = d) # This works
> ols(y ~ d[,2] + d[,3], data = d) # Gives error
> Error in if (!length(fname) || !any(fname == zname)) { :
> missing value where TRUE/FALSE needed
>
> However, this works:
> ols(y ~ x2 + d[,3], data = d)
>
> The reason I want to do this is to program variable selection for
> bootstrap model validation.
>
> A related question: does ols allow "y ~ ." notation?
>
> lm(y ~ ., data = d[, 2:4]) # This works
> ols(y ~ ., data = d[, 2:4]) # Gives error
> Error in terms.formula(formula) : '.' in formula and no 'data' argument
>
> Thanks for any help you can give.
>
> Regards,
> Mark
Hi Mark,
It appears that you have answered the questions yourself. rms wants real
variables, or transformations of them, in the formula, because it makes
certain assumptions about the names of terms. The y ~ . notation should
work, though; I'll have a look at that sometime.
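One way to keep selecting columns by position while still giving ols real variable names is to translate the positions into names and build the formula from them. The following is only a sketch: the data are simulated as in the question, and set.seed() and the object names cols, f, f_all and fit are illustrative assumptions. reformulate() is base R (stats). Passing a stored formula object to ols is assumed to behave the same as typing the formula inline.

```r
library(rms)

set.seed(1)  # only so that the simulated data are reproducible
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))
d$y <- d$x2 + d$x3 + rnorm(100, 0, 5)

# Column positions of the predictors to use
cols <- c(2, 3)

# Turn the positions into names and build y ~ x2 + x3
f <- reformulate(names(d)[cols], response = "y")

# ols now sees ordinary variable names rather than d[, 2] and d[, 3]
fit <- ols(f, data = d)

# The same idea stands in for y ~ . : use every column except the response
f_all <- reformulate(setdiff(names(d), "y"), response = "y")
```

Because the formula is built from names(d), the same code can be reused inside a loop that varies cols.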
But these are small questions compared with what you really want to do.
Why do you need variable selection, i.e., what is wrong with having
insignificant variables in a model? If you do need variable selection,
see whether backwards stepdown works for you. It is built into the rms
bootstrap validation and calibration functions.
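As a rough illustration of backwards stepdown inside the bootstrap, the sketch below uses fastbw, validate and calibrate from rms. The simulated data (with less noise than in the question, so that some predictors survive the stepdown), the AIC stopping rule, and B = 200 are assumptions chosen only for the example.

```r
library(rms)

set.seed(1)  # only so that the simulated data are reproducible
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))
d$y <- d$x2 + d$x3 + rnorm(100)  # hypothetical data, moderate noise

# x = TRUE, y = TRUE store the design matrix and response in the fit,
# which validate() and calibrate() need in order to refit on resamples
fit <- ols(y ~ x1 + x2 + x3, data = d, x = TRUE, y = TRUE)

# Backward stepdown on the full model, stopping by AIC
fastbw(fit, rule = "aic")

# Bootstrap validation that repeats the stepdown in every resample
val <- validate(fit, B = 200, bw = TRUE, rule = "aic")
print(val)

# Bootstrap calibration with the same stepdown repeated in each resample
cal <- calibrate(fit, B = 200, bw = TRUE, rule = "aic")
```

With bw = TRUE the selection step is part of what gets validated, so the reported optimism reflects the selection as well as the final fit.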
Frank
--
Frank E Harrell Jr Professor and Chairman School of Medicine
Department of Biostatistics Vanderbilt University