[R] using a variable for a column name in a formula
arun
smartpink111 at yahoo.com
Sun Oct 13 23:53:18 CEST 2013
Hi,
Maybe:
set.seed(24)
X <- data.frame(weight=sample(100:250,20,replace=TRUE),height=sample(140:190,20,replace=TRUE))
Others <- colnames(X)[!colnames(X)%in%"height"]
nnn <- "height"
res <- lm(formula(paste(nnn,"~",paste(Others, collapse="+"))),data=X) #collapse (not sep) joins several predictors into one string
res1<- lm(height~.,data=X)
#or
res2<- lm(get(nnn)~get(Others),data=X) #works only when Others names a single column; the coefficient is labelled get(Others) and needs renaming
identical(coef(summary(res)),coef(summary(res1)))
#[1] TRUE
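Another option, offered only as a sketch: base R's reformulate() builds the formula directly from character vectors, so no hand-pasting is needed and several predictors are handled the same way as one. The third column `age` below is hypothetical, added just to show the multi-predictor case, and the names res3/res4 are made up for this example.

```r
set.seed(24)
X <- data.frame(weight = sample(100:250, 20, replace = TRUE),
                height = sample(140:190, 20, replace = TRUE),
                age    = sample(20:60, 20, replace = TRUE))  # hypothetical extra column

nnn <- "height"
Others <- setdiff(colnames(X), nnn)        # every column except the response

f <- reformulate(Others, response = nnn)   # height ~ weight + age
res3 <- lm(f, data = X)
res4 <- lm(height ~ ., data = X)

all.equal(coef(res3), coef(res4))          # compare the two fits
```

This assumes the column names are syntactically valid R names.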
A.K.
On Sunday, October 13, 2013 5:06 PM, David Epstein <David.Epstein at warwick.ac.uk> wrote:
lm(height ~ ., data=X)
works fine.
However
nnn <- "height" ; lm(nnn ~ . ,data=X)
fails
How do I write such a formula, which depends on the value of a string variable like nnn above?
A typical application might be a program that takes a data frame containing only numerical data, and figures out which of the columns can be best predicted from all the other columns.
Thanks
David
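For the application described in the question, a rough sketch could loop over the columns, regress each one on all the others, and compare the fits. Using R-squared as the measure of "best predicted" is an assumption here, as are the helper name r2_by_column and the extra `age` column; none of these come from the thread.

```r
# Hypothetical helper: R-squared from regressing each column on all the others.
r2_by_column <- function(df) {
  sapply(colnames(df), function(resp) {
    f <- reformulate(setdiff(colnames(df), resp), response = resp)
    summary(lm(f, data = df))$r.squared
  })
}

set.seed(24)
X <- data.frame(weight = sample(100:250, 20, replace = TRUE),
                height = sample(140:190, 20, replace = TRUE),
                age    = sample(20:60, 20, replace = TRUE))  # hypothetical extra column

r2 <- r2_by_column(X)
r2                      # one R-squared per candidate response
names(which.max(r2))    # column with the highest R-squared
```

With only two columns the two R-squared values coincide, so the comparison is only informative for three or more columns.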
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.