[R] wilcox.test loop through variable names
Jacob Wegelin
jacobwegelin at fastmail.fm
Sun Nov 15 20:33:09 CET 2009
Often I perform the same task on a series of variables in a dataframe,
by looping through a character vector that holds the names and using
paste(), eval(), and parse() inside the loop.
For instance:
library(lattice)                      # provides the 'environmental' data set
data(environmental)                   # reload a fresh copy, discarding earlier changes
thesevars <- names(environmental)
environmental$ToyReal   <- rnorm(nrow(environmental))
environmental$ToyDichot <- environmental$ToyReal < 0.53
tableOfResults <- data.frame(var = thesevars)
tableOfResults$p_wilcox <- NA
tableOfResults$Beta_lm  <- NA
rownames(tableOfResults) <- thesevars
for (thisvar in thesevars) {
  # Build the wilcox.test() call as text, then parse and evaluate it
  thiscommand <- paste("thiswilcox <- wilcox.test(", thisvar, "~ ToyDichot, data = environmental)")
  eval(parse(text = thiscommand))
  tableOfResults[thisvar, "p_wilcox"] <- thiswilcox$p.value
  # lm() accepts a data frame; its first column is taken as the response
  thislm <- lm(environmental[c("ToyReal", thisvar)])
  tableOfResults[thisvar, "Beta_lm"] <- coef(thislm)[thisvar]
}
print(tableOfResults)
Of course, the loop above is a toy example. In real life I might first figure out whether the variable is
continuous, dichotomous, or categorical taking on several values, then perform an operation depending on
its type.
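The type check described above might look roughly like the following sketch. The helper name classify_var and the toy data frame are hypothetical, and the rule used here (exactly two distinct non-missing values means dichotomous) is only one possible convention.

```r
# Hypothetical helper: label a column as dichotomous, categorical, or continuous.
classify_var <- function(x) {
  n_distinct <- length(unique(x[!is.na(x)]))
  if (n_distinct == 2) {
    "dichotomous"
  } else if (is.factor(x) || is.character(x) || is.logical(x)) {
    "categorical"
  } else if (is.numeric(x)) {
    "continuous"
  } else {
    "other"
  }
}

# Toy data, invented purely for illustration
toy <- data.frame(
  real   = c(0.1, 1.5, -0.3, 2.2, 0.9, -1.1),
  dichot = c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE),
  group  = factor(c("a", "b", "c", "a", "b", "c"))
)

# One label per column, named by column
var_types <- vapply(toy, classify_var, character(1))
print(var_types)
```

Inside the loop, one could then branch on var_types[thisvar], for example with switch(), to choose the operation for each variable.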
The use of paste(), eval(), and parse() seems awkward. As Gabor Grothendieck showed
(http://tolstoy.newcastle.edu.au/R/e8/help/09/11/4520.html), if we
are calling a regression function such as lm() we can avoid paste()
by passing a data frame whose first column is the response, as in the
lm() call in the loop above.
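As a small check of that idea, the sketch below compares the data-frame form of the lm() call with the usual formula form. The data frame toy and its column names are made up for illustration; the point is only that both calls should give the same coefficients.

```r
set.seed(1)
# Simulated data, invented for illustration
toy <- data.frame(y = rnorm(20), x1 = rnorm(20), x2 = rnorm(20))

# Data-frame form: the first selected column is the response, the rest are predictors
fit_df <- lm(toy[c("y", "x1")])

# Equivalent explicit formula
fit_formula <- lm(y ~ x1, data = toy)

same_coefs <- isTRUE(all.equal(coef(fit_df), coef(fit_formula)))
print(same_coefs)
```

Because the variable name enters only through the column selection c("y", "x1"), it can come from a character vector without any text being pasted into a command.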
But is there a way to avoid paste() and eval() when one uses t.test()
or wilcox.test()?
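One possibility, offered as a sketch rather than a definitive answer, is to build the formula object itself with reformulate() from the stats package and pass it to wilcox.test(). This assumes the lattice package is installed (it supplies the environmental data); the exact = FALSE argument and the seed are my own choices for the illustration.

```r
# Load a fresh copy of the data from lattice
data(environmental, package = "lattice")
set.seed(42)

thesevars <- names(environmental)
environmental$ToyReal   <- rnorm(nrow(environmental))
environmental$ToyDichot <- environmental$ToyReal < 0.53

# For each variable name, build the formula <thisvar> ~ ToyDichot directly
p_wilcox <- vapply(thesevars, function(thisvar) {
  f <- reformulate("ToyDichot", response = thisvar)
  # exact = FALSE requests the normal approximation throughout
  wilcox.test(f, data = environmental, exact = FALSE)$p.value
}, numeric(1))

print(p_wilcox)
```

The same formula object can be passed to t.test(f, data = environmental). Another route that avoids formulas entirely is the default method, e.g. wilcox.test(x[g], x[!g]) with x <- environmental[[thisvar]] and g <- environmental$ToyDichot.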
Thanks
Jacob A. Wegelin
Department of Biostatistics
Virginia Commonwealth University
Richmond VA 23298-0032
U.S.A.