[R] wilcox.test loop through variable names

Jacob Wegelin jacobwegelin at fastmail.fm
Mon Nov 16 04:40:06 CET 2009


On Sun, 15 Nov 2009 14:33 -0500, "Jacob Wegelin"
<jacobwegelin at fastmail.fm> wrote:
> 
> Often I perform the same task on a series of variables in a dataframe,
> by looping through a character vector that holds the names and using
> paste(), eval(), and parse() inside the loop.
> 
> For instance:
> 
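> # assumption: environmental is the data set shipped with the lattice package
> data(environmental, package = "lattice")   # reload a fresh, unmodified copy
> thesevars <- names(environmental)
> environmental$ToyReal <- rnorm(nrow(environmental))
> environmental$ToyDichot <- environmental$ToyReal < 0.53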
> 
> tableOfResults<-data.frame(var=thesevars)
> 
> tableOfResults$p_wilcox <- NA
> 
> tableOfResults$Beta_lm <- NA
> 
> rownames(tableOfResults)<-thesevars
> 
> for (thisvar in thesevars) {
>    thiscommand <- paste("thiswilcox <- wilcox.test(", thisvar, "~ ToyDichot, data = environmental)")
>    eval(parse(text = thiscommand))
>    tableOfResults[thisvar, "p_wilcox"] <- thiswilcox$p.value
>    thislm <- lm(environmental[c("ToyReal", thisvar)])
>    tableOfResults[thisvar, "Beta_lm"] <- coef(thislm)[thisvar]
> }
> 
> print(tableOfResults)
> 
> Of course, the loop above is a toy example. In real life I might first
> figure out whether the variable is continuous, dichotomous, or
> categorical taking on several values, then perform an operation
> depending on its type.
> 
> The use of paste(), eval(), and parse() seems awkward. As Gabor
> Grothendieck showed
> (http://tolstoy.newcastle.edu.au/R/e8/help/09/11/4520.html), if we are
> calling a regression function such as lm() we can avoid using paste(),
> as shown above.
> 
> But is there a way to avoid paste() and eval() when one uses t.test()
> or wilcox.test()?

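Here is a solution that avoids paste(), eval(), and parse(). lapply()
passes each column to an anonymous function as a plain vector, so the
column can be used directly in the formula: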

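# assumption: environmental is the data set shipped with the lattice package
data(environmental, package = "lattice")   # reload a fresh, unmodified copy
thesevars <- names(environmental)
environmental$ToyReal <- rnorm(nrow(environmental))
environmental$ToyDichot <- environmental$ToyReal < 0.53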

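ThisList <- lapply(environmental[thesevars], function(OneVar) {
   c(
      # two-sample Wilcoxon test of this column, grouped by ToyDichot
      p_wilcox = wilcox.test(OneVar ~ environmental$ToyDichot)$p.value,
      # slope from regressing ToyReal on this column
      Beta_lm = as.numeric(coef(lm(environmental$ToyReal ~ OneVar))["OneVar"])
   )
})

# stack the per-variable results into a matrix, one row per variable
do.call("rbind", ThisList)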
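
If one prefers to keep looping over names and passing data= to the
test, the formula itself can be built from strings, which also avoids
eval() and parse(). The following is only a minimal sketch. It assumes
that environmental is the data set from the lattice package; the seed
and the exact = FALSE argument are my own additions for illustration
(the latter requests the normal approximation, so ties do not produce
warnings).

```r
# Assumption: environmental is the data set shipped with the lattice package.
data(environmental, package = "lattice")
set.seed(1)   # arbitrary seed, only to make the toy columns reproducible
thesevars <- names(environmental)
environmental$ToyReal <- rnorm(nrow(environmental))
environmental$ToyDichot <- environmental$ToyReal < 0.53

p_wilcox <- sapply(thesevars, function(thisvar) {
   # reformulate() builds the formula  <thisvar> ~ ToyDichot  from strings
   f <- reformulate("ToyDichot", response = thisvar)
   wilcox.test(f, data = environmental, exact = FALSE)$p.value
})

# named numeric vector, one p-value per variable in thesevars
print(p_wilcox)
```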

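The same pattern extends to the case mentioned in the quoted message,
where the operation depends on the type of the variable. Below is a
rough sketch, not a recommendation of particular tests. The helpers
var_type() and p_by_type(), the toy data frame, and the choice of
fisher.test() for non-continuous columns are all hypothetical and only
illustrate the dispatch. Note that a numeric column with only a few
distinct values would still be classified as continuous here.

```r
# Hypothetical helper: classify a column as dichotomous, categorical,
# or continuous.
var_type <- function(x) {
   n_distinct <- length(unique(x[!is.na(x)]))
   if (n_distinct == 2) {
      "dichotomous"
   } else if (is.factor(x) || is.character(x)) {
      "categorical"
   } else {
      "continuous"
   }
}

# Hypothetical dispatcher: p-value for the association between column x
# and a two-level grouping vector, using a test chosen by column type.
p_by_type <- function(x, group) {
   switch(var_type(x),
      continuous  = wilcox.test(x ~ group, exact = FALSE)$p.value,
      dichotomous = fisher.test(table(x, group))$p.value,
      categorical = fisher.test(table(x, group))$p.value
   )
}

# Made-up toy data with one column of each type
set.seed(1)
toy <- data.frame(
   cont  = rnorm(40),
   dich  = rep(c(TRUE, FALSE), 20),
   categ = factor(rep(c("a", "b", "c", "d"), 10))
)
group <- toy$cont < 0.53

types <- sapply(toy, var_type)
pvals <- sapply(toy, p_by_type, group = group)
print(data.frame(type = types, p = pvals))
```

Because each column arrives in the function as a vector, no command
strings need to be pasted together, whichever branch is taken.
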
Jacob A. Wegelin 
Department of Biostatistics 
Virginia Commonwealth University 
Richmond VA 23298-0032 
U.S.A.



