[R] Conditional Loop For Data Frame Columns

David Winsemius dwinsemius at comcast.net
Mon Jan 9 02:43:26 CET 2012


On Jan 8, 2012, at 4:48 PM, jawbonemurphy wrote:

> Hi,
>
> I am trying to create a script that will evaluate each column of a data
> frame, regardless of # columns, using some function and sorting the results
> by an index vector:

?lapply
?"["
?order
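
A minimal sketch of how those three help pages fit together. The data frame 'toy' and its column names are made up for illustration; only the layout (ID in column 2, values from column 3 onward) follows your description.

```r
# Hypothetical stand-in for the poster's data: column 2 holds the ID
toy <- data.frame(row = 1:6,
                  ID  = c("b", "a", "b", "a", "c", "c"),
                  q1  = c(1, 2, 3, 4, 5, 6),
                  q2  = c(6, 5, 4, 3, 2, 1))

# "[" drops the first two columns; lapply() then visits each remaining column
col_means <- lapply(toy[ , -(1:2)], mean)

# order() returns the row permutation that sorts by the ID column
sorted <- toy[order(toy$ID), ]
```

Because lapply() works on whatever columns "[" hands it, nothing here depends on how many columns the data frame has.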

> #upload data (112 rows x 73 columns)
> SD <- read.csv("/Users/johnjacob/Desktop/StudentsData_RInput.csv",
> header=TRUE)
>
> #assign index vector
> ID <- SD[ ,2]
>
> #write indexed mean function
> meanfun <- function(x) {
> for(i in 3:ncol(x)) {
>  meanSD <- tapply(x[,i], ID, FUN=mean)}

Aren't you worried about overwriting meanSD? Each pass through the loop
replaces the previous value, so this would appear to leave meanSD with
only the result from the last column.

> return(meanSD)
> }
>

What are you expecting to get back? With a single grouping vector,
'tapply' returns a one-dimensional array (one value per group), not a
data frame.
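
One possible way to keep a result for every column rather than only the last, sketched on made-up data. 'toy', 'meanfun2' and its arguments are hypothetical names, and the defaults assume the ID sits in column 2 as in your script.

```r
# Hypothetical toy data; column 2 plays the role of the poster's ID column
toy <- data.frame(row = 1:6,
                  ID  = c("b", "a", "b", "a", "c", "c"),
                  q1  = c(1, 2, 3, 4, 5, 6),
                  q2  = c(6, 5, 4, 3, 2, 1))

# Return one set of grouped means per column instead of overwriting a
# single object on every pass through a for loop
meanfun2 <- function(x, id = x[[2]], cols = 3:ncol(x)) {
  sapply(x[cols], function(col) tapply(col, id, FUN = mean))
}

res <- meanfun2(toy)
# res is a matrix: one row per ID, one column per evaluated column
```

Passing the ID as an argument also avoids relying on a global 'ID' object that happens to exist in the workspace.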

> #apply function to data
> meanfun(SD)
>
> What I get is one set of indexed means:
>
> 7605   Andrea    Billy   ERR006    FJM13
> 2.111111 1.400000 1.888889 3.692308 3.750000
>   Gayan  Jschaef  Whitney
> 1.300000 2.285714 2.000000
>
> ...and what I would like to generate is a set of indexed means

By "indexed" do you mean grouped? Perhaps you should be looking at ?aggregate.

> for each
> column in the data set.
> Any guidance would be much appreciated!
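
If grouped means for every column are what you are after, something along these lines with aggregate might do it. This is only a sketch on invented data ('toy' and its columns are hypothetical), again assuming the ID is in column 2.

```r
# Hypothetical toy data; column 2 plays the role of the poster's ID column
toy <- data.frame(row = 1:6,
                  ID  = c("b", "a", "b", "a", "c", "c"),
                  q1  = c(1, 2, 3, 4, 5, 6),
                  q2  = c(6, 5, 4, 3, 2, 1))

# Grouped means for every column except the first two (row number and ID)
agg <- aggregate(toy[ , -(1:2)], by = list(ID = toy$ID), FUN = mean)
```

The result is a data frame with one row per ID and one column of means per original column, already sorted by ID.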


>

David Winsemius, MD
West Hartford, CT


