[R] Basic question about re-writing for loop as a function

Chris Beeley chris.beeley at gmail.com
Mon Aug 29 15:55:31 CEST 2011


Hello-

Sorry to ask a basic question, but I've spent many hours on this now
and seem to be missing something.

I have a loop that looks like this:

    mainmat=data.frame(matrix(data=0, ncol=92, nrow=length(predata$Words_MH)))

    for(i in 1:length(predata$Words_MH)){
      for(j in 1:92){
        mainmat[i,j]=ifelse(j %in%
          as.numeric(unlist(strsplit(predata$Words_MH[i], split=","))), 1, 0)
      }
    }

It creates a matrix with 92 columns (the number of different codes).
Then, for every row of my data, it checks whether each code (code 1,
code 2, etc.) is in the string and, if it is, puts a 1 in the relevant
column (column 1 for code 1, column 2 for code 2, etc.); otherwise the
cell stays 0.

There are 1000 rows in the database, and I have to run several
versions of this code, so it just takes way too long. I have been
trying to rewrite it using lapply. I tried this:

    myfunction=function(x, y) ifelse(x %in%
      as.numeric(unlist(strsplit(predata$Words_MH[y], split=","))), 1, 0)

    for(j in 1:92){
      mainmat[,j]= lapply(predata$Words, myfunction)
    }

but I don't think lapply can call a function that takes two inputs,
and I can't see how to get rid of either of them.
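
On the two-input point: lapply and sapply pass any extra named
arguments straight through to the function, so one input can be held
fixed while the other varies. A minimal sketch, assuming the ten
example values stand in for predata$Words; the names words, has_code
and col2 are made up for illustration and are not from the original
code:

```r
# Illustrative data only: the ten example values of predata$Words
words <- c("1", "1", "1", "1", "2,3,4", "5", "1", "1", "6", "7,8,9,10")

# Hypothetical helper: does the comma-separated string `x` contain `code`?
# Returns 1 if it does, 0 otherwise.
has_code <- function(x, code) {
  as.numeric(code %in% as.numeric(strsplit(x, split = ",")[[1]]))
}

# sapply() hands each element of `words` to has_code() as `x`;
# `code = 2` is passed through unchanged on every call.
col2 <- sapply(words, has_code, code = 2, USE.NAMES = FALSE)
```

Here col2 would be the column for code 2; looping the fixed argument
over 1:92 would give the remaining columns, though that still splits
every string once per code.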

Here's a dput of the first 10 rows of the variable in case that's helpful:

    predata$Words=c("1", "1", "1", "1", "2,3,4", "5", "1", "1", "6", "7,8,9,10")

Given these data, I want the function to return, for the first column,
1, 1, 1, 1, 0, 0, 1, 1, 0, 0 (because those are the values of Words
which contain a 1) and for the second column return 0, 0, 0, 0, 1, 0,
0, 0, 0, 0 (because the fifth value is the only one that contains a
2).
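
One possible way to get that result without the double loop is to
split each string once and then test all the codes for a row in a
single %in% call. This is only a sketch under assumptions: it uses the
ten example values rather than the real data, n_codes is set to 10
here instead of 92, and the names words, n_codes, code_list and
indicator are invented for illustration:

```r
# Illustrative data only: the ten example values of predata$Words
words <- c("1", "1", "1", "1", "2,3,4", "5", "1", "1", "6", "7,8,9,10")
n_codes <- 10  # assumption for this example; 92 in the real data

# Split every string once, rather than once per cell
code_list <- lapply(strsplit(words, split = ","), as.numeric)

# For each row, test all codes at once. vapply() returns one column
# per row of data (n_codes x n_rows), so transpose to rows x codes.
indicator <- t(vapply(code_list,
                      function(codes) as.numeric(seq_len(n_codes) %in% codes),
                      numeric(n_codes)))

indicator[, 1]  # 1 1 1 1 0 0 1 1 0 0
indicator[, 2]  # 0 0 0 0 1 0 0 0 0 0
```

The result is a plain numeric matrix; wrapping it in data.frame()
would give the same shape as mainmat if a data frame is needed.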

Any suggestions gratefully received!

Chris Beeley
Institute of Mental Health, UK


