[R] How can I get this function to work?

Paul Miller pjmiller_57 at yahoo.com
Fri Jun 1 20:51:44 CEST 2012


Hello Bert and Sarah,

Thank you for your replies. They helped me understand how people might perceive my question and why they might not respond.

I spent some time this morning learning about R's debugging tools and began to realize why my function didn't work. My second argument was the name of a variable. What I hadn't realized is that R would evaluate that argument and expect it to be a previously defined object. I had thought that passing the name of the variable into the body of the function would generate a correct line of code, and that this was all that was required to get the function to work.
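To illustrate the point about argument evaluation, here is a minimal sketch. The data frame `notes`, the column `note_text`, and the helper `countWords` are hypothetical names made up for this example, not part of my real data. Passing the column name as a character string works, while passing it as a bare name makes R look for an object of that name.

```r
# Hypothetical data frame, for illustration only
notes <- data.frame(id = 1:2,
                    note_text = c("kras wild type", "no result"),
                    stringsAsFactors = FALSE)

# Hypothetical helper: `column` is expected to be a character string
countWords <- function(df, column) {
  lengths(strsplit(df[[column]], " "))
}

# Quoted column name: "note_text" is an ordinary value, so this works
wordCounts <- countWords(notes, "note_text")

# Bare column name: R tries to evaluate `note_text` as an object when the
# argument is first used, and no such object exists outside the data frame
bareResult <- tryCatch(countWords(notes, note_text),
                       error = function(e) "error")
```

This is why the working function below takes the column name as a quoted string and indexes the data frame with it.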

Below is a function that does work, at least when applied to a single row of data. I had previously been reading about the Split-Apply-Combine strategy in a paper about the plyr package. The paper advocates coming up with a function that works for a subset of one's data and then using plyr to split up the data and apply the function to each of the subsets. I was under the impression that this last part would be easy. That seems not to be the case, though.
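For reference, the split-apply-combine pattern can be sketched in base R as follows. This is a stand-in that uses `split`, `lapply`, and `rbind` rather than plyr itself, and the data frame `notes` and the per-row function `flagTerm` are hypothetical.

```r
# Hypothetical two-row data frame standing in for the real notes
notes <- data.frame(profile_key = c("001-001", "001-002"),
                    raw = c("kras mutation test ordered",
                            "follow up in two weeks"),
                    stringsAsFactors = FALSE)

# Hypothetical per-row function: takes a one-row data frame and
# returns a one-row data frame with an extra column
flagTerm <- function(row, target) {
  row$has_target <- grepl(target, row$raw)
  row
}

pieces    <- split(notes, seq_len(nrow(notes)))          # split into rows
processed <- lapply(pieces, flagTerm, target = "kras")   # apply to each row
combined  <- do.call(rbind, processed)                   # combine the results
```

The same shape should carry over to any function that accepts one row and returns one row with the same columns.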

So on to the next part.

Thanks again for your feedback. 

Paul


#### Test row ####

testRow <-
structure(list(profile_key = structure(6L, .Label = c("001-001 ", 
"001-002 ", "001-003 ", "001-004 ", "001-005 ", "001-006 ", "001-007 "
), class = "factor"), encounter_date = structure(4L, .Label = c(" 2009-03-01 ", 
" 2009-03-22 ", " 2009-04-01 ", " 2010-03-01 ", " 2010-04-01 ", 
" 2010-10-15 ", " 2010-11-15 ", " 2011-03-01 ", " 2011-03-14 ", 
" 2011-04-01 ", " 2011-10-10 ", " 2011-10-24 ", " 2012-09-15 ", 
" 2012-10-05 ", " 2012-10-17 "), class = "factor"), raw = " if patient kras result is wild type they will start erbitux several lines of material ordered kras mutation test 11112011 results are still not available "), .Names = c("profile_key", 
"encounter_date", "raw"), row.names = 13L, class = "data.frame")

testRow

#### Function for selecting words within specified range of a target term ####

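nearTerms <- function(df, rawtext, target, before, after, reduced){
   # Split the text column (named by the string in rawtext) into words
   Text <- unlist(strsplit(as.character(df[,rawtext]), " "))
   # Positions of the words that match the target term
   Target <- grep(target, Text)

   if (length(Target) == 0) {df <- transform(df, outtext = "")} else{

   Length <- length(Text)
   Keep <- rep(NA, Length)
   # Window bounds around each match, clamped to the ends of the text
   Lower <- ifelse(Target - before > 0, Target - before, 1)
   Upper <- ifelse(Target + after < Length, Target + after, Length)

   # Mark every word position that falls inside at least one window
   for(i in 1:length(Keep)){
   for(j in 1:length(Lower)){
      Keep[i][i %in% seq(Lower[j], Upper[j])] <- i
   }}

   df <- transform(df, outtext = paste(Text[!is.na(Keep)], collapse=" "))

   }

   # Rename the new column to the name given in reduced, then return df
   names(df)[names(df) == "outtext"] <- reduced
   df
}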

testRow <- nearTerms(df = testRow, rawtext = "raw", target = "kras", before = 6, after = 6, reduced = "reduced")
testRow
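The window selection inside nearTerms could probably also be written without the nested loops. The sketch below is an alternative under that assumption, using a made-up word vector rather than the test row. `pmax` and `pmin` clamp the window bounds, and `Map` builds one index range per match.

```r
# Hypothetical word vector, for illustration only
words  <- c("a", "b", "kras", "c", "d", "e", "f", "kras", "g")
before <- 1
after  <- 1

hits  <- grep("kras", words)               # positions of the target term
lower <- pmax(hits - before, 1)            # clamp at the first word
upper <- pmin(hits + after, length(words)) # clamp at the last word

# One index range per match, merged into a sorted set of positions to keep
keep <- sort(unique(unlist(Map(seq, lower, upper))))
reducedText <- paste(words[keep], collapse = " ")
reducedText
```

Here the two matches sit at positions 3 and 8, so the kept positions are 2 to 4 and 7 to 9.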


