[R] substring comparison

Claus O'Rourke claus.orourke at gmail.com
Thu Apr 29 19:17:52 CEST 2010


Hi all,

I'm writing a script to do some basic text analysis in R. Let's assume
I have a data frame named data which contains a column named 'Utt'
which contains strings. Is there a straightforward way to achieve
something like this:

data$ContainsThe <- ifelse(startsWith(data$Utt,"the"),"y","n")

or

data$ContainsThe <- ifelse(contains(data$Utt,"the"),"y","n")
?

I tried using grep
data$ContainsThe <- ifelse(grep("the",data$Utt),"y","n")

but this doesn't work because grep only returns the indices of the
rows for which the match succeeded, not one value per row.
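
To make the mismatch concrete, here is a minimal sketch on a made-up
three-row data frame (the example strings are assumptions, not real
data). It shows what grep returns, and then one possible way to get a
per-row result using base R's grepl and substr; this is only a sketch
of one approach, not necessarily the best one.

```r
# Made-up example data; the column name Utt follows the code above.
data <- data.frame(Utt = c("the cat sat", "a dog ran", "over the hill"),
                   stringsAsFactors = FALSE)

# grep() returns the indices of the matching rows (here 1 and 3),
# so its result is shorter than the column and cannot feed ifelse() per row.
idx <- grep("the", data$Utt)

# grepl() returns a logical vector with one entry per row instead.
# fixed = TRUE treats "the" as a literal string rather than a regex.
data$ContainsThe <- ifelse(grepl("the", data$Utt, fixed = TRUE), "y", "n")

# Prefix test: compare the first three characters of each string.
data$StartsWithThe <- ifelse(substr(data$Utt, 1, 3) == "the", "y", "n")
```

Note that a plain substring match on "the" would also match words such
as "other"; a regex with word boundaries would be needed to rule that
out.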

Thanks for any pointers

Claus


