[R] How do I parse text?

S Ellison S.Ellison at LGCGroup.com
Tue Sep 10 11:18:44 CEST 2013


> I have a data frame with a character field of the form "ACUTE 
> URI NOS", "OPEN WOUND OF FOREHEAD", "CROUP", "STREP SORE THROAT", ....
> 
> How can I get counts of all the words and their 
> co-occurrences?  I've spent a long time searching on Google, 
> but it just takes me on a wild goose chase of dozens of 
> modules involving advanced natural language processing 
> theory.  All I want is word counts and co-occurrences.

For the word counts, perhaps a combination of strsplit(), unlist() and table() would do the job?

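Example:

sometext <- c("ACUTE URI NOS", "OPEN WOUND OF FOREHEAD", "CROUP", "STREP SORE THROAT", "ACUTE STREP SORE THROAT")

# Split each entry on single spaces; the result is a list of character vectors
st <- strsplit(sometext, " ")

# Flatten the list into one vector of words and tabulate the frequencies
table(unlist(st))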
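
For the co-occurrences, one possibility is to build an entry-by-word incidence table and take its cross-product. This is only a sketch: it assumes "co-occurrence" means two words appearing in the same entry, and the names words, entry, incidence and cooc are purely illustrative.

```r
sometext <- c("ACUTE URI NOS", "OPEN WOUND OF FOREHEAD", "CROUP",
              "STREP SORE THROAT", "ACUTE STREP SORE THROAT")

# Split on runs of whitespace and keep each word at most once per entry
words <- lapply(strsplit(sometext, "\\s+"), unique)

# Entry-by-word incidence matrix: 1 if the word appears in that entry, else 0
entry <- rep(seq_along(words), lengths(words))
incidence <- unclass(table(entry, word = unlist(words)))

# Word-by-word matrix: off-diagonal cells count the entries containing both
# words; the diagonal counts the entries containing each word
cooc <- crossprod(incidence)

cooc["STREP", "SORE"]   # 2: both words appear together in two entries
```

Dropping repeats within an entry with unique() keeps the cells as counts of entries rather than counts of word pairs; leave it out if repeated words should count more than once.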

S Ellison

