[R] How do I parse text?
S Ellison
S.Ellison at LGCGroup.com
Tue Sep 10 11:18:44 CEST 2013
> I have a data frame with a character field of the form "ACUTE
> URI NOS", "OPEN WOUND OF FOREHEAD", "CROUP", "STREP SORE THROAT", ....
>
> How can I get counts of all the words and their
> co-occurrences? I've spent a long time searching on Google,
> but it just takes me on a wild goose chase of dozens of
> modules involving advanced natural language processing
> theory. All I want is word counts and co-occurrences.
Perhaps a combination of strsplit(), unlist() and table() would do the job?
Example:
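sometext <- c("ACUTE URI NOS", "OPEN WOUND OF FOREHEAD", "CROUP", "STREP SORE THROAT", "ACUTE STREP SORE THROAT")
# Split each entry into words at single spaces (gives a list of character vectors)
st <- strsplit(sometext, " ")
# Flatten the list and count how often each word occurs
table(unlist(st))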
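The example above covers the word counts only. For the co-occurrences, one possible base-R sketch is to build an entry-by-word incidence table and take its cross-product. This assumes that "co-occurrence" means two words appearing in the same entry; the names entry, pairs, inc and cooc are purely illustrative, and lengths() needs a reasonably recent R.

```r
sometext <- c("ACUTE URI NOS", "OPEN WOUND OF FOREHEAD", "CROUP",
              "STREP SORE THROAT", "ACUTE STREP SORE THROAT")
st <- strsplit(sometext, " ")

# One row per (entry, word) pair; unique() makes a word that is
# repeated within a single entry count only once for that entry
entry <- rep(seq_along(st), lengths(st))
pairs <- unique(data.frame(entry = entry, word = unlist(st)))

# Entry-by-word incidence table of 0/1 values
inc <- table(pairs$entry, pairs$word)

# Word-by-word matrix: off-diagonal cells count the entries in which
# both words appear; the diagonal counts the entries containing the word
cooc <- crossprod(inc)

cooc["STREP", "SORE"]   # 2: both words appear in entries 4 and 5
cooc["ACUTE", "ACUTE"]  # 2: ACUTE appears in entries 1 and 5
```

The result is a symmetric matrix with one row and column per distinct word, so it may get large for a big vocabulary; in that case a sparse representation might be worth considering.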
S Ellison