[R] Using a text file as a removeWord dictionary in tm_map
Sun Shine
phaedrusv at gmail.com
Sat Feb 28 14:46:37 CET 2015
Hi list
Although this query applies specifically to the tm package, perhaps it's
something that others might be able to lend a thought to.
I am using tm for some initial text mining and want to remove from the
corpus a dictionary of words that was generated outside R.
I have created a comma-separated list of terms, each in double quotes, in
a plain UTF-8 file, stopList.txt. To read this into R, I do:
> stopDict <- read.table('~/path/to/file/stopList.txt', sep=',')
When I want to pass it to the removeWords function in tm, I do:
> docs <- tm_map(docs, removeWords, stopDict)
which has no effect. Neither does:
> docs <- tm_map(docs, removeWords, c(stopDict))
What am I not seeing or doing?
How do I pass a text file with pre-defined terms to the removeWords
transform of tm?
Thanks for any ideas.
Cheers
Sun
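
One possible explanation, offered as a minimal sketch rather than a
definitive answer: read.table() returns a data.frame, whereas removeWords
expects a character vector of words. Reading the file with scan() gives a
character vector directly. The temporary file, the example terms, and the
names stopFile, asTable and stopWords below are made up for illustration;
the tm part only runs if tm is installed.

```r
# Write a small example file in the format described above:
# comma-separated terms, each in double quotes.
stopFile <- tempfile(fileext = ".txt")
writeLines('"alpha","beta","gamma"', stopFile)

# read.table() yields a data.frame (here one row, one column per term),
# not a character vector.
asTable <- read.table(stopFile, sep = ",")
class(asTable)

# scan() reads the same file straight into a character vector.
stopWords <- scan(stopFile, what = character(), sep = ",", quote = "\"",
                  strip.white = TRUE, quiet = TRUE)

# Drop empty entries, e.g. those produced by a trailing comma.
stopWords <- stopWords[nzchar(stopWords)]

# Apply the word list to a toy corpus, only if tm is available.
if (requireNamespace("tm", quietly = TRUE)) {
  docs <- tm::VCorpus(tm::VectorSource(c("alpha one beta", "two gamma three")))
  docs <- tm::tm_map(docs, tm::removeWords, stopWords)
  print(lapply(docs, as.character))
}
```

With a real file, the same scan() call would take the path to stopList.txt
in place of stopFile. If the data.frame from read.table() is kept instead,
it would first need flattening to a character vector, for example with
as.character(unlist(...)).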