[R] Text Mining - Remove punctuation not removing quotes and dashes
Anindya Sankar Dey
anindya55 at gmail.com
Mon Jun 8 08:54:53 CEST 2015
Hi,
I have been doing some text mining. I created the document-term matrix
(DTM) using the following steps.
library(tm)
corpus1 <- VCorpus(VectorSource(resume1$Dat1))
corpus1 <- tm_map(corpus1, content_transformer(tolower))
dtm <- DocumentTermMatrix(corpus1,
                          control = list(removePunctuation = TRUE,
                                         removeNumbers = TRUE,
                                         removeSparseTerms = TRUE,
                                         stopwords = TRUE))
After running all of this, I still get terms like -quotation, "fun, model",
etc.
What can I do about it? I do not need these dashes and extra quotation marks.
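
One possible workaround, sketched under the assumption that the leftover
characters are typographic (curly) quotes or en/em dashes rather than plain
ASCII punctuation: strip every Unicode punctuation and symbol character before
building the matrix. The helper name strip_punct is hypothetical and not part
of tm; it uses only base R.

```r
# Hypothetical helper (not part of tm): replaces every Unicode punctuation
# (\p{P}) and symbol (\p{S}) character with a space, then tidies whitespace.
# Assumption: the stray quotes/dashes are non-ASCII characters.
strip_punct <- function(x) {
  x <- enc2utf8(x)                                    # make sure the input is UTF-8
  x <- gsub("[\\p{P}\\p{S}]+", " ", x, perl = TRUE)   # drop punctuation and symbols
  x <- gsub("[[:space:]]+", " ", x)                   # collapse runs of whitespace
  trimws(x)                                           # remove leading/trailing spaces
}

# Example input with curly quotes and an en dash
cleaned <- strip_punct("\u201cfun, model\u201d \u2013quotation")

# With tm loaded, the helper could be applied to the corpus before the DTM step:
# corpus1 <- tm_map(corpus1, content_transformer(strip_punct))
```

Replacing with a space instead of deleting keeps words joined by a dash from
being fused into one term. Separately, removeSparseTerms is a standalone tm
function, removeSparseTerms(dtm, sparse), not a control option, so the entry
in the control list above probably has no effect.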
--
Anindya Sankar Dey