[R] Reading stopwords from a csv file

vioravis vioravis at gmail.com
Tue Oct 4 19:17:01 CEST 2011


I am using the tm package to do text mining.

I have a huge list of stopwords (2000+) that are in a csv file. I read it as
follows:

stopwordlist <- read.csv("stopwords to be Removed 10042011.csv")
myStopwords <- as.character(stopwordlist$stopwords)
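
Before the vector is passed to removeWords, it may be worth checking what read.csv actually produced. The following is only a sketch under assumptions: the temporary file and its contents are hypothetical stand-ins for the real stopword file, and the only thing taken from the post is the column name stopwords. It trims whitespace, drops missing, empty, and duplicated entries, and flags words containing characters that could act as regex metacharacters.

```r
# Hypothetical stand-in for the real stopword file, written to a temp path.
csv_path <- tempfile(fileext = ".csv")
raw <- data.frame(stopwords = c("the", " and ", "", "c++", NA, "of", "the"),
                  stringsAsFactors = FALSE)
write.csv(raw, csv_path, row.names = FALSE)

# Read as plain character strings rather than factors.
stopwordlist <- read.csv(csv_path, stringsAsFactors = FALSE)
words <- trimws(as.character(stopwordlist$stopwords))

# Drop missing, empty, and duplicated entries.
words <- unique(words[!is.na(words) & nzchar(words)])

# Flag entries containing anything other than letters, digits, apostrophes,
# spaces, or hyphens; such characters may act as regex metacharacters.
suspicious <- words[grepl("[^[:alnum:]' -]", words)]

print(words)
print(suspicious)
```

Whether any of these conditions is present in the real file is not known from the post; the check only rules them in or out.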

When I try to remove the stopwords using

tr1 <- tm_map(tr1, removeWords, myStopwords)

I am getting the following error:

Error in gsub(sprintf("\\b(%s)\\b", paste(words, collapse = "|")), "",  : 
  internal error in compiling regexp

However, this works fine when I define myStopwords = c(....) instead of
reading from the csv file.
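
One possible explanation, not confirmed here, is that a single alternation of 2000+ words is too large for the regex engine to compile. A minimal sketch of a workaround is to remove the words in smaller chunks. The function remove_words_chunked, the chunk size of 500, and the synthetic data below are all hypothetical; the pattern is copied from the gsub call shown in the error message, and plain gsub is used as a stand-in for removeWords so the example runs without the tm package.

```r
# Base-R stand-in for removeWords, using the pattern from the error message.
# Words are removed chunk by chunk so that no single regex gets too large.
remove_words_chunked <- function(x, words, chunk_size = 500L) {
  chunks <- split(words, ceiling(seq_along(words) / chunk_size))
  for (chunk in chunks) {
    pattern <- sprintf("\\b(%s)\\b", paste(chunk, collapse = "|"))
    x <- gsub(pattern, "", x, perl = TRUE)
  }
  x
}

# Hypothetical data: 2000 synthetic stopwords and one short sentence.
words <- sprintf("stop%04d", 1:2000)
text <- "keep stop0001 this stop1999 text"

cleaned <- remove_words_chunked(text, words)
print(cleaned)
```

With tm, the same loop could presumably call tm_map(tr1, removeWords, chunk) once per chunk instead of gsub. Removed words leave their surrounding spaces behind, so a whitespace clean-up step may still be needed afterwards.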

Can someone please help me to resolve this issue?

Thank you.

Ravi



