[R] Need content_transformer() called by tm_map() to change non-letters to spaces

Mike mikehall at y7mail.com
Thu Apr 23 22:10:41 CEST 2015


Hello,
In the following code, any text matching the pattern "/|@| \\|" (a slash, an at sign, or a space followed by a pipe) is changed to a space.
> library(tm)
> toSpace <- content_transformer(function(x, pattern) gsub(pattern, " ", x))
> docs <- tm_map(docs, toSpace, "/|@| \\|")
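To make the pattern's behaviour concrete, here is a minimal base-R sketch of the same gsub() call applied to a plain string. The sample string is made up for illustration and is not from my corpus.

```r
# Hypothetical sample text, used only to illustrate the pattern.
sample_text <- "a/b@c |d"

# The regex /|@| \| matches "/", "@", or a space followed by "|".
cleaned <- gsub("/|@| \\|", " ", sample_text)
print(cleaned)
```

Each match, including the two-character " |", is replaced by a single space.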

What code would transform all non-letters to a space? (That is, what goes where the xxxxx's are?) It is very difficult to list every non-letter in a string, so I would like to do the opposite of the above and pass the letters to keep.
> toSpace_2 <- content_transformer(function xxxxxxxxxxxxxxxxxxxxxxx))
> docs <- tm_map(docs, toSpace_2, "abcdefghijklmnopqrstuvwxyz")

This needs to be done by a content_transformer() function to maintain the integrity of docs.
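One possible approach, offered as a sketch rather than a confirmed answer, is to build a negated character class from the string of letters. The helper name non_letters_to_space, the parameter name keep, and the sample corpus are all hypothetical. The sketch assumes that keep contains only plain letters (no regex-special characters such as "]", "^", "-" or "\").

```r
library(tm)

# Hypothetical helper: replace every character NOT listed in 'keep'
# (in either case) with a space, using a negated class [^...].
non_letters_to_space <- function(x, keep) {
  pattern <- paste0("[^", keep, toupper(keep), "]")
  gsub(pattern, " ", x)
}

# Wrap the helper so tm_map() keeps the documents as tm text documents.
toSpace_2 <- content_transformer(non_letters_to_space)

# Hypothetical two-document corpus, for illustration only.
docs <- VCorpus(VectorSource(c("Hello, World! 123", "e-mail: a@b.com")))
docs <- tm_map(docs, toSpace_2, "abcdefghijklmnopqrstuvwxyz")

print(content(docs[[1]]))
```

If the letters do not need to be passed in, the original toSpace with the pattern "[^[:alpha:]]" might do the same job, though what counts as a letter then depends on the locale.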

Thanks
 