[R] speed issue: gsub on large data frame

SPi simon.pickert at t-online.de
Wed Nov 6 19:57:46 CET 2013


Good idea! 

I'm trying your approach right now, but I am wondering if using str_split
(package: 'stringr') or strsplit is the right way to go in terms of speed? I
ran str_split over the text column of the data frame and it's processing for
2 hours now..? 

I did: 
splittedStrings<-str_split(dataframe$text, " ")

The $text column already contains cleaned text, so no double blanks etc or
unnecessary symbols. Just full words.




--
View this message in context: http://r.789695.n4.nabble.com/speed-issue-gsub-on-large-data-frame-tp4679747p4679904.html
Sent from the R help mailing list archive at Nabble.com.



More information about the R-help mailing list