[R] reading and analyzing a word document
chuck at sharpsteen.net
Thu Oct 1 04:31:41 CEST 2009
> Howdy Y'all,
> So I am looking to read a Word document in the following formats (.doc) or
> any type of accessible word processor software (e.g. text .txt, Notepad,
> etc.). I'd like the ability to search for certain words, for instance "banana",
> "peacock", "Weapons", "Mass", "Destruction". Then I could summarize and view
> the results. I looked, and the only thing I could find was the below, where
> I want to analyze "letter.doc" and look for the words mentioned in quotes
> above. It's apparently wrong, but I'm wondering if this is even possible.
> Please advise. Thanks
> In Solidarity
Well... you could make a vector of the words you want to find:
to.find <- c( 'banana', 'peacock', 'Weapons' )
Read in the file...
file.text <- readLines( 'myFile.txt' )
And apply the grep command to each word in turn (via lapply) in order to
determine which lines contain matches for your words:
line.matches <- unlist( lapply( to.find, grep, x = file.text ) )
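Since the original question also asked about summarizing the results, here is a rough sketch of one way to do that. It is only an illustration: the sample lines stand in for whatever readLines() would return, and the names hits, line.counts and matched.lines are made up for this example. Note that grep is case-sensitive by default and treats each word as a regular expression, so ignore.case = TRUE is added here on the assumption that 'Weapons' should also match 'weapons'.

```r
# Hypothetical sample lines standing in for readLines( 'myFile.txt' )
file.text <- c(
  'I ate a banana.',
  'The peacock stole my banana.',
  'Nothing to see here.',
  'weapons of mass destruction'
)
to.find <- c( 'banana', 'peacock', 'Weapons' )

# Line numbers matching each word, kept separately per word.
# ignore.case = TRUE is an assumption about what is wanted.
hits <- lapply( to.find, grep, x = file.text, ignore.case = TRUE )
names( hits ) <- to.find

# Number of matching lines for each word
line.counts <- sapply( hits, length )

# The matching lines themselves, each listed once, in file order
matched.lines <- file.text[ sort( unique( unlist( hits ) ) ) ]

print( line.counts )
print( matched.lines )
```

Keeping the matches as a list named by word, rather than calling unlist right away, makes it possible to tell which word matched which line.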
This approach should do what you want for plain text files. As for Microsoft
Word files... sometimes there is a price to pay for using a closed proprietary
binary format.
Environmental Resources Engineering
Humboldt State University