[R] reading and analyzing a word document

cls59 chuck at sharpsteen.net
Thu Oct 1 04:31:41 CEST 2009

PDXRugger wrote:
> Howdy Y'all,
> So I am looking to read a Word document (.doc) or a file from any other
> accessible word processor or text editor (e.g. plain text .txt, Notepad,
> etc.).  I'd like the ability to search for certain words, for instance
> "banana", "peacock", "Weapons", "Mass", "Destruction".  Then I could
> summarize and view the results.  I looked, and the only thing I could find
> was the below, where I want to analyze "letter.doc" and look for the words
> mentioned in quotes above.  It's apparently wrong, but I'm wondering if this
> is even possible.
> Please advise.  Thanks
> In Solidarity
> JR

Well... you could make a vector of the words you want to find:

to.find <- c( 'banana', 'peacock', 'Weapons' )

Read in the file...

file.text <- readLines( 'myFile.txt' )

And apply the grep command to each word in turn (via lapply) in order to
determine which lines contain matches for your words:

line.matches <- unlist( lapply( to.find, grep, x = file.text ) )
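Since the original question also asked about summarizing the results, here is a rough sketch of one way to count the matching lines per word. The sample text is made up for illustration and stands in for the output of readLines(); the names matches, hits.per.word and matched.lines are hypothetical. It also assumes a case-insensitive search is wanted, since grep is case-sensitive by default and 'Weapons' would otherwise miss 'weapons'.

```r
# Made-up sample text standing in for readLines( 'myFile.txt' )
file.text <- c(
  "A banana a day.",
  "The peacock ate a banana.",
  "No matches here.",
  "weapons of mass destruction"
)

to.find <- c( 'banana', 'peacock', 'Weapons' )

# One vector of matching line numbers per word.
# ignore.case = TRUE is an assumption about what is wanted.
matches <- lapply( to.find, grep, x = file.text, ignore.case = TRUE )
names( matches ) <- to.find

# Number of lines containing each word
hits.per.word <- vapply( matches, length, integer(1) )

# Line numbers that contain at least one of the words, without duplicates
matched.lines <- sort( unique( unlist( matches, use.names = FALSE ) ) )

print( hits.per.word )
print( file.text[ matched.lines ] )
```

Note that this counts lines containing a word, not total occurrences, and that each word is treated as a regular expression, so 'Mass' would also match inside 'Massive'.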

This approach may do what you want for plain text files. As for Microsoft Word files...

Sometimes there is a price to pay for using a closed proprietary binary
document format.

Good luck!


Charlie Sharpsteen
Environmental Resources Engineering
Humboldt State University