[R] readPDF() -- unsure how to install xpdf to make this work?
Tony Breyal
tony.breyal at googlemail.com
Thu Nov 13 16:10:14 CET 2008
Dear R-Help,
I need to convert a set of '.pdf' files into an equivalent set of
'.txt' files, so that I can do some text mining on the
content.
In the latest issue of R News
(http://cran.r-project.org/doc/Rnews/Rnews_2008-2.pdf), the package
'tm' for text mining is mentioned. In that lovely package, there is a
function called 'readPDF()'. In order to use this, ?readPDF says:
"Note that this PDF reader needs both the tools pdftotext and
pdfinfo installed and accessable on your system."
These tools are available from http://www.foolabs.com/xpdf/download.html
I am able to download these tools and use them easily from a DOS window
to convert a PDF file into a text file.
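For a whole set of files, the same command-line conversion could in principle be driven from R. The following is only a sketch under stated assumptions: `pdf_to_txt_name` and `convert_pdfs` are hypothetical helper names (not part of 'tm' or xpdf), and the pdftotext executable is assumed to be findable by name, or else passed as a full path in `exe`. It relies on the usual `pdftotext input.pdf output.txt` calling form. Note that system2() was added in a later R release than 2.8.0; with an older R, system() with a pasted command string would be the equivalent.

```r
# Hypothetical helper: derive the .txt name from a .pdf name.
pdf_to_txt_name <- function(pdf) {
  sub("\\.pdf$", ".txt", pdf, ignore.case = TRUE)
}

# Hypothetical helper: run pdftotext once per PDF found in `dir`.
# `exe` is assumed to be on PATH, or given as a full path to the executable.
convert_pdfs <- function(dir, exe = "pdftotext") {
  pdfs <- list.files(dir, pattern = "\\.pdf$", full.names = TRUE,
                     ignore.case = TRUE)
  txts <- pdf_to_txt_name(pdfs)
  for (i in seq_along(pdfs)) {
    # pdftotext takes the input PDF followed by the output text file.
    system2(exe, args = shQuote(c(pdfs[i], txts[i])))
  }
  # Return the names of the text files that were requested.
  invisible(txts)
}
```

A call such as `convert_pdfs("C:/my_pdfs")` (a made-up directory) would then write one '.txt' file next to each '.pdf' file.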
Question: how do I make these tools available to R, so that I can use
the readPDF() function?
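One approach that might apply here, sketched under assumptions, is to put the directory holding pdftotext and pdfinfo on the PATH seen by the R session, since external tools are typically looked up there. The directory `C:/Program Files/xpdf` and the helper `add_to_path` are hypothetical, and whether readPDF() in this version of 'tm' finds the tools purely through PATH is an assumption rather than something its help page confirms.

```r
# Hypothetical install location; adjust to wherever xpdf was unpacked.
xpdf_dir <- "C:/Program Files/xpdf"

# Hypothetical helper (not part of 'tm'): prepend a directory to PATH,
# for the current R session only, unless it is already listed.
add_to_path <- function(dir) {
  old <- Sys.getenv("PATH")
  parts <- strsplit(old, .Platform$path.sep, fixed = TRUE)[[1]]
  if (!dir %in% parts) {
    Sys.setenv(PATH = paste(dir, old, sep = .Platform$path.sep))
  }
  invisible(Sys.getenv("PATH"))
}

add_to_path(xpdf_dir)

# Sys.which() returns "" for any tool that cannot be found on PATH.
found <- Sys.which(c("pdftotext", "pdfinfo"))
print(found)
```

If both entries of `found` are non-empty paths, the tools are at least visible from within R. A change made with Sys.setenv() lasts only for the session; editing the Windows PATH environment variable would be the permanent equivalent.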
Thank you in advance for any help, and I hope the above made sense.
Tony Breyal
### OS = Windows Vista Ultimate
> sessionInfo()
R version 2.8.0 (2008-10-20)
i386-pc-mingw32
locale:
LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.1252;LC_MONETARY=English_United Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252
attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods   base
other attached packages:
[1] tm_0.3-1           XML_1.98-1         Snowball_0.0-3     RWeka_0.3-14
[5] rJava_0.6-0        Matrix_0.999375-16 lattice_0.17-15    filehash_2.0
loaded via a namespace (and not attached):
[1] proxy_0.4-1