[R] readPDF() -- unsure how to install xpdf to make this work?

Tony Breyal tony.breyal at googlemail.com
Thu Nov 13 16:10:14 CET 2008


Dear R-Help,

I need to convert a set of '.pdf' files into an equivalent set of
'.txt' files, so that I can do some text mining on the content.

In the latest issue of R News
(http://cran.r-project.org/doc/Rnews/Rnews_2008-2.pdf), the package 'tm'
for text mining is mentioned. In that lovely package, there is a
function called 'readPDF()'. In order to use this, ?readPDF says:

    "Note that this PDF reader needs both the tools pdftotext and
pdfinfo installed and accessable on your system."

These tools are available from http://www.foolabs.com/xpdf/download.html

I am able to download these tools and run them from a DOS window
without any trouble to convert a PDF file into a text file.

Question: how do I make these tools available to R, so that I can use
the readPDF() function?
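
To make the question concrete, here is a rough sketch of what I would
guess is needed. It is not a confirmed answer. The folder "C:/xpdf" is
only a placeholder for wherever the xpdf binaries were unpacked, and
add_to_path() is a helper name made up for this example, not part of R
or 'tm'. It also assumes that readPDF() finds pdftotext and pdfinfo
through the PATH, which I have not verified.

```r
# Hypothetical folder holding pdftotext.exe and pdfinfo.exe; adjust as needed.
xpdf_dir <- "C:/xpdf"

# Prepend a directory to PATH for the current R session only.
# Does nothing if the directory is already listed.
add_to_path <- function(dir) {
  sep   <- .Platform$path.sep                    # ";" on Windows, ":" elsewhere
  old   <- Sys.getenv("PATH")
  parts <- strsplit(old, sep, fixed = TRUE)[[1]]
  if (!(dir %in% parts)) {
    Sys.setenv(PATH = paste(dir, old, sep = sep))
  }
  invisible(Sys.getenv("PATH"))
}

add_to_path(xpdf_dir)

# Sys.which() returns "" for any tool it cannot find on the PATH.
tools <- Sys.which(c("pdftotext", "pdfinfo"))
print(tools)
```

If both entries printed by Sys.which() are non-empty paths, R can see
the tools for this session. The change is lost when R is closed, so a
permanent fix would presumably mean editing the Windows PATH itself.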

Thank you in advance for any help, and I hope the above made sense.
Tony Breyal






###OS = Windows Vista Ultimate
> sessionInfo()
R version 2.8.0 (2008-10-20)
i386-pc-mingw32

locale:
LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.1252;LC_MONETARY=English_United Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] tm_0.3-1           XML_1.98-1         Snowball_0.0-3     RWeka_0.3-14       rJava_0.6-0        Matrix_0.999375-16 lattice_0.17-15    filehash_2.0

loaded via a namespace (and not attached):
[1] proxy_0.4-1


