[R] Mining non-English text
Loris Bennett
loris.bennett at fu-berlin.de
Wed Mar 4 08:52:05 CET 2015
saikiran putta <putta.saikiran1994 at gmail.com> writes:
> I am new to R programming and am trying to mine this PDF file:
> http://164.100.180.82/Rollpdf/AC276/S24A276P001.pdf. The file is in a
> non-English language and I can't figure out how to proceed. I'm also
> not sure how to extract information from a PDF file at all, so please
> help!
This is not really an R question, but the command-line program pdftotext
(shipped with Xpdf and Poppler) might help you get going. It is available
for Linux and, apparently, for Windows, and it can write its output in
various text encodings.
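As a rough sketch of how one might drive pdftotext from a script, assuming
it is installed and on the PATH: the helper names and the file names
below are made up for illustration, and only the -enc and -layout options
are taken from pdftotext itself. Whether the output is readable depends on
how the PDF was produced; a scanned or oddly encoded file may still come out
as gibberish.

```python
import shutil
import subprocess


def build_pdftotext_cmd(pdf_path, txt_path, encoding="UTF-8", layout=True):
    """Build the argument list for a pdftotext call (hypothetical helper)."""
    # -enc selects the encoding of the text file that pdftotext writes.
    cmd = ["pdftotext", "-enc", encoding]
    if layout:
        # -layout tries to keep the physical layout, useful for tabular pages.
        cmd.append("-layout")
    cmd += [pdf_path, txt_path]
    return cmd


def pdf_to_text(pdf_path, txt_path, encoding="UTF-8"):
    """Run pdftotext and return the extracted text (hypothetical helper)."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext was not found on the PATH")
    subprocess.run(build_pdftotext_cmd(pdf_path, txt_path, encoding), check=True)
    # Assumes the pdftotext encoding name is also a valid Python codec name,
    # which holds for the default "UTF-8".
    with open(txt_path, encoding=encoding) as fh:
        return fh.read()


if __name__ == "__main__":
    # Only prints the command; the file names are placeholders.
    print(" ".join(build_pdftotext_cmd("S24A276P001.pdf", "S24A276P001.txt")))
```

The resulting text file could then be read into R with something like
readLines("S24A276P001.txt", encoding = "UTF-8").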
Cheers,
Loris
--
This signature is currently under construction.