[R] Ocr

Achim Zeileis Achim.Zeileis at uibk.ac.at
Wed Jul 27 00:43:36 CEST 2016


On Wed, 27 Jul 2016, Shane Carey wrote:

> Cool, thanks Jim!!
> I would love to be able to write my own script for this, as I have many
> images/PDFs in a folder and would like to batch process them using an R
> script!!

The underlying engine of FreeOCR is "tesseract", which is also available
as a command-line tool and on other OSs. In principle, it is not hard to
call it with a system() command and then readLines() the resulting text.
However, it might be useful to play with the available options in the GUI
first to see what works best for your images.
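
A minimal sketch of that approach, assuming the tesseract binary is
installed and on the PATH. The helper names (tesseract_args, ocr_image,
ocr_dir) are made up for illustration, and system2() is used here as the
more portable sibling of system(). As far as I know, tesseract reads
image files only, so PDFs would first have to be converted to images,
which is not shown.

```r
# Sketch only: assumes the `tesseract` command-line tool is on the PATH.
# All function names below are hypothetical helpers, not part of any package.

# Arguments for the basic call: tesseract <image> <outputbase>
# (tesseract itself appends ".txt" to the output base).
tesseract_args <- function(image, outbase) {
  c(shQuote(image), shQuote(outbase))
}

# OCR a single image and return the recognized text as a character vector.
ocr_image <- function(image, outdir = tempdir()) {
  outbase <- file.path(outdir, tools::file_path_sans_ext(basename(image)))
  status <- system2("tesseract", args = tesseract_args(image, outbase))
  if (status != 0) stop("tesseract failed for ", image)
  readLines(paste0(outbase, ".txt"), warn = FALSE)
}

# OCR every matching image in a folder; returns a list named by file name.
ocr_dir <- function(dir, pattern = "\\.(png|jpe?g|tiff?)$") {
  images <- list.files(dir, pattern = pattern, full.names = TRUE,
                       ignore.case = TRUE)
  texts <- lapply(images, ocr_image)
  names(texts) <- basename(images)
  texts
}
```

Something like texts <- ocr_dir("scans") would then give one character
vector per image, where "scans" stands in for your own folder.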

> Thanks
>
> On Tuesday, July 26, 2016, Jim Lemon <drjimlemon at gmail.com> wrote:
>
>> Hi Shane,
>> FreeOCR is a really good place to start.
>>
>> http://www.paperfile.net/
>>
>> Jim
>>
>>
>> On Wed, Jul 27, 2016 at 6:11 AM, Shane Carey <careyshan at gmail.com> wrote:
>>> Hi,
>>>
>>> Has anyone ever done any OCR in R?? I have some scanned images that I
>>> would like to convert to text!!
>>> Thanks
>>>
>>>
>>> --
>>> Le gach dea ghui,
>>> Shane
>>>
>>>         [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
> -- 
> Le gach dea ghui,
> Shane
