[R] scanning a pdf scan
ggrothendieck at gmail.com
Fri Oct 27 18:52:51 CEST 2006
I don't have specific experience with this, but strapply
in the gsubfn package can extract information from a string by content,
as opposed to by delimiters, e.g.
> library(gsubfn)
> strapply("abc34def56xyz", "[0-9]+", c)[[1]]
[1] "34" "56"
On 10/27/06, roger koenker <rkoenker at uiuc.edu> wrote:
> I have a pdf scan of several pages of data from a quite famous old
> paper by C.S. Peirce (1873). I would like (what else?) to convert it
> into an R dataframe.
> Somewhat to my surprise the pdf seems to already be in a character
> form, since I can search for numerical strings and they are nicely
> found. Of course, as is usual with such tables there are also headings
> and column lines, etc. that are less interesting than the numbers
> themselves. I've tried saving the pdf in various formats, some of
> which look vaguely tractable, but I'm hoping that there is something
> that is more automatic.
> Does anyone have experience that they could share toward this objective?
> url: www.econ.uiuc.edu/~roger Roger Koenker
> email rkoenker at uiuc.edu Department of Economics
> vox: 217-333-4558 University of Illinois
> fax: 217-244-6678 Champaign, IL 61820