[R] Extract lines from pdf files
tg@77m @end|ng |rom y@hoo@com
Tue Nov 19 23:52:20 CET 2019
I can extract specific data from lines in a pdf using:
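library(pdftools)
txt <- pdf_text(".pdf")
# Assumed step, not shown in the original post: write the extracted text to the file read below
writeLines(txt, 'mydata.txt')
# The file name is passed directly, because read.table() closes a connection that it opens itself
serial    <- read.table('mydata.txt', skip=5,  nrows=1) # Extract
flatness  <- read.table('mydata.txt', skip=11, nrows=1) # Extract
parallel1 <- read.table('mydata.txt', skip=2,  nrows=1) # Extract
parallel2 <- read.table('mydata.txt', skip=4,  nrows=1) # Extract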
# note here that serial has 4 variables
# flatness has 6 variables
# parallel1 has 5 variables
# parallel2 has 5 variables
# this outputs the specific data I need
parallel1 # Note that the text output shows 0.0007, not scientific notation. Is there a way to format this to display the original data?
parallel2 # Note that the text output shows 0.0006, not scientific notation. Is there a way to format this to display the original data?
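On the formatting question, a minimal sketch, assuming the PDF prints these values in scientific notation: read.table() converts the fields to numbers, so the original spelling is lost and R picks its own display. The value can either be reformatted afterwards, or the fields can be read as text so that they stay exactly as printed. The sample row below is made up for illustration.

```r
x <- 0.0007

# Reformat a number that has already been parsed
sci   <- format(x, scientific = TRUE)          # "7e-04"
sci_1 <- formatC(x, format = "e", digits = 1)  # "7.0e-04"

# Or keep the text exactly as it appears in the file by reading
# every field as character (hypothetical sample row)
raw <- read.table(text = "P1 0.1 0.2 0.3 7.0E-04",
                  colClasses = "character")
raw$V5                                         # "7.0E-04"
```

Reading as character keeps the original digits, but the column has to be passed through as.numeric() before any arithmetic.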
I'd like to extend this code to all of the PDF files in a directory and to generate a table of all the serial, flatness, parallel1 and parallel2 data.
I'm not having a lot of success trying to build the script for this. Some pointers would be appreciated.
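One possible way to structure such a script is sketched below. It assumes that every PDF has the same layout, so the skip values used above apply to each file. The directory name "pdf_dir" and the helper names read_row, extract_fields and extract_pdf are hypothetical. Fields are read as character so that rows from different files can be stacked without type conflicts.

```r
# Read one whitespace-separated row from a vector of text lines.
# `skip` means the same as in read.table(): lines to skip before reading.
read_row <- function(lines, skip, prefix) {
  row <- read.table(text = lines, skip = skip, nrows = 1,
                    colClasses = "character")
  names(row) <- paste(prefix, seq_along(row), sep = "_")
  row
}

# Build a one-row data frame holding the four rows of interest.
extract_fields <- function(lines, id) {
  data.frame(file = id,
             read_row(lines, 5,  "serial"),
             read_row(lines, 11, "flatness"),
             read_row(lines, 2,  "parallel1"),
             read_row(lines, 4,  "parallel2"),
             stringsAsFactors = FALSE)
}

# Extract the text of one PDF and split it into individual lines.
extract_pdf <- function(path) {
  txt   <- pdftools::pdf_text(path)   # one string per page
  lines <- unlist(strsplit(txt, "\n", fixed = TRUE))
  extract_fields(lines, basename(path))
}

pdf_dir <- "pdf_dir"                  # hypothetical directory name
files   <- list.files(pdf_dir, pattern = "\\.pdf$",
                      full.names = TRUE, ignore.case = TRUE)

# One row per PDF; NULL if the directory holds no PDF files.
results <- do.call(rbind, lapply(files, extract_pdf))
```

The resulting table has one row per file, with columns named serial_1 to serial_4, flatness_1 to flatness_6, and so on. It can be saved with write.csv(results, "results.csv", row.names = FALSE). If the layouts differ between files, the fixed skip values will pick up the wrong lines, and searching for a label with grep() would be more robust.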
All the best.
Statistician / Senior Quality Engineer