[R] Combine recursive lists in a single list or data frame and write it to file
Ek Esawi
esawiek using gmail.com
Thu Dec 20 03:32:54 CET 2018
Thank you, Bert. I don't see how unlist() will help. I want to combine
the lists but keep the "rectangular structure" (e.g., list, data frame,
matrix), because I want to get the tables back in their original form.
As far as I can tell, unlist() converts the whole output to a single
vector, unless I am missing something.
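
One possibility worth checking: by default unlist() flattens recursively, but with recursive = FALSE it should remove only the outer level of nesting, so each matrix stays a matrix. The sketch below uses a small made-up MyTables (a hypothetical stand-in, not the real tabulizer output) to illustrate the idea.

```r
# Toy stand-in for MyTables: a list of lists of character matrices
# (hypothetical data, not the real tabulizer output).
MyTables <- list(
  list(matrix(c("a", "b", "c", "d"), nrow = 2),
       matrix(as.character(1:6), nrow = 2)),
  list(matrix(c("x", "y"), nrow = 1))
)

# recursive = FALSE removes only the outer level of nesting,
# so each element of the result is still a matrix.
FlatTables <- unlist(MyTables, recursive = FALSE)

length(FlatTables)                  # number of matrices in the flat list
all(sapply(FlatTables, is.matrix))  # are all elements still matrices?
dim(FlatTables[[2]])                # dimensions of the second matrix
```

If this holds for the real output, the 3 lists would collapse into one list of 39 matrices, each still rectangular.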
On Wed, Dec 19, 2018 at 9:10 PM Bert Gunter <bgunter.4567 using gmail.com> wrote:
>
> Does ?unlist not help? Why not?
>
> Bert
>
>
> On Wed, Dec 19, 2018, 5:13 PM Ek Esawi <esawiek using gmail.com> wrote:
>>
>> Hi All,
>>
>> I am using the R tabulizer package to extract tables from PDF files.
>> The output is a set of lists of matrices. The package extracts the tables
>> along with a lot of extra material that is nearly impossible to clean with
>> regular expressions, so I want to clean it manually.
>> To do so, I need to (1) combine all the lists into a single list or data
>> frame and (2) write that single object to a text file for editing.
>> I could not figure out how.
>>
>> I tried something like this, but it did not work:
>> lapply(MyTables, function(x)
>> lapply(x,write.table(file="temp.txt",append = TRUE)))
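
A likely reason the attempt above fails is that write.table(file = ..., append = TRUE) is evaluated immediately, with no data argument, instead of being passed to lapply() as a function. A minimal sketch of a working loop follows, assuming each innermost element is a character matrix. The toy MyTables, the OutFile location, and the "### table" marker lines are all illustrative assumptions.

```r
# Hypothetical stand-in for MyTables (list of lists of character matrices).
MyTables <- list(
  list(matrix(c("a", "b", "c", "d"), nrow = 2)),
  list(matrix(c("x", "y", "z"), nrow = 1),
       matrix(c("p", "q"), nrow = 2))
)

OutFile <- file.path(tempdir(), "temp.txt")  # assumed output location
if (file.exists(OutFile)) file.remove(OutFile)

for (i in seq_along(MyTables)) {
  for (j in seq_along(MyTables[[i]])) {
    # Marker line so each table can be located while editing by hand.
    cat(sprintf("### table [[%d]][[%d]]\n", i, j),
        file = OutFile, append = TRUE)
    # Write the matrix itself, tab-separated, without quotes or headers.
    write.table(MyTables[[i]][[j]], file = OutFile, append = TRUE,
                sep = "\t", quote = FALSE,
                row.names = FALSE, col.names = FALSE)
  }
}

readLines(OutFile)
```

Setting col.names = FALSE also avoids the warning that write.table() gives when column names are appended to an existing file.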
>>
>> Any help is greatly appreciated.
>>
>> Here is my code:
>>
>> install.packages("rJava"); library(rJava)
>> install.packages("tabulizer"); library(tabulizer)
>> MyPath <- "C:/Users/name/Documents/tEMP"
>> # Extract tables from every PDF in Path; optionally order files by month.
>> ExtTable <- function(Path, CalOrd) {
>>   # Match only names ending in .pdf (any case), not any "pdf" substring.
>>   FileNames <- dir(Path, pattern = "\\.pdf$", ignore.case = TRUE,
>>                    full.names = TRUE)
>>   MyFiles <- lapply(FileNames, function(i) extract_tables(i, method = "stream"))
>>   if (CalOrd == "Yes") {
>>     # Assumes each file name starts with a month name, as match() implies.
>>     MyOFiles <- gsub("(\\s.*)|(\\.pdf$)", "", basename(FileNames),
>>                      ignore.case = TRUE)
>>     MyFiles <- MyFiles[order(match(MyOFiles, month.name))]
>>   }
>>   MyFiles
>> }
>> MyTables <- ExtTable(Path = MyPath, CalOrd = "No")
>>
>> Here is a cleaned portion of the output. The whole output consists of 3
>> lists containing 12, 15, and 12 sub-lists, respectively.
>>
>> [[2]][[2]]
>> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>> [1,] "" "Avg." "+_ lo" "n" "Med." "" "Avg." "+_ lo" "n" "Med."
>> [2,] "SiOz" "44.0" "1.26" "375" "44.1" "Nb" "4.8" "6.3" "58" "2.7"
>> [3,] "T i O 2" "0.09" "0.09" "561" "0.09" "Mo(b)" "50" "30" "3" "35"
>> [4,] "A1203" "2.27" "1.10" "375" "2.20" "Ru(b)" "12.4" "4.1" "3" "12"
>> [5,] "FeO total" "8.43" "1.14" "375" "8.19" "Pd(b)" "3.9" "2.1" "19" "4.1"
>> [6,] "MnO" "0.14" "0.03" "366" "0.14" "Ag(b)" "6.8" "8.3" "17" "4.8"
>> [7,] "MgO" "41.4" "3.00" "375" "41.2" "Cd(b)" "41" "14" "16" "37"
>> [8,] "CaO" "2.15" "1.11" "374" "2.20" "In(b)" "12" "4" "19" "12"
>> [9,] "Na20" "0.24" "0.16" "341" "0.21" "Sn(b)" "54" "31" "6" "36"
>> [10,] "K20" "0.054" "0.11" "330" "0.028" "Sb(b)" "3.9" "3.9" "11" "3.2"
>> [11,] "P205" "0.056" "0.11" "233" "0.030" "Te(b)" "11" "4" "18" "10"
>> [12,] "Total" "98.88" "" "" "98.43" "Cs(b)" "10" "16" "17" "1.5"
>> [13,] "" "" "" "" "" "Ba" "33" "52" "75" "17"
>> [14,] "Mg-value" "89.8" "1.1" "375" "90.0" "La" "2.60" "5.70" "208" "0.77"
>> [15,] "Ca/AI" "1.28" "1.6" "374" "1.35" "Ce" "6.29" "11.7" "197" "2.08"
>> [16,] "AI/Ti" "22" "29" "361" "22" "Pr" "0.56" "0.87" "40" "0.21"
>> [17,] "F e / M n" "60" "10" "366" "59" "Nd" "2.67" "4.31" "162" "1.52"
>> [18,] "" "" "" "" "" "Sm" "0.47" "0.69" "214" "0.25"
>> [19,] "Li" "1.5" "0.3" "6" "1.5" "Eu" "0.16" "0.21" "201" "0.097"
>> [20,] "B" "0.53" "0.07" "6" "0.55" "Gd" "0.60" "0.83" "67" "0.31"
>> [21,] "C" "110" "50" "13" "93" "Tb" "0.070" "0.064" "146" "0.056"
>> [22,] "F" "88" "71" "15" "100" "Dy" "0.51" "0.35" "58" "0.47"
>> [23,] "S" "157" "77" "22" "152" "Ho" "0.12" "0.14" "54" "0.090"
>> [24,] "C1" "53" "45" "15" "75" "Er" "0.30" "0.22" "52" "0.28"
>> [25,] "Sc" "12.2" "6.4" "220" "12.0" "Tm" "0.038" "0.026" "40" "0.035"
>> [26,] "V" "56" "21" "132" "53" "Yb" "0.26" "0.14" "201" "0.27"
>> [27,] "Cr" "2690" "705" "325" "2690" "Lu" "0.043" "0.023" "172" "0.045"
>> [28,] "Co" "112" "10" "166" "111" "Hf" "0.27" "0.30" "71" "0.17"
>> [29,] "Ni" "2160" "304" "308" "2140" "Ta" "0.40" "0.51" "38" "0.23"
>> [30,] "Cu" "11" "9" "94" "9" "W(b)" "7.2" "5.2" "6" "4.0"
>> [31,] "Zn" "65" "20" "129" "60" "Re(b)" "0.13" "0.11" "18" "0.09"
>> [32,] "Ga" "2.4" "1.3" "49" "2.4" "Os(b)" "4.0" "1.8" "18" "3.7"
>> [33,] "Ge" "0.96" "0.19" "19" "0.92" "Ir(b)" "3.7" "0.9" "34" "3.0"
>> [34,] "As" "0.11" "0.07" "7" "0.10" "Pt(b)" "7" "-" "1" "-"
>> [35,] "Se" "0.041" "0.056" "18" "0.025" "Au(b)" "0.65" "0.53" "30" "0.5"
>> [36,] "Br" "0.01" "0.01" "6" "0.01" "Tl(b)" "1.2" "1.0" "13" "0.9"
>> [37,] "Rb" "1,9" "4.8" "97" "0.38" "Pb" "0.16" "0.11" "17" "0.16"
>> [38,] "Sr" "49" "60" "110" "20" "Bi(b)" "1.7" "0.7" "13" "1.6"
>> [39,] "Y" "4.4" "5.5" "86" "3.1" "Th*" "0.71" "1.2" "71" "0.22"
>> [40,] "Zr" "21" "42" "82" "8.0" "U" "0.12" "0.23" "48" "0.040"
>> [[2]][[4]]
>> [,1] [,2] [,3] [,4] [,5] [,6]
>> [1,] "" "Spinel peridotites" "" "Garnet peridotites" "" "Primitive"
>> [2,] "" "Avg. Meal." "M-A sp" "M-A gt B-M" "Jordan" "mantle"
>> [3,] "SiO 2" "44.0 44.1" "44.15" "44.99 45.00" "45.55" "44.8"
>> [4,] "TiO 2" "0.09 0.09" "0.07" "0.06 0.08" "0.11" "0.21"
>> [5,] "A1203" "2.27 2.20" "1.96" "1.40 1.31" "1.43" "4.45"
>> [6,] "Cr203" "0.39 0.39" "0.44" "0.32 0.38" "0.34" "0.43"
>> [7,] "FeOtotal" "8.43 8.19" "8.28" "7.89 6.97" "7.61" "8.40"
>> [8,] "Mn O" "0.14 0.14" "0.12" "0.11 0.13" "0.11" "0.14"
>> [9,] "MgO" "41.4 41.2" "42.25" "42.60 44.86" "43.55" "37.2"
>> [10,] "NiO" "0.27 0.27" "0.27" "0.26 0.29" "-" "0.24"
>> [11,] "CaO" "2.15 2.20" "2.08" "0.82 0.77" "1.05" "3.60"
>> [12,] "Na 20" "0.24 0.21" "0.18" "0.11 0.09" "0.14" "0.34"
>> [13,] "K 2 0" "0.054 0.028" "0.05" "0.04 0.10" "0.11" "0.028"
>> [14,] "P205" "0.056 0.030" "0.02" "- 0.01" "-" "0.022"
>> [15,] "Total" "99.49 99.05" "99.87" "98.60 100.00" "100.00" "99.86"
>> [16,] "Mg-value" "89.8 90.0" "90.1" "90.6 92.0" "91.1" "88.8"
>> [17,] "olivine" "62 63" "67" "65 68" "66" "56 57"
>> [18,] "opx" "24 24" "22" "28 25" "28" "22 17"
>> [19,] "cpx" "12 11" "9" "3 2" "3" "19 10"
>> [20,] "spinel" "2 2" "2" "- -" "-" "3 -"
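
The two tables shown here have 10 and 6 columns, so they cannot be stacked with rbind() directly. If a single data frame is preferred over a flat list, one option is to pad the narrower matrices with NA columns first. This is only a sketch under assumptions: FlatTables is a hypothetical flat list of character matrices, and the table_id column is an invented label for tracing rows back to their source table.

```r
# Hypothetical flat list of character matrices with different widths.
FlatTables <- list(
  matrix(c("SiO2", "44.0", "1.26"), nrow = 1),
  matrix(c("MgO", "CaO", "41.4", "2.15"), nrow = 2)
)

MaxCols <- max(sapply(FlatTables, ncol))

# Pad each matrix on the right with NA columns up to MaxCols,
# then tag its rows with the index of the table they came from.
Padded <- lapply(seq_along(FlatTables), function(k) {
  m <- FlatTables[[k]]
  pad <- matrix(NA_character_, nrow = nrow(m), ncol = MaxCols - ncol(m))
  data.frame(table_id = k, cbind(m, pad), stringsAsFactors = FALSE)
})

# All pieces now share the same column names, so rbind() can stack them.
AllTables <- do.call(rbind, Padded)
dim(AllTables)
```

The resulting data frame can then be written out with a single write.table() call and edited by hand.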
>>
>> Here is a portion of the output of str(MyTables):
>>
>> str(MyTables)
>>
>> List of 3
>> $ :List of 12
>> ..$ : chr [1:3, 1:2] "south of the artificial lake Lokka. Intrusive
>> complexes" "of alkaline rocks are found at Sokli (phosphorite-bear-"
>> "ing and a possible Nb-occurrence) in Finland, and at" "(Eriksson,
>> 1992). During this period, Northern Europe" ...
>> ..$ : chr [1:55, 1:15] "Element" "Ag" "Al" "Al_XRF" ...
>> ..$ : chr [1:56, 1:2] "in the till is mainly of local origin,
>> although some cob-" "bles and boulders may have been transported over
>> sev-" "eral kilometres. The moraine formations in the study" "area are
>> mostly gravelly and sandy tills, locally hum-" ...
>> ..$ : chr [1:53, 1:2] "requisites. PCA accounts for maximum variance
>> of all" "variables, while FA is based on the correlation structure"
>> "of the variables. The model of factor analysis allows that" "the
>> common factors do not explain the total variation of" ...
>> ..$ : chr [1:54, 1:7] "lished examples of the use of factor
>> analysis, it is neglec-" "ted that regional geochemical (and
>> environmental) data" "almost never follow a normal distribution.
>> Continuing Method" "with factor analysis in such a case must lead to
>> biased" ...
>> ..$ : chr [1:16, 1:2] "shows the factor loadings of the different
>> variables" "entering each factor. Names of variables with an abso-"
>> "lute value of the loadings <0.3 are not plotted. Fig. 5" "shows 8
>> results of factor analyses using a selection of all" ...
>> ..$ : chr [1:21, 1:2] "pretable results, notwithstanding the fact
>> that on the" "basis of the foregoing discussion it should probably
>> not" "be used with these data. Do these results warrant the use" "of a
>> quite work-intensive method? Unfortunately not," ...
>> ..$ : chr [1:55, 1:8] "" "Ag" "Al" "Al_XRF" ...
>> ..$ : chr [1:23, 1:2] "addition, geochemical reasoning (e.g.
>> geochemical asso-" "ciations and/or pathfinder elements for different
>> types of" "ore deposits) was used to select further sub-sets of vari-"
>> "ables. In geochemistry, the selection of elements entered" ...
>> ..$ : chr [1:55, 1:2] "Fig. 10C cuts several geological units, and
>> is most likely" "indicative of alteration processes related to a
>> deep-" "seated fault. It was revealed again in a factor analysis"
>> "carried out with all those elements extracted by aqua" ...
>> ..$ : chr [1:50, 1:2] "well justified in stating that it is not very
>> scientific to" "play with the selection of elements and number of
>> fac-" "tors extracted until one
>> ‘‘finds’’ an
>> ‘‘interesting’’ result." "On the other
>> hand, even all the different results pre-" ...
>> ..$ : chr [1:24, 1:2] "Niemelä, J., Ekman, I., Lukashov, A. (Eds.),
>> 1993. Quaternary" "Deposits of Finland and Northwestern Part of
>> Russian Fed-" "eration and Their Resources 1:1,000,000. Geological
>> Survey" "of Finland, Espoo, Finland." ...
>> $ :List of 15
>>