[R] purrr::map and xml2::read_xml
Ulrik Stervbo
ulrik.stervbo at gmail.com
Fri Jan 6 17:25:22 CET 2017
Hi Maicel,
I'm guessing that B works on 50 files, and that A fails because there is no
function called 'read_xmlmap'. If the function that you map works well,
removing 'dplyr::sample_n(50)' from 'B' should solve the problem.
If that is not the case, we need a bit more information.
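For illustration, here is a minimal sketch of B without the sampling step.
I don't have your data, so the two XML files and the contents of 'muestra'
below are made up; only the pipeline itself is taken from your code. It also
assumes that 'path' is a character column and not a factor.

```r
library(dplyr)
library(purrr)
library(xml2)

# Hypothetical stand-in for the real papers: two tiny XML files in a temp dir
tmp <- tempfile("papers")
dir.create(tmp)
paths <- file.path(tmp, c("a.xml", "b.xml"))
writeLines("<article><kwd-group><kwd> asthma </kwd><kwd>children</kwd></kwd-group></article>",
           paths[1])
writeLines("<article><kwd-group><kwd>diabetes</kwd></kwd-group></article>",
           paths[2])

# Same shape as described in the question: an id column and a path column
muestra <- data.frame(id = 1:2, path = paths, stringsAsFactors = FALSE)

# Code B with the sample_n(50) step removed, so every path is read
keyword <-
  muestra %>%
  select(path) %>%
  unlist() %>%                      # data frame column -> character vector
  map(.f = function(x) {
    read_xml(x) %>%                 # parse one file
      xml_find_all(".//kwd") %>%    # all <kwd> nodes in that file
      xml_text(trim = TRUE)         # their text, whitespace trimmed
  })
```

The result is a list with one character vector of keywords per file, in the
same order as the rows of 'muestra'.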
HTH
Ulrik
On Fri, 6 Jan 2017 at 17:08 <maicel at infomed.sld.cu> wrote:
> Hi List, I am trying to extract the keywords from 1403 papers in XML
> format. I wrote the code below, but it does not work; it only runs
> with the modification shown in B. But that variation is not the one
> I need, because the files it reads do not match the 1403 XML files in
> my folder. Could you please tell me where the mistakes are in the code
> (A or B) so that I can correct them? The data frame columns are an id
> and the paths.
>
> A-Does not work, but it is the one I need.
>
> keyword <-
> muestra %>%
> select(path) %>%
> read_xmlmap(.f = function(x) { read_xml(x) %>%
> xml_find_all( ".//kwd") %>%
> xml_text(trim=T) })
>
> B-It works but only with a small number of papers.
>
> keyword <-
> muestra %>%
> select(path) %>%
> dplyr::sample_n(50) %>%
> unlist() %>%
> map(.f = function(x) { read_xml(x) %>%
> xml_find_all( ".//kwd") %>%
> xml_text(trim=T) })
>
> Thank you,
> Maicel Monzon MD, PHD