[R] textual analysis - transforming several pdf to txt - naming the files
Rui Barradas
ru|pb@rr@d@@ @end|ng |rom @@po@pt
Wed Jul 5 17:43:19 CEST 2023
At 11:12 on 05/07/2023, Cecília Carmo wrote:
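> library(magrittr)  # provides the %>% pipe used below
>
> convertpdf2txt <- function(dirpath){
>
>   # full paths of the PDF files whose names contain "Consoli"
>   files <- list.files(dirpath, pattern = "Consoli.*\\.pdf$", full.names = TRUE)
>   files <- chartr("\\", "/", files)
>
>   # read each PDF and collapse its pages into one whitespace-normalized string
>   x <- lapply(files, function(x){
>     pdftools::pdf_text(x) %>%
>       paste0(collapse = " ") %>%
>       stringr::str_squish()
>   })
>   # name each element after its file, with the extension changed to .txt
>   new_names <- tools::file_path_sans_ext(files)
>   new_names <- paste(new_names, "txt", sep = ".")
>   setNames(x, new_names)
> }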
>
> # apply function
> # note that my test files are in "~/Temp"
> txts <- convertpdf2txt(here::here("~", "Temp"))
> names(txts)
>
>
> Thank you very much, but the following error appeared:
>
> Error: unexpected '}' in "}"
>
> Cecília Carmo
>
> Universidade de Aveiro
Hello,
I had tested the code with a couple of PDFs and it ran with no errors
or warnings.
That error says that a "}" is unbalanced, but in my code the braces are
all balanced; RStudio checks this automatically.
Can you check your copy of the code in an editor with syntax highlighting?
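
If no such editor is at hand, R itself can report where the parser stops. The following is only a minimal sketch, not part of the code posted earlier in this thread: check_syntax() is a hypothetical helper name, and the two test strings are made-up examples. It relies on base R's parse(), which raises an error on an unbalanced "}".

```r
# Hypothetical helper (not from the original code): try to parse a piece
# of R code and return "OK" or the parser's error message.
check_syntax <- function(code) {
  tryCatch({
    parse(text = code, keep.source = FALSE)
    "OK"
  }, error = function(e) conditionMessage(e))
}

# Made-up examples: the second string has one closing brace too many.
balanced   <- "f <- function(x) {\n  x + 1\n}"
unbalanced <- "f <- function(x) {\n  x + 1\n}\n}"

check_syntax(balanced)    # "OK"
check_syntax(unbalanced)  # the parser's message, with line and column
```

For code saved in a script, parse(file = "myscript.R") performs the same check on the whole file (the file name here is just a placeholder). The line and column in the message point to where the parser gave up, so the real mistake may sit a little earlier in the code.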
Hope this helps,
Rui Barradas