[R] textual analysis - transforming several pdf to txt - naming the files
Cecília Carmo
cec|||@@c@rmo @end|ng |rom u@@pt
Wed Jul 5 12:12:12 CEST 2023
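library(magrittr)  # provides the %>% pipe used below

convertpdf2txt <- function(dirpath){
  # full paths of the PDF files whose names contain "Consoli"
  files <- list.files(dirpath, pattern = "Consoli.*\\.pdf$", full.names = TRUE)
  # use forward slashes on every platform
  files <- chartr("\\", "/", files)
  # read each PDF and collapse its pages into one whitespace-normalised string
  x <- lapply(files, function(x){
    pdftools::pdf_text(x) %>%
      paste0(collapse = " ") %>%
      stringr::str_squish()
  })
  # name each element after its source file, with a .txt extension
  new_names <- tools::file_path_sans_ext(files)
  new_names <- paste(new_names, "txt", sep = ".")
  setNames(x, new_names)
}
# apply function
# note that my test files are in "~/Temp"
txts <- convertpdf2txt(path.expand("~/Temp"))
names(txts)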
Thank you very much, but the following error appeared:
Error: unexpected '}' in "}"
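A parse error of this form generally means that R met a closing brace while no block was open. When a function is pasted into the console, this is often a follow-on from an earlier problem in the same paste, for example a lost opening line, so that the final "}" arrives on its own. The following is only a minimal sketch of one way to check this, not a diagnosis of the actual cause here. The helper name parses_ok and the two sample strings are hypothetical; the idea is to test whether a piece of code parses as a whole before running it.

```r
# Hypothetical helper (not part of the original post): returns TRUE when
# `code` is syntactically complete R code, FALSE when parse() fails.
parses_ok <- function(code) {
  tryCatch({
    parse(text = code)
    TRUE
  }, error = function(e) FALSE)
}

# A complete function definition: braces are balanced, so it parses.
complete <- "f <- function(x) {\n  x + 1\n}"

# The same body with the opening line lost: the closing brace is stray.
stray <- "x + 1\n}"

parses_ok(complete)
parses_ok(stray)
```

If a check like this fails for the whole function, saving the code in a script file and running it with source() instead of pasting may make the first real error easier to spot, since source() reports the line at which parsing stopped.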
Cecília Carmo
Universidade de Aveiro