[R-sig-Geo] Extract CRU data

Grzegorz Sapijaszko  grzegorz at sapijaszko.net
Tue Jan 24 18:29:11 CET 2023


On Tue, 2023-01-24 at 12:13 +0100, Miluji Sb wrote:
> Greetings everyone,
> 
> I have a question on extracting country-level data from CRU (
> https://crudata.uea.ac.uk/cru/data/hrg/cru_ts_4.06/crucy.2205251923.v4.06/countries/tmp/
> ).


Something like:

To get all links/filenames in one table:

a <- rvest::read_html("https://crudata.uea.ac.uk/cru/data/hrg/cru_ts_4.06/crucy.2205251923.v4.06/countries/tmp/")

tbl <- a |>
  rvest::html_table() |>
  as.data.frame()

tbl <- tbl[-c(1, 2), ]  # drop the two non-file rows at the top of the index
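As an alternative to parsing the rendered index table, one could pull the file names straight from the `<a>` href attributes. A sketch, assuming the rvest package is installed and the server is reachable (the `^crucy` filter and the sort-link hrefs are assumptions about the directory listing):

```r
# Hypothetical helper: collect the data-file names from an Apache-style
# directory index by reading the href attribute of every link.
base_url <- "https://crudata.uea.ac.uk/cru/data/hrg/cru_ts_4.06/crucy.2205251923.v4.06/countries/tmp/"

get_file_names <- function(url) {
  hrefs <- rvest::read_html(url) |>
    rvest::html_elements("a") |>
    rvest::html_attr("href")
  # Keep only the data files; drop parent-directory and column-sort links.
  hrefs[grepl("^crucy", hrefs)]
}

# files <- get_file_names(base_url)
```

This avoids the row-dropping step, since non-file links are filtered by name instead of by position.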

To download them all to a specific directory:

my_download_function <- function(myurl = "", output_dir = "data") {
  if (!dir.exists(output_dir)) dir.create(output_dir)
  destfile <- file.path(output_dir, myurl)
  url <- paste0("https://crudata.uea.ac.uk/cru/data/hrg/cru_ts_4.06/crucy.2205251923.v4.06/countries/tmp/", myurl)
  # method = "wget" requires wget on the system; "-c" resumes partial downloads
  download.file(url = url, destfile = destfile, method = "wget",
                extra = "-c --progress=bar:force")
  invisible(NULL)
}

invisible(lapply(seq_len(nrow(tbl)), function(i)
  my_download_function(tbl[i, 1], "data")))
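If wget is not available, a base-R-only variant is possible; a sketch that skips files already on disk and continues past individual failures (the function name and its resume-by-skipping behaviour are my additions, not from the original post):

```r
# Hypothetical base-R downloader: no external wget dependency.
base_url <- "https://crudata.uea.ac.uk/cru/data/hrg/cru_ts_4.06/crucy.2205251923.v4.06/countries/tmp/"

download_all <- function(filenames, output_dir = "data") {
  if (!dir.exists(output_dir)) dir.create(output_dir)
  for (f in filenames) {
    dest <- file.path(output_dir, f)
    if (file.exists(dest)) next  # crude resume: skip files already fetched
    tryCatch(
      download.file(paste0(base_url, f), destfile = dest, mode = "wb"),
      error = function(e) message("failed: ", f)
    )
  }
}

# download_all(tbl[[1]])
```

Restarting the script after an interruption then only fetches the missing files.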

Now, having them locally, you can read them one by one with read.csv,
like:

f <- list.files(path = "data", pattern = "^crucy", full.names = TRUE)
read.csv(f[1], skip = 3, header = TRUE)
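To fold all files into one data frame with the country taken from the file name, a sketch; the file-name layout (country as the third-from-last dot-separated field) and the 3-line header are assumptions to verify against an actual file first:

```r
# Sketch: read every downloaded file and tag each row with the country
# name recovered from the file name.
f <- list.files(path = "data", pattern = "^crucy", full.names = TRUE)

country_from_file <- function(path) {
  # e.g. "crucy.v4.06.1901.2021.Albania.tmp.per" -> "Albania" (assumed layout)
  parts <- strsplit(basename(path), ".", fixed = TRUE)[[1]]
  parts[length(parts) - 2]
}

all_tmp <- do.call(rbind, lapply(f, function(p) {
  d <- read.csv(p, skip = 3, header = TRUE)
  d$country <- country_from_file(p)
  d
}))
```

With the country column attached, the per-country series can be compared or reshaped directly.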

It doesn't make much sense without adding information about the
country/territory each file covers, but at least you have a starting point.

Regards,
Grzegorz


