[R-pkg-devel] ORCID ID finder via tools::CRAN_package_db() ?

Kurt Hornik
Tue Aug 20 15:43:04 CEST 2024


>>>>> Dirk Eddelbuettel writes:

> On 20 August 2024 at 07:57, Dirk Eddelbuettel wrote:
> | 
> | Hi Kurt,
> | 
> | On 20 August 2024 at 14:29, Kurt Hornik wrote:
> | | I think for now you could use something like what I attach below.
> | | 
> | | Not ideal: I had not too long ago started adding orcidtools.R to tools,
> | | which e.g. has .persons_from_metadata(), but that works on the unpacked
> | | sources and not the CRAN package db.  Need to think about that ...
> | 
> | We need something like that too as I fat-fingered the string 'ORCID'. See
> | fortunes::fortune("Dirk can type").
> | 
> | Will try the function below later. Many thanks for sending it along.

> Very nice. Resisted my common impulse to make it a data.table for easy
> sorting via keys etc.  After running your code the line

>    head(with(a, sort_by(a, ~ family + given)), 100)

> shows that we need a bit more QA: person entries are not properly split
> between 'family' and 'given', some ORCID entries use the URL form, and we
> have repeats.  Excluding those is next.

Right.  One should canonicalize the ORCID (having the URLs is from being
nice) and then do unique() ...
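
A minimal sketch of that step, assuming 'a' is the data frame built by the
code quoted below (columns given, family, oid). The helper name
canonicalize_orcid() and the toy rows are made up for illustration and are
not part of tools; the first iD is ORCID's documented example iD, the second
is a dummy.

```r
# Toy stand-in for the data frame 'a' built by the quoted code.
# Names and the second iD are illustrative only.
a <- data.frame(
    given  = c("Josiah", "Josiah", "Jane"),
    family = c("Carberry", "Carberry", "Doe"),
    oid    = c("https://orcid.org/0000-0002-1825-0097",
               "0000-0002-1825-0097",
               "<https://orcid.org/0000-0000-0000-000X>")
)

# Hypothetical helper: keep only the bare identifier (four hyphen-separated
# groups of four characters; the final character may be the check digit X).
# Entries without a recognizable iD become NA.
canonicalize_orcid <- function(x) {
    x <- as.character(x)
    pat <- "[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]"
    m <- regexpr(pat, x)
    ok <- !is.na(m) & m > 0
    out <- rep(NA_character_, length(x))
    out[ok] <- regmatches(x, m)
    out
}

a$oid <- canonicalize_orcid(a$oid)
# Drop entries without a usable iD, then remove exact repeats.
a <- unique(a[!is.na(a$oid), ])
```

Note that unique() only drops rows that agree in all three columns, so the
same iD with differently spelled names would survive; something like
a[!duplicated(a$oid), ] may be closer to what is wanted in that case.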

Best
-k



> Dirk
 
> | Dirk
> | 
> | | 
> | | Best
> | | -k
> | | 
> | | ********************************************************************
> | | x <- tools::CRAN_package_db()
> | | a <- lapply(x[["Authors@R"]],
> | |             function(a) {
> | |                 if(!is.na(a)) {
> | |                     a <- tryCatch(utils:::.read_authors_at_R_field(a), 
> | |                                   error = identity)
> | |                     if (inherits(a, "person")) 
> | |                         return(a)
> | |                 }
> | |                 NULL
> | |             })
> | | a <- do.call(c, a)
> | | a <- lapply(a,
> | |             function(e) {
> | |                 if(is.null(o <- e$comment["ORCID"]) || is.na(o))
> | |                     return(NULL)
> | |                 cbind(given = paste(e$given, collapse = " "),
> | |                       family = paste(e$family, collapse = " "),
> | |                       oid = unname(o))
> | |             })
> | | a <- as.data.frame(do.call(rbind, a))
> | | ********************************************************************
> | | 
> | | > Salut Thierry,
> | | 
> | | > On 20 August 2024 at 13:43, Thierry Onkelinx wrote:
> | | > | Happy to help. I'm working on a new version of the checklist package. I could
> | | > | export the function if that makes it easier for you.
> | | 
> | | > Would be happy to help / iterate. Can you take a stab at making the
> | | > per-column split more robust so that we can bulk-process all non-NA entries
> | | > of the returned db?
> | | 
> | | > Best, Dirk
> | | 
> | | > -- 
> | | > dirk.eddelbuettel.com | @eddelbuettel | edd@debian.org



