[R-pkg-devel] ORCID ID finder via tools::CRAN_package_db() ?
Kurt Hornik
Kurt@Horn|k @end|ng |rom wu@@c@@t
Tue Aug 20 15:47:22 CEST 2024
>>>>> Kurt Hornik writes:
The attached variant drops the URL and does unique().
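
For readers without the attachment, here is a minimal sketch of what "drop the URL, then unique()" could look like. It is not the attached orcid.R. The helper name canonicalize_orcid() is hypothetical, the toy data frame stands in for the 'a' built from the CRAN package db, and the names and iDs in it are made up. The pattern only checks the shape of the iD, not its checksum.

```r
# Hypothetical helper (not part of tools or utils): reduce an ORCID given
# either as a bare iD or as an orcid.org URL to the bare iD.
canonicalize_orcid <- function(x) {
    x <- trimws(x)
    # Drop an optional http(s)://orcid.org/ prefix.
    x <- sub("^https?://orcid\\.org/", "", x)
    # Keep only strings shaped like dddd-dddd-dddd-dddX; anything else -> NA.
    ok <- grepl("^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]$", x)
    x[!ok] <- NA_character_
    x
}

# Toy stand-in for 'a'; names and iDs are made up for illustration.
a <- data.frame(given  = c("Ada", "Ada", "Grace"),
                family = c("Example", "Example", "Sample"),
                oid    = c("https://orcid.org/0000-0001-2345-6789",
                           "0000-0001-2345-6789",
                           "0000-0002-9876-543X"))

a$oid <- canonicalize_orcid(a$oid)
# Drop entries whose ORCID could not be canonicalized, then deduplicate rows.
a <- unique(a[!is.na(a$oid), ])
```

Canonicalizing before unique() matters because the URL form and the bare form of the same iD would otherwise count as two distinct rows.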
Hmm, the ones in
head(with(a, sort_by(a, ~ family + given)), 100)
without a family look suspicious ...
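
A small sketch of how one might pull out just those suspicious rows, assuming 'a' has the character columns given, family and oid as built by the code quoted further down. The toy data is made up. Since paste(NULL, collapse = " ") yields "", a person() entry created without a family part ends up with an empty string in that column, which is what the filter tests for.

```r
# Toy stand-in for 'a'; names and iDs are made up for illustration.
a <- data.frame(given  = c("Example Consortium", "Grace", "Ada Example"),
                family = c("", "Sample", ""),
                oid    = c("0000-0003-1111-2222",
                           "0000-0002-9876-543X",
                           "0000-0001-2345-6789"))

# Rows where the family name is empty: typically a full name or an
# organization that was put entirely into 'given'.
no_family <- a[!nzchar(a$family), ]
no_family <- no_family[order(no_family$given), ]
```

Inspecting no_family separately avoids paging through the full sorted listing to find these cases.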
Best
-k
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: orcid.R
URL: <https://stat.ethz.ch/pipermail/r-package-devel/attachments/20240820/76546959/attachment.ksh>
-------------- next part --------------
>>>>> Dirk Eddelbuettel writes:
>> On 20 August 2024 at 07:57, Dirk Eddelbuettel wrote:
>> |
>> | Hi Kurt,
>> |
>> | On 20 August 2024 at 14:29, Kurt Hornik wrote:
>> | | I think for now you could use something like what I attach below.
>> | |
>> | | Not ideal: not too long ago I started adding orcidtools.R to tools,
>> | | which e.g. has .persons_from_metadata(), but that works on the unpacked
>> | | sources and not the CRAN package db. Need to think about that ...
>> |
>> | We need something like that too as I fat-fingered the string 'ORCID'. See
>> | fortunes::fortune("Dirk can type").
>> |
>> | Will try the function below later. Many thanks for sending it along.
>> Very nice. Resisted my common impulse to make it a data.table for easy
>> sorting via keys etc. After running your code the line
>> head(with(a, sort_by(a, ~ family + given)), 100)
>> shows that we need a bit more QA: person entries are not properly split
>> between 'family' and 'given', some ORCIDs are given as URLs, and we have repeats.
>> Excluding those is next.
> Right. One should canonicalize the ORCID (having the URLs is from being
> nice) and then do unique() ...
> Best
> -k
>> Dirk
>> | Dirk
>> |
>> | |
>> | | Best
>> | | -k
>> | |
>> | | ********************************************************************
>> | | x <- tools::CRAN_package_db()
>> | | a <- lapply(x[["Authors@R"]],
>> | |             function(a) {
>> | |                 if(!is.na(a)) {
>> | |                     a <- tryCatch(utils:::.read_authors_at_R_field(a),
>> | |                                   error = identity)
>> | |                     if (inherits(a, "person"))
>> | |                         return(a)
>> | |                 }
>> | |                 NULL
>> | |             })
>> | | a <- do.call(c, a)
>> | | a <- lapply(a,
>> | |             function(e) {
>> | |                 if(is.null(o <- e$comment["ORCID"]) || is.na(o))
>> | |                     return(NULL)
>> | |                 cbind(given = paste(e$given, collapse = " "),
>> | |                       family = paste(e$family, collapse = " "),
>> | |                       oid = unname(o))
>> | |             })
>> | | a <- as.data.frame(do.call(rbind, a))
>> | | ********************************************************************
>> | |
>> | | > Salut Thierry,
>> | |
>> | | > On 20 August 2024 at 13:43, Thierry Onkelinx wrote:
>> | | > | Happy to help. I'm working on a new version of the checklist package. I could
>> | | > | export the function if that makes it easier for you.
>> | |
>> | | > Would be happy to help / iterate. Can you take a stab at making the
>> | | > per-column split more robust so that we can bulk-process all non-NA entries
>> | | > of the returned db?
>> | |
>> | | > Best, Dirk
>> | |
>> | | > --
>> | | > dirk.eddelbuettel.com | @eddelbuettel | edd using debian.org
>> |
>> | --
>> | dirk.eddelbuettel.com | @eddelbuettel | edd using debian.org
>> --
>> dirk.eddelbuettel.com | @eddelbuettel | edd using debian.org