[R-pkg-devel] ORCID ID finder via tools::CRAN_package_db() ?

Ben Bolker bbo|ker @end|ng |rom gm@||@com
Tue Aug 20 15:58:49 CEST 2024


  Looking into one particular example,

https://github.com/seabbs/idmodelr/blob/master/DESCRIPTION

this appears to be the authors' fault (the full name is put in 'given' 
and there is no 'family' field):

Authors@R: c(
     person(given = "Sam Abbott",
            role = c("aut", "cre"),
            email = "contact using samabbott.co.uk",
            comment = c(ORCID = "0000-0001-8057-8037")),
     person(given = "Akira Endo",
            role = c("aut"),
            email = "akira.endo using lshtm.ac.uk",
            comment = c(ORCID = "0000-0001-6377-7296")))

   Maybe CRAN should start checking for missing 'family' fields in 
Authors@R ... ???
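
   A rough sketch of what such a check could look like. This is not an 
existing CRAN check; the helper name missing_family() is made up for 
illustration, and evaluating the field with eval(parse()) is a 
simplification that is only reasonable for trusted input. The second 
entry in the example is deliberately split into given/family for contrast.

```r
# Hypothetical helper (not part of 'tools' or of any CRAN check):
# return the person() entries of an Authors@R field that have no 'family'.
missing_family <- function(authors_at_r) {
    # Authors@R holds R code; evaluating it yields a "person" object.
    # Assumption: the input is trusted, since eval() runs arbitrary code.
    p <- eval(parse(text = authors_at_r))
    stopifnot(inherits(p, "person"))
    # unclass() exposes the underlying list of per-person lists.
    no_family <- vapply(unclass(p),
                        function(e) !any(nzchar(e$family)),
                        logical(1))
    p[no_family]
}

# Illustrative field: first entry mimics the pattern above (whole name
# in 'given'), second entry has a separate 'family'.
field <- 'c(person(given = "Sam Abbott", role = c("aut", "cre"),
                   comment = c(ORCID = "0000-0001-8057-8037")),
            person(given = "Akira", family = "Endo", role = "aut"))'
bad <- missing_family(field)
print(bad$given)
```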

   cheers
    Ben Bolker

On 2024-08-20 9:47 a.m., Kurt Hornik wrote:
>>>>>> Kurt Hornik writes:
> 
> The variant attached drops the URL and does unique().
> 
> Hmm, the ones in
> 
>    head(with(a, sort_by(a, ~ family + given)), 100)
> 
> without a family look suspicious ...
> 
> Best
> -k
> 
> 
> 
> 
>>>>>> Dirk Eddelbuettel writes:
>>> On 20 August 2024 at 07:57, Dirk Eddelbuettel wrote:
>>> |
>>> | Hi Kurt,
>>> |
>>> | On 20 August 2024 at 14:29, Kurt Hornik wrote:
>>> | | I think for now you could use something like what I attach below.
>>> | |
>>> | | Not ideal: I had not too long ago started adding orcidtools.R to tools,
>>> | | which e.g. has .persons_from_metadata(), but that works on the unpacked
>>> | | sources and not the CRAN package db.  Need to think about that ...
>>> |
>>> | We need something like that too as I fat-fingered the string 'ORCID'. See
>>> | fortunes::fortune("Dirk can type").
>>> |
>>> | Will try the function below later. Many thanks for sending it along.
> 
>>> Very nice. Resisted my common impulse to make it a data.table for easy
>>> sorting via keys etc.  After running your code the line
> 
>>> head(with(a, sort_by(a, ~ family + given)), 100)
> 
>>> shows that we need a bit more QA: some person entries are not properly
>>> split between 'family' and 'given', some give the ORCID as a URL, and we
>>> have repeats. Excluding those is next.
> 
>> Right.  One should canonicalize the ORCID (having the URLs is from being
>> nice) and then do unique() ...
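
   One way the canonicalize-then-unique() step might look, as a sketch only. 
The helper name canonicalize_orcid() and the toy rows are made up for 
illustration (the rows reuse the two iDs quoted in this thread); this is 
not the variant Kurt attached.

```r
# Hypothetical helper: reduce an ORCID given as a bare iD or as a URL to
# the bare iD (four groups of four characters; the last may be "X").
canonicalize_orcid <- function(x) {
    id <- "[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]"
    # Keep only the iD, dropping any URL prefix or trailing characters.
    out <- sub(paste0("^.*?(", id, ").*$"), "\\1", trimws(x), perl = TRUE)
    # Anything that still does not look like an iD becomes NA.
    out[!grepl(paste0("^", id, "$"), out)] <- NA_character_
    out
}

# Toy data in the shape of the data frame 'a' built in this thread.
a <- data.frame(
    given  = c("Sam", "Sam", "Akira"),
    family = c("Abbott", "Abbott", "Endo"),
    oid    = c("0000-0001-8057-8037",
               "https://orcid.org/0000-0001-8057-8037",
               "0000-0001-6377-7296"))
a$oid <- canonicalize_orcid(a$oid)
a <- unique(a)   # the URL and bare-iD rows now collapse into one
print(a)
```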
> 
>> Best
>> -k
> 
> 
> 
>>> Dirk
>   
>>> | Dirk
>>> |
>>> | |
>>> | | Best
>>> | | -k
>>> | |
>>> | | ********************************************************************
>>> | | x <- tools::CRAN_package_db()
>>> | | a <- lapply(x[["Authors@R"]],
>>> | |             function(a) {
>>> | |                 if(!is.na(a)) {
>>> | |                     a <- tryCatch(utils:::.read_authors_at_R_field(a),
>>> | |                                   error = identity)
>>> | |                     if (inherits(a, "person"))
>>> | |                         return(a)
>>> | |                 }
>>> | |                 NULL
>>> | |             })
>>> | | a <- do.call(c, a)
>>> | | a <- lapply(a,
>>> | |             function(e) {
>>> | |                 if(is.null(o <- e$comment["ORCID"]) || is.na(o))
>>> | |                     return(NULL)
>>> | |                 cbind(given = paste(e$given, collapse = " "),
>>> | |                       family = paste(e$family, collapse = " "),
>>> | |                       oid = unname(o))
>>> | |             })
>>> | | a <- as.data.frame(do.call(rbind, a))
>>> | | ********************************************************************
>>> | |
>>> | | > Salut Thierry,
>>> | |
>>> | | > On 20 August 2024 at 13:43, Thierry Onkelinx wrote:
>>> | | > | Happy to help. I'm working on a new version of the checklist package. I could
>>> | | > | export the function if that makes it easier for you.
>>> | |
>>> | | > Would be happy to help / iterate. Can you take a stab at making the
>>> | | > per-column split more robust so that we can bulk-process all non-NA entries
>>> | | > of the returned db?
>>> | |
>>> | | > Best, Dirk
>>> | |
>>> | | > --
>>> | | > dirk.eddelbuettel.com | @eddelbuettel | edd using debian.org
>>>
>>> ______________________________________________
>>> R-package-devel using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-package-devel

-- 
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
 > E-mail is sent at my convenience; I don't expect replies outside of 
working hours.


