[R-pkg-devel] ORCID ID finder via tools::CRAN_package_db() ?
Ben Bolker
bbo|ker @end|ng |rom gm@||@com
Tue Aug 20 15:58:49 CEST 2024
Looking into one particular example,
https://github.com/seabbs/idmodelr/blob/master/DESCRIPTION
this appears to be the authors' fault:
Authors@R: c(
    person(given = "Sam Abbott",
           role = c("aut", "cre"),
           email = "contact using samabbott.co.uk",
           comment = c(ORCID = "0000-0001-8057-8037")),
    person(given = "Akira Endo",
           role = c("aut"),
           email = "akira.endo using lshtm.ac.uk",
           comment = c(ORCID = "0000-0001-6377-7296")))
Maybe CRAN should start checking for missing 'family' fields in
Authors@R ... ???
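
A minimal sketch of what such a check might look like, assuming the field is parsed the same way as in Kurt's code below. The helper name missing_family() is hypothetical (it is not part of 'tools' or 'utils'), and the example field is a trimmed-down illustration rather than a real DESCRIPTION:

```r
# Hypothetical helper: given one Authors@R string, return the indices of
# person() entries that have no 'family' component.
missing_family <- function(field) {
    p <- tryCatch(utils:::.read_authors_at_R_field(field),
                  error = identity)
    if (!inherits(p, "person"))
        return(NA_integer_)  # field could not be parsed
    # p[i] is a length-one person; its $family is NULL when only
    # 'given' was supplied.
    which(vapply(seq_along(p),
                 function(i) is.null(p[i]$family),
                 logical(1)))
}

# Illustrative field: the first entry puts the full name in 'given'.
field <- 'c(person(given = "Sam Abbott", role = "aut",
                   comment = c(ORCID = "0000-0001-8057-8037")),
            person(given = "Ben", family = "Bolker", role = "ctb"))'
missing_family(field)
```

Something like this could be run over every non-NA entry of the Authors@R column to list the packages affected.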
cheers
Ben Bolker
On 2024-08-20 9:47 a.m., Kurt Hornik wrote:
>>>>>> Kurt Hornik writes:
>
> The variant attached drops the URL and does unique().
>
> Hmm, the ones in
>
> head(with(a, sort_by(a, ~ family + given)), 100)
>
> without a family look suspicious ...
>
> Best
> -k
>
>
>
>
>>>>>> Dirk Eddelbuettel writes:
>>> On 20 August 2024 at 07:57, Dirk Eddelbuettel wrote:
>>> |
>>> | Hi Kurt,
>>> |
>>> | On 20 August 2024 at 14:29, Kurt Hornik wrote:
>>> | | I think for now you could use something like what I attach below.
>>> | |
>>> | | Not ideal: I had not too long ago started adding orcidtools.R to tools,
>>> | | which e.g. has .persons_from_metadata(), but that works on the unpacked
>>> | | sources and not the CRAN package db. Need to think about that ...
>>> |
>>> | We need something like that too as I fat-fingered the string 'ORCID'. See
>>> | fortunes::fortune("Dirk can type").
>>> |
>>> | Will try the function below later. Many thanks for sending it along.
>
>>> Very nice. Resisted my common impulse to make it a data.table for easy
>>> sorting via keys etc. After running your code the line
>
>>> head(with(a, sort_by(a, ~ family + given)), 100)
>
>>> shows that we need a bit more QA: person entries are not properly split
>>> between 'family' and 'given', some ORCIDs are given as URLs, and we have
>>> repeats. Excluding those is next.
>
>> Right. One should canonicalize the ORCID (having the URLs is from being
>> nice) and then do unique() ...
>
>> Best
>> -k
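
A minimal sketch of that canonicalize-then-unique() step, assuming a data frame shaped like the 'a' built by the code further down (columns given, family, oid). The helper canonical_orcid() is hypothetical, the regular expression is just the usual four-groups-of-four ORCID iD pattern with an optional final 'X', and the URL-form row is made up for illustration:

```r
# Hypothetical helper: pull the bare ORCID iD out of either the plain
# form or a URL form; entries with no recognizable iD become NA.
canonical_orcid <- function(x) {
    m <- regexpr("[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]", x)
    ok <- !is.na(m) & m > 0
    out <- rep(NA_character_, length(x))
    out[ok] <- regmatches(x, m)
    out
}

# Illustrative input: the same person listed twice, once with a URL.
a <- data.frame(
    given  = c("Sam Abbott", "Sam Abbott", "Akira Endo"),
    family = c("", "", ""),
    oid    = c("0000-0001-8057-8037",
               "https://orcid.org/0000-0001-8057-8037",
               "0000-0001-6377-7296"))

a$oid <- canonical_orcid(a$oid)
a <- unique(a)  # drops the row that is now an exact repeat
a
```

Rows that differ only in how the name was split between 'given' and 'family' would still survive unique(), so this only handles the URL repeats.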
>
>
>
>>> Dirk
>
>>> | Dirk
>>> |
>>> | |
>>> | | Best
>>> | | -k
>>> | |
>>> | | ********************************************************************
>>> | | x <- tools::CRAN_package_db()
>>> | | a <- lapply(x[["Authors@R"]],
>>> | |             function(a) {
>>> | |                 if(!is.na(a)) {
>>> | |                     a <- tryCatch(utils:::.read_authors_at_R_field(a),
>>> | |                                   error = identity)
>>> | |                     if (inherits(a, "person"))
>>> | |                         return(a)
>>> | |                 }
>>> | |                 NULL
>>> | |             })
>>> | | a <- do.call(c, a)
>>> | | a <- lapply(a,
>>> | |             function(e) {
>>> | |                 if(is.null(o <- e$comment["ORCID"]) || is.na(o))
>>> | |                     return(NULL)
>>> | |                 cbind(given = paste(e$given, collapse = " "),
>>> | |                       family = paste(e$family, collapse = " "),
>>> | |                       oid = unname(o))
>>> | |             })
>>> | | a <- as.data.frame(do.call(rbind, a))
>>> | | ********************************************************************
>>> | |
>>> | | > Salut Thierry,
>>> | |
>>> | | > On 20 August 2024 at 13:43, Thierry Onkelinx wrote:
>>> | | > | Happy to help. I'm working on a new version of the checklist package. I could
>>> | | > | export the function if that makes it easier for you.
>>> | |
>>> | | > Would be happy to help / iterate. Can you take a stab at making the
>>> | | > per-column split more robust so that we can bulk-process all non-NA entries
>>> | | > of the returned db?
>>> | |
>>> | | > Best, Dirk
>>> | |
>>> | | > --
>>> | | > dirk.eddelbuettel.com | @eddelbuettel | edd using debian.org
>>> |
>>> | --
>>> | dirk.eddelbuettel.com | @eddelbuettel | edd using debian.org
>
>>> --
>>> dirk.eddelbuettel.com | @eddelbuettel | edd using debian.org
>>>
>>> ______________________________________________
>>> R-package-devel using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-package-devel
--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
> E-mail is sent at my convenience; I don't expect replies outside of
working hours.