Get started with authoritative

library(authoritative)

This package has two main categories of functionality: extracting R package author information, and cleaning and deduplicating author names.

Extracting R package author information

R package authors can be specified in two ways in the DESCRIPTION file.

Extracting R package author information from the Author field

The authors can be listed in the Author field, as free text. However, this method is now actively discouraged by CRAN and by many R user communities, such as rOpenSci.

For example, a DESCRIPTION file could contain:

Author: Ada Lovelace and Charles Babbage

Because this is free text, it can be formatted in many different ways, which makes it hard to extract the names programmatically. The parse_authors() function provided by the package is designed to split the text at common delimiters and remove common extra words.

parse_authors("Ada Lovelace and Charles Babbage")
#> [1] "Ada Lovelace"    "Charles Babbage"
parse_authors("Ada Lovelace, Charles Babbage")
#> [1] "Ada Lovelace"    "Charles Babbage"
parse_authors("Ada Lovelace with contributions from Charles Babbage")
#> [1] "Ada Lovelace"    "Charles Babbage"
parse_authors("Ada Lovelace, Charles Babbage, et al.")
#> [1] "Ada Lovelace"    "Charles Babbage"
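
To give an intuition for what splitting at delimiters and removing extra words involves, here is a minimal base R sketch. The helper split_authors_sketch() is hypothetical and is not the implementation of parse_authors(); the delimiters it handles are assumptions chosen to cover the examples above.

```r
# Hypothetical helper for illustration only; NOT how parse_authors() is implemented.
split_authors_sketch <- function(x) {
  # Drop a trailing "et al." together with the separator in front of it
  x <- sub("[,;]?\\s*et al\\.?\\s*$", "", x, perl = TRUE)
  # Split on commas, semicolons, "&", the word "and",
  # or the phrase "with contributions from"
  delimiters <- "\\s*(,|;|&|\\band\\b|\\bwith contributions from\\b)\\s*"
  parts <- strsplit(x, delimiters, perl = TRUE)[[1]]
  # Trim whitespace and discard empty pieces (e.g. from ", and")
  parts <- trimws(parts)
  parts[nzchar(parts)]
}

split_authors_sketch("Ada Lovelace, Charles Babbage, et al.")
#> [1] "Ada Lovelace"    "Charles Babbage"
```

Real Author fields are far messier than this (roles in brackets, affiliations, ORCID links), which is why a dedicated function is preferable to an ad hoc regular expression.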

Extracting R package author information from the Authors@R field

The authors can also be listed in the Authors@R field, as a string containing R code that generates a vector of person objects. This is the modern, recommended way to specify authors in the DESCRIPTION file.

Authors@R: c(
  person("Ada Lovelace", role = c("aut", "cre"), email = "ada@email.com"),
  person("Charles Babbage", role = "aut")
)

The parse_authors_r() function turns this string into a vector of person objects:

auts <- parse_authors_r("c(
  person('Ada Lovelace', role = c('aut', 'cre'), email = 'ada@email.com'),
  person('Charles Babbage', role = 'aut')
)")

class(auts)
#> [1] "person"

str(auts)
#> List of 2
#>  $ :Class 'person'  hidden list of 1
#>   ..$ :List of 4
#>   .. ..$ given : chr "Ada Lovelace"
#>   .. ..$ family: NULL
#>   .. ..$ role  : chr [1:2] "aut" "cre"
#>   .. ..$ email : chr "ada@email.com"
#>  $ :Class 'person'  hidden list of 1
#>   ..$ :List of 4
#>   .. ..$ given : chr "Charles Babbage"
#>   .. ..$ family: NULL
#>   .. ..$ role  : chr "aut"
#>   .. ..$ email : NULL
#>  - attr(*, "class")= chr "person"

print(auts)
#> [1] "Ada Lovelace <ada@email.com> [aut, cre]"
#> [2] "Charles Babbage [aut]"

If we only want the names, we can use the format() method for person objects from base R:

format(auts, include = c("given", "family"))
#> [1] "Ada Lovelace"    "Charles Babbage"
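
Because the Authors@R field is plain R code, the idea can also be illustrated with base R alone. The sketch below writes a made-up DESCRIPTION file to a temporary location, reads the field with read.dcf(), and evaluates it. The file contents and the package name examplepkg are assumptions for illustration, and this is not a description of how parse_authors_r() works internally.

```r
# A made-up DESCRIPTION file, written to a temporary location for illustration
desc_file <- tempfile()
writeLines(c(
  "Package: examplepkg",
  "Authors@R: c(",
  "    person('Ada', 'Lovelace', role = c('aut', 'cre'), email = 'ada@email.com'),",
  "    person('Charles', 'Babbage', role = 'aut')",
  "  )"
), desc_file)

# read.dcf() returns a character matrix with one column per requested field
authors_code <- read.dcf(desc_file, fields = "Authors@R")[1, "Authors@R"]

# The field is R code, so evaluating it yields a vector of person objects.
# Only evaluate code from DESCRIPTION files you trust.
auts <- eval(parse(text = authors_code))

class(auts)
#> [1] "person"

format(auts, include = c("given", "family"))
#> [1] "Ada Lovelace"    "Charles Babbage"
```

Evaluating arbitrary code from a file is risky in general, so this approach is only suitable for files whose origin you control.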

Cleaning and deduplicating author names

We can now take the list of authors extracted from the previous step, or an independently gathered list of names, and clean and deduplicate it.

Harmonize differently abbreviated names

The expand_names() function can be used to expand differently abbreviated names to a common form, passed in the expanded argument:

expand_names(c("Ada Lovelace", "A Lovelace"), expanded = "Ada Lovelace")
#> [1] "Ada Lovelace" "Ada Lovelace"

However, a common pattern is to pass the vector being cleaned as the expanded argument as well. This way, you can harmonize names to the longest form present in the vector, even if you do not know the full names of all authors in advance:

my_names <- c("Ada Lovelace", "A Lovelace", "Charles Babbage")
expand_names(my_names, expanded = my_names)
#> [1] "Ada Lovelace"    "Ada Lovelace"    "Charles Babbage"
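
One way to picture this harmonization is as a lookup keyed on the first initial and the last word of each name, preferring the longest candidate. The base R sketch below follows that assumption. The helper expand_names_sketch() is hypothetical and is not the algorithm used by expand_names().

```r
# Hypothetical helper for illustration only; NOT how expand_names() is implemented.
expand_names_sketch <- function(short, expanded) {
  # Build a matching key from the first initial and the last word of each name
  name_key <- function(x) {
    words <- strsplit(x, "\\s+")
    vapply(words, function(w) {
      paste(toupper(substr(w[1], 1, 1)), tolower(w[length(w)]))
    }, character(1))
  }
  # Put the longest candidates first so that match() prefers them
  expanded <- unique(expanded[order(nchar(expanded), decreasing = TRUE)])
  idx <- match(name_key(short), name_key(expanded))
  # Keep the original name when no candidate shares its key
  ifelse(is.na(idx), short, expanded[idx])
}

my_names <- c("Ada Lovelace", "A Lovelace", "Charles Babbage")
expand_names_sketch(my_names, my_names)
#> [1] "Ada Lovelace"    "Ada Lovelace"    "Charles Babbage"
```

A key this coarse would also merge two different people who share an initial and a surname, so the result of any automated harmonization is worth reviewing by hand.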