We can now take the list of authors extracted from the previous step,
or an independently gathered list of names, and clean and deduplicate
it.
Harmonize differently abbreviated names
The expand_names() function expands differently abbreviated names to a
common form, which is passed in the expanded argument:
expand_names(c("Ada Lovelace", "A Lovelace"), expanded = "Ada Lovelace")
#> [1] "Ada Lovelace" "Ada Lovelace"
However, a common pattern is to pass the vector being cleaned as the
expanded argument as well. This way, each name is harmonized to its
longest form found in the vector, even if you do not know the full names
of all authors in advance:
my_names <- c("Ada Lovelace", "A Lovelace", "Charles Babbage")
expand_names(my_names, my_names)
#> [1] "Ada Lovelace" "Ada Lovelace" "Charles Babbage"
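Once the names are harmonized, deduplicating them only requires keeping the distinct values. The following is a minimal sketch in base R, starting from the harmonized vector shown in the output above. The variable names harmonized and unique_authors are illustrative and not part of the package.

```r
# Harmonized names, copied from the expand_names() output above
# (hard-coded here so the sketch runs without the package installed)
harmonized <- c("Ada Lovelace", "Ada Lovelace", "Charles Babbage")

# unique() keeps the first occurrence of each name, in order
unique_authors <- unique(harmonized)

# Two distinct authors remain: Ada Lovelace and Charles Babbage
length(unique_authors)
```

Note that unique() only matches exact strings, so it is assumed here that harmonization has already brought every variant of a name to the same form.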