[Rd] Citation of R packages
John Maindonald
john.maindonald at anu.edu.au
Sat Feb 11 00:32:17 CET 2006
Even if a CITATION file is included, there is an issue of what to put
in it.
Authorship of a book or paper is not always the simple matter that it
might appear. With an R package, it can be far from simple. We are
surely trying to adapt a tool that was designed for different
purposes.
1. I'd like to see the definition of a new BibTeX entry type that has
fields for additional author details and version number. There is
surely some mechanism for getting agreement on a new entry type.
2. In any case, there's a message for maintainers of packages to include
CITATION files that reflect what they want to appear in any citation,
with citation("lattice") as maybe a suitable model?
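
As a rough illustration of point 2, a CITATION file along the following
lines would let a maintainer separate the people to be cited as authors
from those acknowledged in the note field. This is only a sketch: it
assumes the bibentry() function from the utils package in current
versions of R, and the package name, people and version number are all
made up.

```r
# Sketch of what a package's inst/CITATION file might contain.
# "examplepkg" and every name below are hypothetical.
library(utils)

# In a real CITATION file the bibentry() call would stand on its own;
# it is assigned here only so the result can be inspected.
pkg_ref <- bibentry(
  bibtype = "Manual",
  title   = "examplepkg: An Illustrative Package",
  # Only the people who should be cited as authors go here.
  author  = c(person("Ann", "Author"), person("Bob", "Builder")),
  year    = "2006",
  # Other contributions are described in the note field instead.
  note    = "R package version 0.1-0. Fortran code contributed by C. Coder."
)

# Show the BibTeX entry a user of citation() would be offered.
print(toBibtex(pkg_ref))
```

Whether contributors belong in the author field or the note field
remains the maintainer's decision; the sketch only shows that the
CITATION file is where that decision gets recorded.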
John.
John Maindonald email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473 fax : +61 2(6125)5549
Mathematical Sciences Institute, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
On 11 Feb 2006, at 5:36 AM, Friedrich.Leisch at tuwien.ac.at wrote:
>>>>>> On Fri, 10 Feb 2006 21:01:44 +1100,
>>>>>> John Maindonald (JM) wrote:
>
> [...]
>
>> Where there is a published paper or a book (such as MASS), or a
>> manual for which a url can be given, my decision was to include
>> that in the main list of references, but not to include references
>> there that were references to the package itself, which as you
>> suggest below can be a reference to the concatenated help pages.
>
> The CITATION file of a package may contain as many entries as the
> author wants, including both a reference to the help pages and to the
> book (or whatever).
>
>
>> It seemed anyway useful to have a separate list of packages. For
>> consistency, these were always references to the package, with a
>> cross-reference to any relevant document in the references to papers.
>
>>>> (2) Maybe the author field should be more nuanced, or
>>>> maybe ...
>>>
>>> author fields of bibtex entries have a strict format (names
>>> separated by "and"), what do you mean by "more nuanced"?
>
>> Those named in the list of authors may be any combination of: the
>> authors of an R package, the authors of an original S version, the
>> person or persons responsible for an R port, the authors of the
>> Fortran code, compiler(s), and contributors of ideas.
>
>> For John Fox's car, citation() gives the following:
>> author = {John Fox. I am grateful to Douglas Bates and David
>> Firth and Michael Friendly and Gregor Gorjanc and Georges Monette and
>> Henric Nilsson and Brian Ripley and Sanford Weisberg and and Achim
>> Zeleis for various suggestions and contributions.},
>
>> For Rcmdr:
>> author = {John Fox and with contributions from Michael Ash and
>> Philippe Grosjean and Martin Maechler and Dan Putler and and Peter
>> Wolf.},
>
>> For car, maybe John Fox should be identified as author. For Rcmdr,
>> maybe the other persons that are named should be added?
>
>> For leaps:
>> author = {Thomas Lumley using Fortran code by Alan Miller},
>
>> It seems reasonable to cite Lumley and Miller as authors. Should
>> there be a note that identifies Miller as the contributor of the
>> Fortran code?
>
>> Should the name(s) of porters (usually from S) be included as
>> author(s)? Or should their contribution be acknowledged in the note
>> field? Or ...
>
>> Possibilities are to cite all those individuals as author, or to cite
>> John Fox only, with any combination of no additional information in
>> the note field, or using the note field to explain who did what. The
>> citation() function leaves it unclear who are to be acknowledged as
>> authors, and in fact
>
>
> Umm, the problem there is not the citation() function, but that the
> authors of all those packages obviously have not included a CITATION
> file in their package which overrides the default (extracted from the
> DESCRIPTION file).
>
> E.g., package flexclust has DESCRIPTION
>
> Package: flexclust
> Version: 0.8-1
> Date: 2006-01-11
> Author: Friedrich Leisch, parts based on code by Evgenia Dimitriadou
>
> but
>
> ****
> R> citation("flexclust")
>
> To cite package flexclust in publications use:
>
> Friedrich Leisch. A Toolbox for K-Centroids Cluster Analysis.
> Computational Statistics and Data Analysis, 2006. Accepted for
> publication.
>
> A BibTeX entry for LaTeX users is
>
> @Article{,
>   author = {Friedrich Leisch},
>   title = {A Toolbox for K-Centroids Cluster Analysis},
>   journal = {Computational Statistics and Data Analysis},
>   year = {2006},
>   note = {Accepted for publication},
> }
> ****
>
> because the CITATION file overrides the DESCRIPTION file. Writing a
> CITATION file is of course also intended for those cases where a
> proper reference cannot be auto-generated from the DESCRIPTION file.
>
>
>>>> (3) In compiling a list of packages, name order seems
>>>> preferable, and one wants the title first (achieved by
>>>> relocating the format.title field in the manual FUNCTION
>>>> in the .bst file).
>>>> (4) manual seems not an ideal name for the class, if
>>>> there is no manual.
>>>
>>> A package always has a "reference manual": the concatenated help
>>> pages certainly qualify as such and can be downloaded in PDF format
>>> from CRAN. The ISBN rules even allow an ISBN to be assigned to the
>>> online help of a software package, which can also serve as the ISBN
>>> of the *software itself* (which we did for base R).
>
>> I'd prefer some consistency in the way that R packages are
>> referenced. Thus, if the reference for one package is to the
>> concatenated help pages, do it that way for all of them.
>
> But we recommend that package authors should (try to) get their work
> into reviewed journals like JSS, JCGS, or CSDA, and then package
> authors usually prefer that the article gets cited. Unfortunately, many
> academic institutions value paper publications more highly than
> software. Citing the help pages is mainly intended as a substitute if
> no journal article is available.
>
> Best,
> Fritz
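
Fritz's point that a CITATION file may hold as many entries as the
author wants can be sketched as follows. This assumes bibentry() from
utils in current versions of R, which is not necessarily what was
available when the above was written. The article fields are those
printed above for flexclust; the title of the manual entry is a made-up
placeholder.

```r
# Sketch: one set of references offering both an article and the help pages.
library(utils)

# Article details as printed by citation("flexclust") in the message above.
article <- bibentry(
  bibtype = "Article",
  author  = person("Friedrich", "Leisch"),
  title   = "A Toolbox for K-Centroids Cluster Analysis",
  journal = "Computational Statistics and Data Analysis",
  year    = "2006",
  note    = "Accepted for publication"
)

# Reference to the concatenated help pages; the title is a placeholder,
# the version and year come from the DESCRIPTION excerpt quoted above.
manual <- bibentry(
  bibtype = "Manual",
  author  = person("Friedrich", "Leisch"),
  title   = "flexclust reference manual (placeholder title)",
  year    = "2006",
  note    = "R package version 0.8-1"
)

# bibentry objects combine with c(), giving a two-entry reference list.
refs <- c(article, manual)
print(toBibtex(refs))
```

With both entries present, someone compiling a separate list of packages
could take the manual entry while still citing the article in the main
references.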