[Rd] suggestion to fix packageDescription() for Windows users

Ben Marwick bmarwick at uw.edu
Sat Jun 17 13:10:18 CEST 2017


Recently I was trying to cite a package where the authors have ä
and ø in their names. I found that on Windows the citation() function 
did not return the authors' names at all, but on Linux there was no 
problem (sessionInfos at the bottom):

On Windows, no author names are returned:

#---------------

 > citation("readr")

To cite package ‘readr’ in publications use:

   (2017). readr: Read Rectangular Text Data. R package version 1.1.1.
   https://CRAN.R-project.org/package=readr

A BibTeX entry for LaTeX users is

   @Manual{,
     title = {readr: Read Rectangular Text Data},
     year = {2017},
     note = {R package version 1.1.1},
     url = {https://CRAN.R-project.org/package=readr},
   }

ATTENTION: This citation information has been auto-generated from the
package DESCRIPTION file and may need manual editing, see
‘help("citation")’.
#---------------

On Linux we do see the author names:

#---------------
 > citation("readr")

To cite package ‘readr’ in publications use:

   Hadley Wickham, Jim Hester and Romain Francois (2017). readr:
   Read Rectangular Text Data. R package version 1.1.1.
   https://CRAN.R-project.org/package=readr

A BibTeX entry for LaTeX users is

   @Manual{,
     title = {readr: Read Rectangular Text Data},
     author = {Hadley Wickham and Jim Hester and Romain Francois},
     year = {2017},
     note = {R package version 1.1.1},
     url = {https://CRAN.R-project.org/package=readr},
   }
#---------------

This appears to be an OS-dependent encoding issue. The citation function 
does not take an encoding argument, so it's not possible to set the 
encoding at the point where that function is used. The citation function 
works with the packageDescription function, which does have an 
encoding argument, but the default is not useful for Windows when there 
is an encoding set in the DESCRIPTION of the package (in this case UTF-8).
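
The relationship between the two functions can be sketched as below. This 
is only a stopgap idea, not a tested fix: it assumes that the auto argument 
of citation() accepts a "packageDescription" object, as ?citation describes, 
and the helper name cite_with_encoding is hypothetical. Any CITATION file 
shipped with the package is bypassed when metadata is passed this way, and 
whether the names then print correctly still depends on the console locale.

```r
# Hypothetical helper, not part of R: build a citation from DESCRIPTION
# metadata that was read with an explicitly chosen encoding.
cite_with_encoding <- function(pkg, encoding = "UTF-8") {
  # packageDescription() re-encodes the fields when the DESCRIPTION file
  # declares an Encoding and 'encoding' is not NA.
  meta <- utils::packageDescription(pkg, encoding = encoding)
  if (!inherits(meta, "packageDescription")) {
    stop("package not found: ", pkg)
  }
  # Passing the metadata via 'auto' makes citation() use it directly,
  # so a CITATION file in the package (if any) is ignored.
  utils::citation(pkg, auto = meta)
}

# "stats" is used here only because it is always installed.
cit <- cite_with_encoding("stats")
print(cit)
```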

We can set the encoding argument in packageDescription so it works in 
Windows to give the authors as expected, but it is very inconvenient to 
generate citations directly from the output of this function. So I'd 
like to propose a solution to this problem by changing one line in the 
packageDescription function, like so, from:

#---------------
if (missing(encoding) && Sys.getlocale("LC_CTYPE") == "C")
#---------------

to:

#---------------
if ((missing(encoding) && Sys.getlocale("LC_CTYPE") == "C") ||
    unname(Sys.info()['sysname']) == "Windows")
#---------------
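
To make the effect of the proposed condition easier to check, here is a 
small sketch that evaluates the same logic with the inputs passed as 
arguments. The function names forces_translit and forces_translit_strict 
are hypothetical; the second one is only a possible variant, not part of 
the proposal above.

```r
# Same logic as the proposed condition, with the inputs made explicit.
forces_translit <- function(encoding_missing, lc_ctype, sysname) {
  (encoding_missing && lc_ctype == "C") || sysname == "Windows"
}

# Possible variant (an assumption, not the proposal): only act on Windows
# when the caller did not supply an encoding.
forces_translit_strict <- function(encoding_missing, lc_ctype, sysname) {
  encoding_missing && (lc_ctype == "C" || sysname == "Windows")
}

r1 <- forces_translit(TRUE,  "C",                      "Linux")    # TRUE
r2 <- forces_translit(TRUE,  "en_US.UTF-8",            "Linux")    # FALSE
r3 <- forces_translit(TRUE,  "English_Australia.1252", "Windows")  # TRUE
r4 <- forces_translit(FALSE, "English_Australia.1252", "Windows")  # TRUE
r5 <- forces_translit_strict(FALSE, "English_Australia.1252", "Windows")  # FALSE
print(c(r1, r2, r3, r4, r5))
```

Note the fourth case: as written, the proposed condition is also true on 
Windows when an encoding was supplied explicitly, so the explicit value 
would be replaced as well. The variant avoids that.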

If I understand correctly, that will force ASCII//TRANSLIT encoding when 
DESCRIPTION files are read by packageDescription() on Windows machines. 
The upside is that Windows users will get the authors in the package 
citation, unlike the current situation. The downside is that the exotic 
symbols in the authors' names are replaced with common ones that are 
similar.
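
A minimal sketch of what that transliteration looks like, using iconv() 
directly. The author names below are made up for illustration, and the 
exact substitutions depend on the platform's iconv implementation, so no 
output is shown.

```r
# Made-up names containing the kinds of characters discussed above.
authors <- c("J\u00f8rgen Hansen", "Tiina M\u00e4kel\u00e4", "Jim Hester")

# Ask iconv() to approximate non-ASCII characters with ASCII ones.
# The result is platform dependent (some systems substitute "?").
translit <- iconv(authors, from = "UTF-8", to = "ASCII//TRANSLIT")

# Plain ASCII input should pass through unchanged.
plain <- iconv("Jim Hester", from = "UTF-8", to = "ASCII//TRANSLIT")

print(translit)
```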

I think getting the citations to easily include the authors' names is 
pretty important, even if their names have exotic characters, so this is 
worth fixing. Is this edit to packageDescription the best way to solve 
this problem of exotic characters preventing the authors' names from 
showing on Windows?

thanks,

Ben




Windows sessionInfo:

#---------------
 > sessionInfo()
R version 3.4.0 Patched (2017-05-10 r72670)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_Australia.1252
[2] LC_CTYPE=Chinese (Simplified)_People's Republic of China.936
[3] LC_MONETARY=English_Australia.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_Australia.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
 [1] readr_1.1.1    compiler_3.4.0 R6_2.2.1       hms_0.3        tools_3.4.0
 [6] tibble_1.3.3   yaml_2.1.14    Rcpp_0.12.11   knitr_1.16     rlang_0.1.1
[11] fortunes_1.5-4
#---------------

Linux sessionInfo:

#---------------
 > sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.10

locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_3.3.1 yaml_2.1.14 knitr_1.16
#---------------


