[R-pkg-devel] Check Error Due to Unicode in Documentation
Duncan Murdoch
murdoch@dunc@n @end|ng |rom gm@||@com
Thu Jul 23 22:58:07 CEST 2020
On 23/07/2020 4:14 p.m., bill using denney.ws wrote:
> Hello,
>
>
>
> I have a personal package that I'd eventually like to clean up and either
> find other packages to be homes for the functions or perhaps eventually
> release it on CRAN. To that end, I try to keep package checks working.
>
>
>
> One of the functions that I use is to try to simplify Unicode text to ASCII.
> With that, I tend to receive data that is scientifically focused, so the mu
> character should be converted to a "u" instead of the standard conversion to
> "m". On top of that, there are at least two Unicode characters that are
> visually the mu character: one is the micro character and the other is an
> actual lowercase mu. This function converts both of those to "u" as
> desired.
>
>
>
> I generate the documentation using roxygen2, but the text in the
> documentation aligns with the expected Unicode character, so I think the
> issue is not with roxygen.
>
>
>
> The issue is that Codoc gives the following error:
>
>
>
> * checking for code/documentation mismatches ... WARNING
>
> Codoc mismatches from documentation object 'unicode_to_ascii':
>
> unicode_to_ascii.character
>
> Code: function(x, verbose = FALSE, pattern = c("μ", "µ"), replacement
>
> = c("u", "u"), general_
>
>
>
> But, the code and documentation appear to be the same. I think that the
> issue relates to something with Unicode support in Codoc, but I'm not sure
> how to test for that. The code is here:
>
>
>
> https://github.com/billdenney/bsd.report/blob/454caf217c5b333af1d65c7e63bbad4194320e07/R/unicode_to_ascii.R#L28-L31
>
>
>
> And the documentation is here:
>
>
>
> https://github.com/billdenney/bsd.report/blob/454caf217c5b333af1d65c7e63bbad4194320e07/man/unicode_to_ascii.Rd#L17-L24
>
>
>
> Do you have any suggestions on how to make this code/documentation work with
> Codoc?

If you change the source to include the explicit characters (i.e. use
pattern = c("μ", "µ") instead of pattern = c("\u03bc", "\u00b5")), does
that help?

It may cause other issues: WRE (Writing R Extensions) recommends against
including UTF-8 characters in source code.

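As a rough illustration of the two spellings (a sketch only, not taken from
the package): the escapes below stand for the lowercase mu and the micro
sign, which are distinct code points even though they look alike. The helper
has_non_ascii() is a hypothetical name introduced here to show which spelling
keeps a source line pure ASCII.

```r
# Escaped spellings of the two look-alike characters from the thread.
mu    <- "\u03bc"  # U+03BC GREEK SMALL LETTER MU
micro <- "\u00b5"  # U+00B5 MICRO SIGN

# They render alike but are different code points.
code_points <- c(mu = utf8ToInt(enc2utf8(mu)),
                 micro = utf8ToInt(enc2utf8(micro)))
print(code_points)

# Hypothetical helper (not part of any package): TRUE for each element
# that contains at least one character outside the ASCII range.
has_non_ascii <- function(lines) {
  vapply(
    lines,
    function(s) any(utf8ToInt(enc2utf8(s)) > 127L),
    logical(1),
    USE.NAMES = FALSE
  )
}

# Text as it would sit in a .R file: the first spelling uses escapes
# (ASCII on disk), the second embeds the characters themselves.
escaped_src <- 'c("\\u03bc", "\\u00b5")'
literal_src <- paste0('c("', mu, '", "', micro, '")')

print(has_non_ascii(c(escaped_src, literal_src)))
```

Only the second spelling puts non-ASCII bytes into the file, which is the
situation the WRE advice is about.
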
If that doesn't solve the problem, then it looks like an issue with
Roxygen2. I don't know if there's a way to tell it not to convert \u
escapes into the corresponding character. If there isn't, it seems like
that's something they should add. As a workaround, is there a way to
say that this one particular .Rd file should be edited by hand, instead
of auto-generated?
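
To make the suspected mismatch concrete, here is a small sketch under stated
assumptions: f is a made-up stand-in for the real function, and I have not
traced what codoc or roxygen2 do internally. It only shows that one default
value can have more than one textual spelling, so a comparison based on text
could disagree even when the values are the same.

```r
# Hypothetical stand-in for the function in the thread; defaults are
# written with \u escapes, as in the package source.
f <- function(x, pattern = c("\u03bc", "\u00b5")) x

# The value of the default argument after parsing.
default_value <- eval(formals(f)$pattern)

# Spelling as typed in the source file (escapes kept as plain text).
spelling_in_source <- 'c("\\u03bc", "\\u00b5")'

# Spelling R produces when turning the parsed default back into text.
# Whether this shows escapes or the characters themselves can depend
# on the locale of the session.
spelling_after_deparse <- deparse(formals(f)$pattern)

print(spelling_in_source)
print(spelling_after_deparse)

# The two texts may differ, yet the source spelling evaluates to the
# same value as the parsed default.
same_value <- identical(eval(parse(text = spelling_in_source)), default_value)
print(same_value)
```

If the two printed spellings differ on your machine, that is the kind of
discrepancy a usage-versus-code text comparison would flag.
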
Duncan Murdoch