[R-pkg-devel] Intrinsic UTF-8 use in aspired CRAN package
Uwe Ligges
ligges at statistik.tu-dortmund.de
Thu May 18 12:05:17 CEST 2023
On 18.05.2023 10:03, Martin Maechler wrote:
>>>>>> Schuhmacher, Dominic
>>>>>> on Wed, 17 May 2023 12:05:49 +0000 writes:
>
> > Dear list, I have a package
> > https://github.com/dschuhmacher/kanjistat whose very
> > purpose depends on working with Japanese kanji characters
> > (in UTF-8 encoding). Such characters appear vitally in the
> > data sets, examples, tests, the vignette and the .Rd
> > files.
>
> > My package checks fine with devtools::check on my system
> > and via Github Actions produced with
> > usethis::use_github_action_check_standard(). However, I
> > would like to release the package on CRAN, and running R
> > CMD check --as-cran gives me a number of headaches, mainly
> > related to the production of pdf documents via latex as it
> > seems to be not so easy to convince latex to typeset
> > Japanese, see
> > https://www.overleaf.com/learn/latex/Japanese
>
> > For the vignette, I can set in the Rmarkdown file
> >   pdf_document:
> >     latex_engine: lualatex
> >     includes:
> >       in_header: preamble.tex
> > and in the file preamble.tex
> >   \usepackage{luatexja}
> >   \usepackage{microtype}
> > This gives me
> > a pdf-vignette that looks and checks fine (except that the
> > abovementioned GitHub Actions don't seem to find lualatex,
> > which is why the pdf output is commented out in the main
> > branch on GitHub).
>
> > Unfortunately, I fail to find a similar solution for the
> > pdf manual. R CMD check yields
> > --------------
> > * checking PDF version of manual ... WARNING
> > LaTeX errors when creating PDF version.
> > This typically indicates Rd problems.
> > LaTeX errors found:
> > ! Package inputenc Error: Unicode character 冷 (U+51B7)
> > (inputenc) not set up for
Can you send me a minimal example package with these characters in an Rd
file?
Best,
Uwe Ligges
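
A minimal example package of this kind could be put together roughly as follows. This is only a sketch: the package name "kanjidemo", the function "rei" and the author details are made up for illustration and do not come from kanjistat.

```shell
#!/bin/sh
# Sketch: build a tiny source package whose only Rd file contains a kanji.
# "kanjidemo" and "rei" are hypothetical names chosen for this example.
set -eu
pkg=kanjidemo
mkdir -p "$pkg/man" "$pkg/R"

# Package metadata; Encoding: UTF-8 declares the encoding of the Rd files.
cat > "$pkg/DESCRIPTION" <<'EOF'
Package: kanjidemo
Type: Package
Title: Minimal Example with Kanji in an Rd File
Version: 0.0.1
Authors@R: person("A", "Author", email = "a@example.org",
    role = c("aut", "cre"))
Description: Placeholder package used only to reproduce a LaTeX error
    in the PDF manual.
License: GPL-3
Encoding: UTF-8
EOF

cat > "$pkg/NAMESPACE" <<'EOF'
export(rei)
EOF

# The R source stays pure ASCII by using a \u escape for the kanji.
cat > "$pkg/R/rei.R" <<'EOF'
rei <- function() "\u51b7"
EOF

# The Rd file carries the literal UTF-8 character that trips inputenc.
cat > "$pkg/man/rei.Rd" <<'EOF'
\name{rei}
\alias{rei}
\title{Return a Single Kanji}
\description{Returns the kanji 冷 (U+51B7) as a UTF-8 string.}
\usage{rei()}
\value{A character vector of length one.}
\examples{
rei()
}
EOF
```

Running R CMD build kanjidemo and then R CMD check --as-cran on the resulting tarball should exercise the PDF manual step; that part is not run here and its outcome will depend on the local LaTeX installation.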
> > use with LaTeX.
> > [and many more of the same]
> > * checking PDF version of manual without index ... ERROR
> > --------------
> > It seems that the pdf manual is generated by first
> > producing a texinfo file and then running texi2dvi. From
> > https://www.gnu.org/software/texinfo/manual/texinfo/html_node/Inserting-Unicode.html
> > I take the message that texinfo does not do Japanese... Is
> > there any way to work around the use of texinfo and use
> > lualatex (with a preamble) instead? If not, is there a way
> > to keep the UTF-8 encoded characters in the html help (I
> > think this is very useful for the user!) and still produce
> > a pdf that passes the check, e.g. by replacing the kanji
> > characters automatically by their codepoints (or even a
> > generic placeholder symbol) when generating the pdf
> > manual?
>
> I cannot help much more,
> but be assured that texinfo is *not* used in the process.
> It's just a "historical coincidence" that texi2dvi, a "simple"
> shell script, typically comes with the texinfo "software
> package", i.e., in Linux distributions the texi2dvi command
> (shell script, see above) is provided by the 'texinfo'
> (Debian/Ubuntu/..) package.
>
> man texi2dvi tells you about a slew of environment variables,
> notably PDFLATEX, TEX, etc., and I guess you can just set one of
> these to 'lualatex' ... and of course lualatex must be
> findable on the CRAN servers, but I'd bet that to be the case.
>
> Best,
> Martin
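
The environment-variable idea above might be tried locally along the following lines. This is a sketch under assumptions, not a tested recipe: the directory name in PKG_DIR and the file names are placeholders, and even if lualatex is picked up, whether the kanji are actually typeset still depends on fonts and on the preamble that R generates for the manual.

```shell
#!/bin/sh
# Sketch: ask texi2dvi (as invoked by R CMD Rd2pdf) to use lualatex.
# PKG_DIR is a placeholder for the package source directory.
PKG_DIR="${PKG_DIR:-kanjistat}"
OUT="manual-lualatex.pdf"
LOG="rd2pdf-attempt.log"

# texi2dvi consults PDFLATEX when choosing the engine for PDF output.
PDFLATEX=lualatex
export PDFLATEX

if command -v R >/dev/null 2>&1 \
   && command -v lualatex >/dev/null 2>&1 \
   && [ -d "$PKG_DIR" ]; then
  # --force overwrites an existing output file; --no-preview skips the viewer.
  if R CMD Rd2pdf --no-preview --force --output="$OUT" "$PKG_DIR"; then
    status=succeeded
  else
    status=failed
  fi
else
  echo "Skipping: need R, lualatex and the directory $PKG_DIR" >&2
  status=skipped
fi

# Keep a one-line record of what was attempted.
echo "PDFLATEX=$PDFLATEX status=$status" > "$LOG"
cat "$LOG"
```

Setting the variable only affects the local run; how the CRAN check machines are configured is outside the package author's control.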
>
>
>
> > Any thoughts and suggestions on this would be greatly
> > appreciated! I think/hope then that the remaining problems
> > in R CMD check are acceptable to the CRAN team given the
> > nature of my package. They are:
>
> > 1. Examples and tests fail if the check is not run in a
> > UTF-8 locale.
>
> > 2. checking data for non-ASCII characters ... NOTE Note:
> > found 111752 marked UTF-8 strings
>
> > Many thanks, Dominic Schuhmacher
>
>
>
>
> > ______________________________________________
> > R-package-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-package-devel
>