[R-pkg-devel] Intrinsic UTF-8 use in aspired CRAN package

Schuhmacher, Dominic <dominic.schuhmacher using mathematik.uni-goettingen.de>
Thu May 18 14:48:01 CEST 2023


Hi Ivan,

Thanks for the extensive answer. Both
PDFLATEX=xelatex R CMD Rd2pdf .
and
PDFLATEX=lualatex R CMD Rd2pdf .
work for me (for the whole package doc).

And yes, in both cases I need to inject code into the preamble of the .tex file. In fact, for lualatex (which I prefer, based on my experience with the vignette),
PDFLATEX=lualatex RD2PDF_INPUTENC='inputenc}\usepackage{luatexja' R CMD Rd2pdf .
generates the desired manual with the correct characters.
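
A minimal sketch of why this injection works, assuming that Rd2pdf pastes the value of RD2PDF_INPUTENC into a preamble line of the form \usepackage[utf8]{...} (this template is an assumption based on the observed behaviour; the variable is undocumented). The value closes the first brace group early and opens a second one, which the template's own closing brace then completes:

```shell
# Value taken from the command above; it deliberately ends one brace group
# and starts another.
RD2PDF_INPUTENC='inputenc}\usepackage{luatexja'

# Assumed template: the variable is placed inside \usepackage[utf8]{...}.
preamble_line="\\usepackage[utf8]{${RD2PDF_INPUTENC}}"

# Show the line that would end up in the generated .tex preamble.
printf '%s\n' "$preamble_line"
```

Under that assumption the script prints \usepackage[utf8]{inputenc}\usepackage{luatexja}, i.e. two well-formed \usepackage commands.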

It is unfortunate that the font styles for the keywords are not carried over from Rd.sty. This could possibly be fixed with a few \renewcommand calls (which luckily go into the preamble ;-), but that would probably be too much of a hack even for my taste...

By the way, what is the recommended way of setting environment variables like PDFLATEX and (possibly) RD2PDF_INPUTENC in a package (if this is something that is allowed on CRAN)? If it is a Makevars file, where do I put it, directly into the man folder?

Best regards,
Dominic




> On 18. May 2023, at 13:21, Ivan Krylov <krylov.r00t using gmail.com> wrote:
> 
> On Wed, 17 May 2023 12:05:49 +0000,
> "Schuhmacher, Dominic"
> <dominic.schuhmacher using mathematik.uni-goettingen.de> wrote:
> 
>> checking PDF version of manual ... WARNING
>> LaTeX errors when creating PDF version.
>> This typically indicates Rd problems.
>> LaTeX errors found:
>> ! Package inputenc Error: Unicode character 冷 (U+51B7)
>> (inputenc) not set up for use with LaTeX.
> 
> I see you'd like to use Kanji characters in your R documentation (not
> only a vignette). There are some workarounds for Cyrillic alphabets
> (that work if you set a special environment variable), but quite a lot
> more hurdles will need to be cleared for CJK support, and I'm not
> sure that CRAN will accept the result even if you overcome them on your
> own machine.
> 
> 1. You might need to switch the LaTeX engine from the default of
> pdflatex. (XeLaTeX in particular seems to have much better Unicode
> support.) Both the texi2dvi shell script and R's emulation of it
> understand the PDFLATEX environment variable (thank you Martin for
> mentioning this!), but I'm not sure there is a way to require an
> environment variable to be set for all invocations of R CMD INSTALL.
> Anyway, as Overleaf says, pdflatex can support CJK, but in a less
> convenient manner.
> 
> 2. For pdflatex, it's possible to use \usepackage{CJKutf8}. The
> required Debian packages are latex-cjk-japanese-wadalab (fonts) and
> latex-cjk-common (CJKutf8.sty itself). There's no way to require these
> packages to be installed on machines where your package's PDF
> documentation might be built.
> 
> 3. Once the packages are installed and you can compile an example *.tex
> file containing Kanji, it's time to get R's PDF documentation system to
> use these packages. You need to insert \usepackage{CJKutf8} in the
> document's preamble (which is too late for Rd \out{} markup). I don't
> see a way to convince Rd2pdf to do that, but there's a terrible hack to
> do that using a LaTeX injection from an undocumented environment
> variable.
> 
> 4. All uses of CJK characters need to be wrapped in
> \begin{CJK}{utf8}{min} ... \end{CJK}. Thankfully, this at least can be
> achieved in Rd using \if{latex}{\out{\begin{CJK}{utf8}{min}}} and can
> be wrapped in an Rd macro using \newcommand in man/macros/whatever.Rd.
> 
> Unfortunately, I couldn't find a way to wrap the \examples{} section in
> \begin{CJK}...\end{CJK}, so CJK characters cannot be used there.
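
The wrapping described in point 4 could be sketched as follows. This is an untested illustration, not a verified recipe: the macro name \cjk, the file name cjk.Rd and the PKG_DIR variable are hypothetical, and the script falls back to a temporary directory so that it can be run without touching a real package.

```shell
# Hypothetical package root; set PKG_DIR to a real package directory to use it.
pkg_dir="${PKG_DIR:-$(mktemp -d)}"
mkdir -p "$pkg_dir/man/macros"

# Write an Rd macro file. The quoted 'EOF' keeps backslashes and #1 literal.
# \cjk{...} opens the CJK environment before its argument and closes it
# afterwards, but only when LaTeX output is being generated.
cat > "$pkg_dir/man/macros/cjk.Rd" <<'EOF'
\newcommand{\cjk}{\if{latex}{\out{\begin{CJK}{utf8}{min}}}#1\if{latex}{\out{\end{CJK}}}}
EOF

# Print the generated macro file for inspection.
cat "$pkg_dir/man/macros/cjk.Rd"
```

An Rd file in the same package would then write \cjk{冷} instead of the bare character. As noted above, this does not help inside \examples{}.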
> 
> To summarise, the Rd file from
> <https://paste.debian.net/hidden/f5baacd9/> can be compiled using the
> following command line on a computer with CJKutf8.sty and wadalab fonts
> installed:
> 
> RD2PDF_INPUTENC='inputenc}\usepackage{CJKutf8' \
> R CMD Rd2pdf foo.Rd
> 
> ...but it's such a fragile tower of hacks that I wouldn't use it in an
> actual package.
> 
> What about switching to XeLaTeX? PDFLATEX=xelatex R CMD Rd2pdf bar.Rd
> doesn't crash, but doesn't show CJK characters either, because it's not
> told which CJK font to use. (Besides, Rd.sty seems to set fonts in ways
> that XeLaTeX doesn't quite understand.) \setCJKmainfont{...} is again a
> preamble command, which again requires the terrible hack from (3), and
> I don't see a way to use \fontspec{} without altering the preamble.
> 
> -- 
> Best regards,
> Ivan



