[Rd] Suggestion / patch to support more Unicode characters in R CMD Rd2pdf
Mikko Korpela
mikko.korpela at aalto.fi
Wed Jul 4 23:01:42 CEST 2012
Hi list,
When using R CMD Rd2pdf, you can set the environment variable
RD2PDF_INPUTENC to "inputenx" to get better support for UTF-8
characters (see ?Rd2pdf). This makes Rd2pdf load the LaTeX package
"inputenx" instead of "inputenc".
Support for UTF-8 encoded characters can be improved further by making
fuller use of the facilities inputenx provides: R CMD Rd2pdf would
insert the line "\input{ix-utf8enc.dfu}" into its temporary .tex file.
The instructions are in section 1.2 "Unicode" of the inputenx manual:
http://mirror.ctan.org/macros/latex/contrib/oberdiek/inputenx.pdf
I suggest that R CMD Rd2pdf automatically insert
"\input{ix-utf8enc.dfu}" into its temporary .tex file when the
combination of inputenx and UTF-8 is detected. The attached small patch
does that.
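To illustrate the rule the patch applies, here is a minimal stand-alone
sketch. It is not the code in Rd2pdf.R: the function name
encoding_preamble and its argument latex_encoding are made up for this
example, and latex_encoding stands in for the result of
latex_canonical_encoding(outputEncoding).

```r
# Hypothetical helper that mirrors the condition used in the patch.
# 'latex_encoding' plays the role of latex_canonical_encoding(outputEncoding).
encoding_preamble <- function(latex_encoding,
                              inputenc = Sys.getenv("RD2PDF_INPUTENC",
                                                    "inputenc")) {
    set_encoding <- paste0("\\usepackage[", latex_encoding, "]{",
                           inputenc, "}")
    # The extra \input line is emitted only for inputenx + utf8;
    # otherwise the 'if' yields NULL, which c() silently drops.
    c(set_encoding,
      if (inputenc == "inputenx" && latex_encoding == "utf8")
          "\\input{ix-utf8enc.dfu}")
}

writeLines(encoding_preamble("utf8", "inputenx"))
# \usepackage[utf8]{inputenx}
# \input{ix-utf8enc.dfu}

writeLines(encoding_preamble("utf8", "inputenc"))
# \usepackage[utf8]{inputenc}
```

With the default inputenc package, or with any encoding other than utf8,
the preamble stays as it is today.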
A demo package is also attached (the tarball was built manually, not
with R CMD build). It uses some UTF-8 characters that are not supported
without the patch: R CMD Rd2pdf fails with an error propagated from
LaTeX. With the patch installed, R CMD Rd2pdf works when
RD2PDF_INPUTENC=inputenx is set.
For testing, unpack the tarball and run R CMD Rd2pdf on the resulting
directory. Tested on R development version r59731 running on 64-bit
Ubuntu 10.10.
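The test run can also be driven from within R. The following is only a
sketch: rd2pdf_with_inputenx is a hypothetical helper name, and I am
assuming here that the tarball unpacks to a directory called "encTest3".

```r
# Hypothetical helper: run R CMD Rd2pdf on a package directory with
# RD2PDF_INPUTENC=inputenx set for that one call only.
rd2pdf_with_inputenx <- function(pkg_dir, dry_run = FALSE) {
    r_bin <- file.path(R.home("bin"), "R")
    args <- c("CMD", "Rd2pdf", shQuote(pkg_dir))
    env <- "RD2PDF_INPUTENC=inputenx"
    if (dry_run) {
        # Return what would be executed without starting a process.
        return(list(command = r_bin, args = args, env = env))
    }
    system2(r_bin, args, env = env)
}

# Usage (the directory name "encTest3" is an assumption):
# untar("encTest3.tar.gz")
# rd2pdf_with_inputenx("encTest3")
```

Passing the variable through the env argument of system2 keeps it out
of the calling R session's own environment.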
--
Mikko Korpela
Aalto University School of Science
Department of Information and Computer Science
-------------- next part --------------
A non-text attachment was scrubbed...
Name: encTest3.tar.gz
Type: application/x-gzip
Size: 2429 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-devel/attachments/20120705/5e61ce90/attachment.gz>
-------------- next part --------------
Index: src/library/tools/R/Rd2pdf.R
===================================================================
--- src/library/tools/R/Rd2pdf.R (revision 59731)
+++ src/library/tools/R/Rd2pdf.R (working copy)
@@ -466,12 +466,17 @@
inputenc <- Sys.getenv("RD2PDF_INPUTENC", "inputenc")
## this needs to be canonical, e.g. 'utf8'
## trailer is for detection if we want to edit it later.
+ latex_outputEncoding <- latex_canonical_encoding(outputEncoding)
setEncoding <-
paste("\\usepackage[",
- latex_canonical_encoding(outputEncoding), "]{",
+ latex_outputEncoding, "]{",
inputenc, "} % @SET ENCODING@", sep="")
useGraphicx <- "% \\usepackage{graphicx} % @USE GRAPHICX@"
writeLines(c(setEncoding,
+ if (inputenc == "inputenx" &&
+ latex_outputEncoding == "utf8") {
+ "\\input{ix-utf8enc.dfu}"
+ },
useGraphicx,
if (index) "\\makeindex{}",
"\\begin{document}"), out)
@@ -545,21 +550,28 @@
latexEncodings <- unique(latexEncodings)
latexEncodings <- latexEncodings[!is.na(latexEncodings)]
cyrillic <- if (nzchar(Sys.getenv("_R_CYRILLIC_TEX_"))) "utf8" %in% latexEncodings else FALSE
- latex_outputEncoding <- latex_canonical_encoding(outputEncoding)
encs <- latexEncodings[latexEncodings != latex_outputEncoding]
if (length(encs) || hasFigures || cyrillic) {
lines <- readLines(outfile)
+ moreUnicode <- inputenc == "inputenx" && "utf8" %in% encs
encs <- paste(encs, latex_outputEncoding, collapse=",", sep=",")
if (!cyrillic) {
- lines[lines == setEncoding] <-
+ setEncoding2 <-
paste0("\\usepackage[", encs, "]{", inputenc, "}")
} else {
- lines[lines == setEncoding] <-
+ setEncoding2 <-
paste(
"\\usepackage[", encs, "]{", inputenc, "}
\\IfFileExists{t2aenc.def}{\\usepackage[T2A]{fontenc}}{}", sep = "")
}
+ if (moreUnicode) {
+ setEncoding2 <-
+ paste0(
+setEncoding2, "
+\\input{ix-utf8enc.dfu}")
+ }
+ lines[lines == setEncoding] <- setEncoding2
if (hasFigures)
lines[lines == useGraphicx] <- "\\usepackage{graphicx}\\setkeys{Gin}{width=0.7\\textwidth}"
writeLines(lines, outfile)