[Rd] RFC: Kerning, postscript() and pdf()

Thu Oct 16 11:03:11 CEST 2008

I've now implemented B and C in R-devel, with C as the default.
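
For anyone wanting to compare the two behaviours, here is a minimal sketch. It assumes the choice is exposed as the `useKerning` argument of pdf() and postscript(), as documented in later R releases, and that pdf(NULL) is available to query text sizes without writing a file. The helper name `width_in_inches` is made up for this illustration.

```r
# Measure a string on a file-less pdf() device, with or without kerning.
# Assumes pdf() accepts 'useKerning' and file = NULL (true of current R).
width_in_inches <- function(s, kerning) {
  pdf(NULL, useKerning = kerning)   # no external file is created
  on.exit(dev.off())                # always close the device again
  plot.new()                        # make sure a plot region exists
  strwidth(s, units = "inches")
}

kerned <- width_in_inches("You You You", TRUE)    # option C
plain  <- width_in_inches("You You You", FALSE)   # option B
c(kerned = kerned, plain = plain)
```

With the default Helvetica metrics the two widths should differ, since 'Yo' has a kerning correction in that font.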

On Sun, 12 Oct 2008, Prof Brian Ripley wrote:

> Ei-ji Nakama has pointed out (from another Japanese user, I believe) that 
> postscript() and pdf() have not been handling kerning correctly, and this is 
> a request for opinions about how we should correct it.
>
> Kerning is the adjustment of the spacing between letters from their natural 
> width, so that for example 'Yo' is usually typeset with the o closer to the Y 
> than 'Yl' would be.  Kerning is not very well standardized, so that for 
> example R's default Helvetica and its URW clone (Nimbus Sans) have quite 
> different ideas about the size of the kerning correction for 'Yo'. This matters, 
> because not many people actually see Helvetica when viewing R's PostScript or 
> PDF output, but rather a similar face like Nimbus Sans or Arial, or in the 
> case of Acrobat Reader, a not very similar face.  Kerning is only a feature 
> of some proportionally spaced fonts, and so applies to neither Courier nor CJK fonts.
>
> The current position (R <= 2.8.0) is that string widths have been computed 
> using kerning from the Adobe Font Metric files for the nominal font, but the 
> strings have been displayed without using kerning (at least in the viewers we 
> are aware of, and the PostScript and PDF reference manuals mandate that 
> behaviour, if rather obscurely).  This means that in strings such as 'You', 
> the width used in the string placement differs from that actually displayed.
>
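
To put rough numbers on the mismatch for 'You', the sketch below sums glyph widths with and without a kerning correction. The glyph widths are assumed AFM-style values in thousandths of the font size, and the -140 for 'Yo' is the correction quoted under option C below. `glyph_width`, `kern_pairs` and `string_width` are hypothetical names for illustration only, not R internals.

```r
# Assumed AFM-style metrics, in 1/1000 of the font size.
glyph_width <- c(Y = 667, o = 556, u = 556)
kern_pairs  <- c("Yo" = -140)

string_width <- function(s, use_kerning) {
  ch <- strsplit(s, "")[[1]]
  total <- sum(glyph_width[ch])
  if (use_kerning && length(ch) > 1) {
    # Look up every adjacent pair; pairs without an entry give NA.
    pairs <- paste0(ch[-length(ch)], ch[-1])
    total <- total + sum(kern_pairs[pairs], na.rm = TRUE)
  }
  total
}

computed  <- string_width("You", use_kerning = TRUE)   # what R <= 2.8.0 assumes
displayed <- string_width("You", use_kerning = FALSE)  # what the viewer draws
(displayed - computed) * 12 / 1000                     # discrepancy in points at 12pt
```

Under these assumed metrics each 'You' is drawn 140 units wider than R believes, so the error grows with every kerned pair in the string.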
> For postscript(), this doesn't have much impact, as centring or right 
> justification ('hadj' in text()) is done by PostScript code and computes the 
> width from the actual font used (and so copes well with font substitution). 
> It might affect the fine layout in plotmath, but using strings which would be 
> kerned in annotations is not common.
>
> For pdf() the effect is more commonly seen, as all text is set 
> left-justified, and the computed width is used to centre/right-justify.
>
> There are several things we could do:
>
> A.  Do nothing, for backward compatibility.  After all, this has been going on 
> for years and no one has complained until last month.
>
> B.  Ignore kerning, and hence change the string width computations to match 
> the current display.  This is more attractive than it appears at first sight 
> -- as far as I know all other devices ignore kerning, and we are increasingly 
> used to seeing 'typeset' output without kerning.  It would be desirable when 
> copying graphs by e.g. dev.copy2eps from devices that do not kern.
>
> C.  Insert kerning corrections by splitting up strings, so e.g. 'You' is set 
> as (Y)-140 kc(ou): this is what TeX engines do.
>
> D.  Compute the position of each letter in the string and place them 
> individually.
>
> C and D would give visually identical output when the font used is exactly as 
> specified, and hopefully also when a substitute font is used with the same 
> glyph widths (as substituting Nimbus Sans for Helvetica, at least for some 
> versions of each), but where the substitute is a poor match, C ought to look 
> more elegant but line up less well.  D would produce much larger files than 
> C.
>
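
A minimal sketch of the string splitting behind option C: cut the string after each character that starts a kerned pair, and record the correction to apply at each cut. `kern_pairs` and `split_at_kerns` are hypothetical names, not part of R nor of Ei-ji Nakama's patch, and the single kerning value is the one from the 'You' example.

```r
kern_pairs <- c("Yo" = -140)   # assumed kerning table, 1/1000 of the font size

split_at_kerns <- function(s) {
  ch <- strsplit(s, "")[[1]]
  n <- length(ch)
  if (n < 2) return(list(chunks = s, kerns = numeric(0)))
  # Kerning for each adjacent pair; NA where the table has no entry.
  k <- unname(kern_pairs[paste0(ch[-n], ch[-1])])
  brk <- which(!is.na(k) & k != 0)   # cut after character brk[i]
  starts <- c(1, brk + 1)
  ends   <- c(brk, n)
  chunks <- vapply(seq_along(starts),
                   function(i) paste(ch[starts[i]:ends[i]], collapse = ""),
                   character(1))
  list(chunks = chunks, kerns = k[brk])
}

split_at_kerns("You You")
```

Option D corresponds to the extreme case of one chunk per character, which is why it would produce much larger files.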
> We do have the option of not changing the output when there is no kerning. 
> That would be by far the most common case except that some fonts (including 
> Helvetica but not Nimbus Sans) kern between punctuation and a space, e.g. ', 
> '.  I'm inclined to believe that most uses of ',' in R graphical output are 
> not punctuation (certainly true of R's own examples), and also that we 
> nowadays do not expect to see kerning involving spaces.
>
> Ei-ji Nakama provided an implementation of C for pdf() and D for postscript() 
> (thanks Ei-ji, and apologies that we did not have a chance to discuss the 
> principles first).  I'm inclined to suggest that we should go forwards with 
> at most two of these alternatives, and those two should be the same for 
> postscript() and pdf() -- my own inclination is to B and C.
>
> So questions:
>
> 1) Do people feel strongly that we should preserve graphical output from past 
> versions of R, even when there are known bugs?  I can see the need to 
> reproduce published figures, but normally this would also need using the same 
> version of R.
>
> 2) Is kerning worth pursuing?
>
> 3) If so, is elegant looking output more important than exact layout?
>
> 4) If we allow kerning, should it be the default (or only) option?
>
> To see that sometimes there can be a large effect, try in postscript() or 
> pdf()
>
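> xx <- 'You You You You You You You You'
> plot(0, 0, xlim = c(0, 1), ylim = c(0, 1), type = 'n')
> abline(v = 0)                    # left edge of the string
> text(0, 0.5, xx, adj = 0)        # set left-justified at x = 0
> abline(v = strwidth(xx))         # width of the whole string as computed by R
> chars <- strsplit(xx, "")[[1]]   # the individual characters
> w <- strwidth(chars)             # per-character widths, so no kerning
> abline(v = sum(w))               # sum of the unkerned widths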
>
> The leftmost of the right pair of lines is the computed width, the rightmost 
> the (normal) displayed width.
>
> Unless there are cogent reasons to bring this forward to 2.8.1, any changes 
> would be as from 2.9.0.
>
> Brian Ripley

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595