Re: [Rd] RFC: Kerning, postscript() and pdf()

Prof Brian Ripley Thu, 16 Oct 2008 02:04:07 -0700

I've now implemented B and C in R-devel, with C as the default.


On Sun, 12 Oct 2008, Prof Brian Ripley wrote:

Ei-ji Nakama has pointed out (from another Japanese user, I believe) that postscript() and pdf() have not been handling kerning correctly, and this is a request for opinions about how we should correct it.
Kerning is the adjustment of the spacing between letters from their natural width, so that for example 'Yo' is usually typeset with the o closer to the Y than 'Yl' would be. Kerning is not very well standardized, so that for example R's default Helvetica and its URW clone (Nimbus Sans) have quite different ideas of the amount of kerning corrections for 'Yo'. This matters, because not many people actually see Helvetica when viewing R's PostScript or PDF output, but rather a similar face like Nimbus Sans or Arial, or in the case of Acrobat Reader, a not very similar face. Kerning is only a feature of some proportionally spaced fonts and so not of Courier nor CJK fonts.
The current position (R <= 2.8.0) is that string widths have been computed using kerning from the Adobe Font Metric files for the nominal font, but the strings have been displayed without using kerning (at least in the viewers we are aware of, and the PostScript and PDF reference manuals mandate that behaviour, if rather obscurely). This means that in strings such as 'You', the width used in the string placement differs from that actually displayed.
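To make the mismatch concrete, here is a minimal sketch of the two width calculations. The glyph widths are illustrative assumptions in AFM-style units (1/1000 of the font size), not a copy of any metric file; the -140 correction for 'Yo' is the figure quoted later in this message. The names glyph_width, kern_pairs and string_width are hypothetical.

```r
# Illustrative AFM-style metrics (assumed values, 1/1000 of the font size).
glyph_width <- c(Y = 667, o = 556, u = 556)
kern_pairs  <- c(Yo = -140)   # correction between 'Y' and a following 'o'

# Width of a string, with or without kerning corrections.
string_width <- function(s, use_kerning) {
  ch <- strsplit(s, "")[[1]]
  w <- sum(glyph_width[ch])
  if (use_kerning && length(ch) > 1) {
    pairs <- paste0(ch[-length(ch)], ch[-1])   # adjacent letter pairs
    w <- w + sum(kern_pairs[pairs], na.rm = TRUE)
  }
  w
}

computed  <- string_width("You", use_kerning = TRUE)   # width used for placement
displayed <- string_width("You", use_kerning = FALSE)  # width as rendered unkerned
computed - displayed                                   # one kern correction
```

With these assumed metrics the computed width is one correction (140 units) narrower than the displayed string, and the gap grows with every kerned pair in the string.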
For postscript(), this doesn't have much impact, as centring or right justification ('hadj' in text()) is done by PostScript code and computes the width from the actual font used (and so copes well with font substitution). It might affect the fine layout in plotmath, but using strings which would be kerned in annotations is not common.
For pdf() the effect is more commonly seen, as all text is set left-justified, and the computed width is used to centre/right-justify.
There are several things we could do:
A. Do nothing, for back compatibility. After all, this has been going on for years and no one has complained until last month.
B. Ignore kerning, and hence change the string width computations to match the current display. This is more attractive than it appears at first sight: as far as I know all other devices ignore kerning, and we are increasingly used to seeing 'typeset' output without kerning. It would be desirable when copying graphs by e.g. dev.copy2eps from devices that do not kern.
C. Insert kerning corrections by splitting up strings, so e.g. 'You' is set as (Y)-140 kc(ou): this is what TeX engines do.
D. Compute the position of each letter in the string and place them individually.
C and D would give visually identical output when the font used is exactly as specified, and hopefully also when a substitute font is used with the same glyph widths (as substituting Nimbus Sans for Helvetica, at least for some versions of each), but where the substitute is a poor match, C ought to look more elegant but line up less well. D would produce much larger files than C.
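The difference between C and D can be sketched as follows. This is only an illustration under assumed metrics (the glyph widths are made up for the example, the -140 is the 'Yo' correction mentioned under C), and split_at_kerns and glyph_positions are hypothetical names, not functions in R or in the proposed patches.

```r
glyph_width <- c(Y = 667, o = 556, u = 556)   # assumed widths, 1/1000 font size
kern_pairs  <- c(Yo = -140)

# Kern correction after each character except the last (0 where none applies).
pair_kerns <- function(ch) {
  n <- length(ch)
  k <- unname(kern_pairs[paste0(ch[-n], ch[-1])])
  k[is.na(k)] <- 0
  k
}

# Option C: cut the string wherever a kern applies; the viewer sets each chunk.
split_at_kerns <- function(s) {
  ch <- strsplit(s, "")[[1]]
  n <- length(ch)
  if (n < 2) return(list(chunks = s, kerns = numeric(0)))
  k <- pair_kerns(ch)
  cut <- which(k != 0)                       # cut after these positions
  list(chunks = substring(s, c(1, cut + 1), c(cut, n)), kerns = k[cut])
}

# Option D: compute an x offset for every glyph from the nominal metrics.
glyph_positions <- function(s) {
  ch <- strsplit(s, "")[[1]]
  n <- length(ch)
  adv <- unname(glyph_width[ch])
  if (n > 1) adv[-n] <- adv[-n] + pair_kerns(ch)
  c(0, cumsum(adv)[-n])
}

split_at_kerns("You")    # two chunks, "Y" and "ou", with one correction
glyph_positions("You")   # one offset per letter
```

C emits one placement per chunk and lets the viewer's font advance within a chunk, which is why it tolerates a substitute font more gracefully; D emits one placement per glyph, which pins the layout to the nominal metrics and accounts for the larger files.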
We do have the option of not changing the output when there is no kerning. That would be by far the most common case except that some fonts (including Helvetica but not Nimbus Sans) kern between punctuation and a space, e.g. ','. I'm inclined to believe that most uses of ',' in R graphical output are not punctuation (certainly true of R's own examples), and also that we nowadays do not expect to see kerning involving spaces.
Ei-ji Nakama provided an implementation of C for pdf() and D for postscript() (thanks Ei-ji, and apologies that we did not have a chance to discuss the principles first). I'm inclined to suggest that we should go forwards with at most two of these alternatives, and those two should be the same for postscript() and pdf(); my own inclination is to B and C.
So questions:
1) Do people feel strongly that we should preserve graphical output from past versions of R, even when there are known bugs? I can see the need to reproduce published figures, but normally this would also need using the same version of R.
2) Is kerning worth pursuing?

3) If so, is elegant looking output more important than exact layout?

4) If we allow kerning, should it be the default (or only) option?
To see that sometimes there can be a large effect, try in postscript() or pdf():
xx <- 'You You You You You You You You'
plot(0,0,xlim=c(0,1),ylim=c(0,1),type='n')
abline(v=0)
text(0, 0.5, xx, adj=0)
abline(v=strwidth(xx))
x2 <- strsplit(xx, "")
w <- sapply(x2, strwidth)
abline(v=sum(w))
The leftmost of the right pair of lines is the computed width, the rightmost the (normal) displayed width.
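The same comparison can be made numerically without opening a viewer. The sketch below assumes an R release in which pdf() has the useKerning argument (it is not named in this message, and R <= 2.8.0 does not have it) and uses pdf(file = NULL) so that no file is written; measure is a hypothetical helper.

```r
# Measure a string on a throwaway pdf() device, with or without kerning.
# Assumes pdf() accepts 'useKerning' (not available in R <= 2.8.0).
measure <- function(s, kern) {
  pdf(file = NULL, useKerning = kern)   # NULL: nothing is written to disk
  on.exit(dev.off())
  plot.new()
  strwidth(s, units = "inches")
}

xx <- 'You You You You You You You You'
w_kern  <- measure(xx, kern = TRUE)    # width including AFM kerning corrections
w_plain <- measure(xx, kern = FALSE)   # width from glyph widths alone
w_plain - w_kern                       # positive if the default font kerns 'Yo'
```

With the default Helvetica metrics the kerned width should come out smaller, which is the gap between the two right-hand lines in the plot.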
Unless there are cogent reasons to bring this forward to 2.8.1, any changes would be as from 2.9.0.
Brian Ripley

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
