Hi,

Jeff Conrad wrote on Sat, Feb 02, 2019 at 05:46:59AM +0000:
> On Friday, February 1, 2019 3:31 PM, Ingo Schwarze wrote:

>> And the correct way to mark up a single-quoted string in low-level
>> roff(7) is \(oq...\(cq, with the rendering decided by the output
>> device.

> I think this gets to the essence of the matter. The character table
> for -Tascii should recognize that the ASCII character set doesn't have
> opening or closing single quotes, and accordingly maps both to \(aq.
> In a sense, this is a glyph diddle, but it's one that, at least in my
> experience (and I go back to the actual typewriter), has been
> universally established practice. The same cannot be said for mapping
> \(oq to \(ga, which strikes me as akin to treating O and 0 and
> l and 1 as interchangeable.
>
> I think "modernise" is a misnomer here, because I suggest that the
> existing mapping isn't archaic; rather, it's always been wrong.

While this is an interesting argument that adds a new aspect to the
discussion and opens up a new way to look at the conflict (see below),
i fear the statement above, as it stands, is incorrect.

A short and intriguing overview of the early history of ASCII 0x60
is given in:

  http://jkorpela.fi/latin1/ascii-hist.html#60

The first version of US-ASCII, ASA X3.4-1963, had character
position 0x60 "unassigned":

  http://worldpowersystems.com/ARCHIVE/codes/X3.4-1963/page5.JPG

In the second version of US-ASCII, ASA X3.4-1965, character
position 0x60 was "@":

  https://web.archive.org/web/20100116001012/http://homepages.cwi.nl/~dik/english/codes/stand.html#ascii

The third version of US-ASCII, USAS X3.4-1967, seems to be the
first having ` at 0x60 (same source as for -1965).

The latest US-ASCII standard, ANSI INCITS 4-1986 (R2007),

  http://sliderule.mraiow.com/w/images/7/73/ASCII.pdf

says on page 16:

  0x60 LEFT SINGLE QUOTATION MARK, GRAVE ACCENT

with this footnote:

  These characters should not be used in international interchange
  without determining that there is agreement between sender and
  recipient (see Section B5 in Appendix B).

which appears to go back to at least RFC 20 (yes, *twenty*), 1969:

  http://art.tools.ietf.org/html/rfc20 (page 5)

That said, nowadays, US-ASCII arguably remains most relevant because
it was chosen as a basis for Unicode. While there are cases where
Unicode defines characters as ambiguous, consider for example
U+002D HYPHEN-MINUS, U+0060 is not defined as ambiguous:

  https://www.unicode.org/Public/11.0.0/charts/CodeCharts.pdf

is very clear that U+0060 is a grave accent and *not* an opening
single quote. While that of course cannot retroactively change what
ASCII used to define from the 1960s to the 1980s, i do think an
argument can be made that there is value in discontinuing usage of
ASCII that conflicts with Unicode before we enter the third decade
of the new millennium.

[...]

> Is some history lost with the proposed changed? Sure. But is history
> the overarching consideration? I suggest that it preferably should be
> getting the best typography possible with the ASCII character set.

That goal doesn't appear to bring us much closer to a decision:
While many fonts today clearly represent U+0060 as a grave accent,
some traditional fonts continue to support the usage of ASCII 0x60
as an opening single quote, and some members of this list clearly
stated that they like such fonts. So for them, rendering \(oq as
ASCII 0x60 actually results in *better* typography than rendering
it as ASCII 0x27 APOSTROPHE-QUOTE.

However, i think that even Doug, Werner, and Ralph will have to
admit that from a typographical standpoint, such use of the ASCII
0x60 output glyph is highly non-portable nowadays, and according to
RFC 20, it already used to be non-portable (and discouraged for
international use) in 1969.

So i think it is fair to make my wording "modernise" more precise
as follows:

  "Stamp out US-specific, internationally non-portable usage of
  ASCII that is incompatible with Unicode, because nowadays, using
  ASCII in a way that is compatible with Unicode is more important
  than preserving historical -T ascii rendering practice and more
  important than rendering historical documents unchanged that
  incorrectly encode ASCII 0x60 (for example for use in m4) as \(oq
  rather than as \(ga."

Is that something we can agree on?

Yours,
  Ingo