Hi Ted,

> Mike Bianchi wrote:
> > The thing I _like_ about the *nix OSs is they don't demand I
> > upconvert just because a "better way" comes along.

Thompson and Pike knew that when designing UTF-8; it's a superset of
ASCII and ensures a zero byte only means NUL so C strings continue as
before.

> I completely agree with Mike! Of course it would be a good thing to
> *extend* groff's capabilities so that it can cope (optionally) with
> recent developments, but in my view it *must* keep its original
> capabilities, and those that have evolved since (say) the 1980s (which
> is where many of my own troff source files date back to).

Isn't it groff's evolution that's the problem here?  Bell Labs troff
took ASCII, i.e. 7-bit.  groff added ISO-8859-1 support, another ASCII
superset, that was still one byte per rune but used 96 of the
top-bit-set bytes for more runes.  UTF-8 comes along and groff can't
adopt it because it's already taken an incompatible fork.  IIRC Bell
Labs plough on with Plan 9's troff taking UTF-8.

How many of our old documents are ISO-8859-1 instead of ASCII?  Could we
wind back the clock and make UTF-8, and thus ASCII, groff's input, with
ISO-8859-1 being the runt input character set that needs options and
hoops to jump through?

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy
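
P.S.  For anyone who wants to poke at the encoding points above, here is
a rough sketch in Python rather than anything groff-specific.  The helper
names (utf8_has_stray_zero, is_valid_utf8) and the sample strings are
made up for illustration; it only uses Python's standard codecs.  It
checks that ASCII text is unchanged as UTF-8, that no non-NUL rune
produces a zero byte in UTF-8, and that a top-bit-set ISO-8859-1 byte is
not valid UTF-8 on its own, which is the incompatibility in question.

```python
# Illustrative sketch only; names and sample text are hypothetical.

ascii_text = "troff -ms"
# ASCII text is byte-for-byte identical when encoded as UTF-8.
ascii_unchanged = ascii_text.encode("ascii") == ascii_text.encode("utf-8")


def utf8_has_stray_zero(text):
    """True if any rune other than U+0000 encodes to a zero byte in UTF-8."""
    return any(0 in ch.encode("utf-8") for ch in text if ch != "\0")


# A mix of 1-, 2-, 3- and 4-byte UTF-8 sequences.
sample = "na\u00efve \u20ac \U0001d11e"
no_stray_zero = not utf8_has_stray_zero(sample)

# The same rune, e-acute, in the two encodings.
latin1_bytes = "\u00e9".encode("iso-8859-1")  # one top-bit-set byte
utf8_bytes = "\u00e9".encode("utf-8")         # two top-bit-set bytes


def is_valid_utf8(data):
    """True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


latin1_is_valid_utf8 = is_valid_utf8(latin1_bytes)
```

The last check is why a file can't be silently treated as both: the
ISO-8859-1 byte for e-acute looks like the start of a longer UTF-8
sequence that never arrives.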