On Thursday, March 28, 2019 3:01 AM, G. Branden Robinson wrote:

> At 2019-03-27T04:34:18+0000, Jeff Conrad wrote:
> > Is there a reason that tty.tmac translates \(bu to \(pc or \(md
> > regardless of the output device or whether \(bu is available?
> >
> > .ie c\[pc] \
> > . tr \[bu]\[pc]
> > .el \
> > . if c\[md] \
> > . tr \[bu]\[md]
>
> Are you looking at an old implementation?  There's some important
> context missing here:

Yep--I'm using 1.22.3.  Running Windows, I've had to diddle a few
things, so the upgrade isn't as simple as it could be.

> $ nl /usr/share/groff/1.22.4/tmac/tty.tmac | sed -n '14,21p'
>     14  .if !'\*[.T]'utf8' \{\
>     15  . ie c\[pc] \
>     16  . tr \[bu]\[pc]
>     17  . el \
>     18  . if c\[md] \
>     19  . tr \[bu]\[md]
>     20  .\}
>     21  .
>
> It sure seems like you might be re-reporting a problem Carsten Kunze
> raised in June 2015, and which prompted Werner to wrap the conditional
> you mention in an "if device is not UTF-8" block:
> https://lists.gnu.org/archive/html/groff/2015-06/msg00040.html

Again, yep--I used the wrong search query ...

> Really we shouldn't be conditional on UTF-8 per se, but on the existence
> of the bullet glyph in the font for the tty device.

Completely agree.

> However, the tty device ignores fonts ...
> these devices can report their character repertoire up to an
> application.  VGA-style console devices, framebuffer consoles, and GUI
> terminal emulators can even change these on the fly.  (Who else
> remembers live-hacking the display font in MS-DOS?)

We're obviously at the mercy of the chosen font (on Windows, I use
Lucida Console as the best of very limited options).  But the device at
least gives us a reasonable idea of what's possible.

> So Werner's fix worked because there were (and are) no nroff/tty devices
> in the groff tree that supported the bullet character _except_ -Tutf8.
>
> My recommendations are:
> 1) Upgrade to groff 1.22.4; and
> 2) Change the conditional on line 14 of tty.tmac from:
>
>     14  .if !'\*[.T]'utf8' \{\
>
>    to:
>
>     14  .if !c\[bu] \{\
>
> ...and tell us if that fixes your problem.

Making this change (which I've already done) indeed fixes things.

> Personally, I advocate incorporating cp1252 into groff.  It's only an
> 8-bit character set, should therefore be a low maintenance burden, and
> really should make life a bit more bearable for groff's Windows users.
> And that's good PR for groff, GNU, copyleft, and Free Software.

It's yours for the asking; it's really just latin1 with the additional
characters that Microsoft added to the C1 area.  I went a bit further
and added spelled-out representations of missing Greek characters (I
hate missing symbols; in the old, old days, I guess one would print the
document and write in the missing symbols.  Yeah, right ...).  But if
these additions aren't for everyone, they're easily deleted.

> > Even for -Tlatin1, I'd prefer an asterisk or even the age-old
> > overstruck '+' and 'o'.  Isn't the general rule for nroff to make the
> > best possible visual approximation when the true character isn't
> > available?
>
> As noted above, knowing what will actually show up on the output
> device is, in principle, impossible for nroff/tty output devices.

The user needs to pick the most appropriate font; there don't seem to be
all that many choices that we need to worry about.

> However, we can generally assume that users of 8-bit encodings will
> have comprehensive fonts available by default--they'd have to go out
> of their way to avoid them.

But 8-bit encodings (e.g., ISO 8859) have their limitations; in
particular, they're missing most of the common punctuation characters
used in typesetting.  The MS extensions addressed most of this.

> Life is harder in UTF-8 world.

Yep.  Especially on Windows.  I had to hack the devutf8 font files to
use U+002D rather than U+2010 for a hyphen, because Lucida Console
doesn't include the latter.  Ya do what ya gotta do ...

But Microsoft are working on it ...

https://devblogs.microsoft.com/commandline/windows-command-line-unicode-and-utf-8-output-text-buffer/

Skip to "Are we there yet?" near the end if you're less than fascinated
with the topic.

> To get that asterisk:
>
> In your documents, or your .troffrc, could you not do this?
>
> .fchar \[bu] *

Yes.  I've already done something similar.  But this won't help with the
few files I generate for general distribution.
For example, for GNU units, we generate a man page from texinfo source
with a perl script, and obviously can't assume a customized .troffrc--so
we include a few hacks to override some groff settings (e.g.,
".tr \(oq'").  We actually don't even assume groff, so we try to cover
all the bases; this probably is overkill nowadays.

> As a minor point, I do think the existing fallback should be reversed in
> order:
>
> From:
>
> .fchar \[bu] \z+o
>
> To:
>
> .fchar \[bu] \zo+

Interesting how we differ on this.  I don't like either alternative, but
find the 'o' more instantly recognizable--it's sorta kinda a circle.  As
I recall, the AT&T version 2 nterm files that I had in the late 1980s
had it as you suggest, and I reversed it.  I guess it's a matter of
personal preference.  The asterisk avoids the problem.

> The \z+o status quo seems to follow a pattern that makes sense for
> modified letterforms, i.e., \z'a; on a 7-bit ASCII, non-overstriking
> device, you want the "a" to "win", because it carries the more important
> semantic information.

In general, I completely agree.

> That reasoning does not hold for bullet substitutes, which simply need
> to stand out graphically (your argument for not using a middle dot or
> centered period, which may be as small as one pixel on some devices),
> and not be semantically confusable with text.

In this circumstance, I don't know whether we can really separate
graphics and semantics.

> As "o" is actually a word (even in English, though much more prominently
> in Spanish), I find the present arrangement unfortunate.

I think it's largely a matter of context.  As the tag for a list, I
think confusion would be unlikely.  And again, an asterisk--perhaps ugly
but arguably the most common ASCII approximation of a bullet--would seem
to avoid the problem.
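
For a page meant for general distribution, one way to get the asterisk
without relying on a customized .troffrc might look like the sketch
below.  It is an illustration only, not what the GNU units man page
actually does: it assumes that every nroff device should show "*" for a
bullet, and it sticks to the classic "n" condition and the two-character
\(bu escape so that it does not depend on groff.

```roff
.\" Sketch only (assumption: every nroff device should show "*" for \(bu).
.\" Uses the classic "n" condition and \(bu escape; groff is not required.
.if n .tr \(bu*
```

Since the request is read after the startup macro files, it should
replace any earlier translation of \(bu, such as the one in tty.tmac.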

In my senior year of high school, I had an English teacher--a PhD--who
tried to drill into us that the "best" English is that which provides
the maximum communication (and it generally avoids pompous polysyllabic
pronouncements).  I suggest something similar for the "best" groff.  Of
course, it's not always easy to reach consensus on the details.

Regards,

Jeff