Hi Branden,

On Wed Nov 13, 2024 at 7:25 PM CET, G. Branden Robinson wrote:
> [...]
> > i.e. translation should happen on output, not on input,
>
> I'm not sure I agree with that, given the above.  When I see `tr` used,
> it is typically to make input more convenient.
I never said it's not used like that. I just meant to say that groff(7)
suggests the translation happens at the moment the character is
formatted for output rather than at the moment it is read in:

  .tr abcd...
    Translate ordinary or special characters a to b, c to d, and so on
    PRIOR TO OUTPUT. [emphasis added]

which is why I wondered about the things you quote below.

> [...]
> > meaning that using .hla might not be sufficient to switch between cs
> > and fr, because that doesn't switch the encoding used.
>
> I'll have to think about this.  It might not matter in the
> wide-character-type/UTF-8-reading GNU troff future.
>
> While I don't have an ETA for that, I don't want to complicate the
> formatter itself with any features to make eight-bit encodings more
> convenient to use.  That feels like throwing good money after bad.
> UTF-8 is the future.  Heck, it's the present, most places.

I think if anything, this thread demonstrates the complexity that arises
from using multiple character encodings. I was just trying to make it
work that way because that's what we have now, but it would obviously be
much better if one could use UTF-8 directly in the hyphenation files (or
at least the \[u...] characters) without having to jump through all
these hoops.

> [...]
> > groff(7) does mention it, but it's among the last things mentioned in
> > the Hyphenation section. The texinfo manual doesn't mention it at all
> > in its section 5.1.3 about Hyphenation where I would expect it. (At
> > least the online version -- I haven't found any git source for it,
> > just tarballs.)
>
> You can review up-to-date documentation here:
>
> https://www.dropbox.com/sh/17ftu3z31couf07/AAC_9kq0ZA-Ra2ZhmZFWlLuva?dl=0
>
> The Git source for the bleeding edge of our documentation is at:
>
> https://git.savannah.gnu.org/cgit/groff.git/tree/doc/
> https://git.savannah.gnu.org/cgit/groff.git/tree/man/

Thanks; I overlooked the texinfo source in the doc/ directory.
I don't notice any changes to the hyphenation-related sections that
would make it obvious one should load the appropriate localization files
rather than do it 'by hand' (i.e. by using .hpf etc.), though.

(By the way, that Dropbox PDF viewer is borderline unusable and
downloading the PDF requires logging in. If you ever need something less
bloated, I recommend <https://paste.c-net.org>.)

> > [...]
> > Of course, this wouldn't be necessary if .hy worked like .ad,
>
> That's actually a bad example, but a very popular misconception.  You
> probably mean "if .hy worked like .ps".  Or .ft, .ev, .in, .ll,
> .ls, .lt, .po, or .vs;, or groff's .fam, .fcolor, .gcolor, or .pvs.
>
> Without an argument, neither .hy nor .ad restore the "previous"
> hyphenation mode or adjustment setting, respectively.

That's not a bad example, you just misunderstood. I know .ad without
an argument doesn't restore the previous adjustment mode; it caused me
some headaches in the past. I eventually realized that .ad is not meant
to switch back and forth between adjustment modes, but to restore
adjustment after it was disabled with .na.

What I was saying above is that if .hy worked in this way too, i.e. if
.hy without arguments restored hyphenation after .nh was called, the
macro I proposed wouldn't be necessary.

> [...]
> I think these are horrible warts in the *roff language that an
> iconoclast should have smashed years ago.  But they work fine for the
> most common cases (temporary disablement with `nh` and `na`,
> respectively) [...]

I would disagree that it works fine for temporary disablement with .nh;
see above.

> > but (unless I am mistaken again :) it doesn't and cannot due to
> > desired compatibility with AT&T troff.
>
> You might be interested in a feature in the forthcoming groff 1.24.0:
>
> NEWS:
> *  A new request, `hydefault`, and read-only register, `.hydefault`,
>    manage the default automatic hyphenation mode of an environment.
>    This resolves a long-standing problem of *roff formatting.
>
>    When processing input like this,
>      .nh
>      and we temporarily shut off automatic hyphenation,
>      .hy
>    the foregoing request would not do exactly what we expect.
>
>    AT&T and other troffs would set the hyphenation mode to 1 instead of
>    the previous value; for GNU troff this was not an appropriate value
>    for the English hyphenation patterns.  (For example, "alibi" would
>    break as "ali-bi" instead of "al-ibi" after this argumentless `hy`
>    invocation.)  With updates to groff's localization files, the
>    foregoing input now works as desired.

Sounds like what .hy should have been doing from the beginning :)

> I have plans to fix the argumentless `ad` request, but just today I
> decided to kick that out past 1.24.
>
> https://savannah.gnu.org/bugs/?65954

I don't feel like this fixes anything, honestly. Before this, I could do:

  .ad r
  Lorem ipsum dolor sit amet...
  .br
  .na
  Lorem ipsum.
  .br
  .ad
  Lorem ipsum dolor sit amet...

and couldn't do:

  .ad r
  Lorem ipsum dolor sit amet...
  .br
  .ad c
  Lorem ipsum.
  .br
  .ad
  Lorem ipsum dolor sit amet...

Now I will not be able to do either. I suggest this instead:

  .ad
    Set adjustment mode to \n[.J] if set, b otherwise.
  .ad 0
    Disable adjustment. Update \n[.j] and \n[.J] (previous value
    of \n[.j]).
  .ad MODE
    Set adjustment mode to MODE (l,c,r,b,n). Update \n[.j] and \n[.J].
  .na
    As .ad 0.

This should make both scenarios work as expected without breaking any
other ways in which people currently use it. (At least I hope so.)

~ onf
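
To make the suggested .ad/.na semantics concrete, here is a minimal
sketch in Python. It is only a toy model of the four rules listed above,
not groff code and not how the formatter is implemented. The names
AdjustState and run are hypothetical. Modes are kept as plain letters
(the real \n[.j] is a number), the initial mode is assumed to be b, and
the sketch assumes that an argumentless .ad leaves \n[.J] untouched,
which the suggestion does not spell out.

```python
class AdjustState:
    """Toy model of the suggested .ad/.na behaviour; not groff's code."""

    def __init__(self):
        self.j = "b"   # stands in for \n[.j]; assumed initial mode: b
        self.J = None  # stands in for the suggested \n[.J]; None = not set

    def ad(self, mode=None):
        if mode is None:
            # ".ad": use \n[.J] if set, b otherwise.
            # Assumption: \n[.J] itself is left unchanged here.
            self.j = self.J if self.J is not None else "b"
            return
        if mode not in ("0", "l", "c", "r", "b", "n"):
            raise ValueError(f"unknown adjustment mode: {mode!r}")
        # ".ad 0" / ".ad MODE": remember the old mode, then switch.
        self.J, self.j = self.j, mode

    def na(self):
        # ".na": as ".ad 0".
        self.ad("0")


def run(requests):
    """Apply (request, argument) pairs in order; return the final mode."""
    state = AdjustState()
    for name, arg in requests:
        if name == "na":
            state.na()
        else:
            state.ad(arg)
    return state.j
```

Under those assumptions, both sequences from the examples above end up
back in mode r: run([("ad", "r"), ("na", None), ("ad", None)]) and
run([("ad", "r"), ("ad", "c"), ("ad", None)]) each return "r".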