Hi Branden,

G. Branden Robinson wrote on Sun, Nov 01, 2020 at 12:27:36AM +1100:
> At 2020-10-31T13:55:08+0100, Ingo Schwarze wrote:
>> Please take man.local out of the equation. Such a thing simply
>> doesn't exist on FreeBSD, OpenBSD, NetBSD, or Dragonfly, and it
>> won't be created on OpenBSD.
>
> groff ships it. What do you do with it?

Install it where the default groff install puts it, which happens
to be /usr/local/share/groff/site-tmac/, on a filesystem where even
the system administrator is not supposed to edit files. A file that
is empty and not supposed to be edited is equivalent to not existing
for all practical purposes.

>>> 3. Restore the remappings to tmac/an-old.tmac, but make them
>>>    conditional on a register that defaults off.
>>> 4. Restore the remappings to tmac/an-old.tmac, but make them
>>>    conditional on a register that defaults on.
>
>> OS-local changes solve little in the first place.
>
> What's OS-local about
>
>   .if r SOMEREG \
>   .  if \n[SOMEREG] \{\
>   .    char ... ...
>   .    char ... ...
>   .  \}
>
> ?

Surely you will not expect users to set or unset SOMEREG, and not
even system administrators. Apart from that not being a reasonable
expectation, it would also make no sense, because which value of
SOMEREG is needed depends on which manual page is being rendered,
not on the preferences of users or system administrators.

So SOMEREG will be whatever the distribution or packaging makes it.
That means that at the end of the day, options 3 and 4 are exactly
the same as options 1 and 2; there is no practical difference
whatsoever.

Once that situation exists, some authors will write manual pages
for one value of SOMEREG, some for another, and we end up with large
numbers of manual pages being non-portable.

>> Manual pages are supposed to be as operating-system independent as
>> possible, such that pages from one system can also be read on another,
>> and such that authors of portable software know what to do. You seem
>> to be advocating ecosystem fragmentation, making manual pages
>> non-portable, which astonishes me.
>
> That argument is precisely equally true of supporting a configurable LL
> register. The world has shattered into N pieces, where N is every
> terminal width in use. Do you yearn for the days of 110 baud Teletypes
> with nroff grinding out the reference pages at 65n width?

Most real-world manual pages look best at the default of LL=78n,
but rendering doesn't usually become incorrect for values like
LL=60n or LL=132n. So no, it's not the same. LL is merely
unimportant. By contrast, using a value of SOMEREG that mismatches
the page being formatted causes misrendering.

>> I have no preference among options 1 to 4.
>> They are all equally bad.
>
> This is an absurd claim. I don't think you have understood what
> options 3 and 4 entail.

Not sure what you mean; what is wrong with my above reasoning?
Isn't the above what will happen as a consequence of options 3 or 4?

>> By the way, the problem is not only changing thousands of existing
>> manual pages - which can't be automated; every single instance of '
>> and ` would have to be checked manually. Here is a list of *a few
>> examples* of affected manual pages from OpenBSD base alone. The
>> list is definitely far from comprehensive, these are just some
>> examples:
>>
>> section 1: bc, csplit, expr, find, flex, getopt, grep, ksh, ldap,
>> less, mandoc, more, paste, pax, shar, ssh, su, tar, tmux, vi,
>> xargs
>>
>> section 3: BIO_f_ssl, EVP_PKEY_keygen, RMD160Init, SHA256Init,
>> SSL_CTX_set_alpn_select_cb, SSL_CTX_set_default_passwd_cb,
>> cgetent, fgetln, fgets, getopt, getopt_long, malloc, stpcpy,
>> strchr, strcspn, strncat, strncpy, strrchr, strsep, strtol,
>> strtoul, va_start, wcslcpy, wcsrchr, wprintf
>>
>> section 5: ifstated.conf, nsd.conf, pf.conf, pf.os, relayd.conf
>>
>> section 7: ascii, ports, roff
>
> You haven't shared your search method,

Oh, that's simple: I grepped for ' and ` and quickly looked for
candidates that would obviously come out wrong, without wasting
much time on it.
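For illustration only, a search of that kind can be sketched in a few
lines of Python. This is not the command that was actually run. The
function name count_quote_lines is hypothetical, and the sketch assumes
that the pages are installed uncompressed and that roff comment lines
can be ignored.

```python
import os

def count_quote_lines(root, chars=("`", "'")):
    """Return {char: (file_count, line_count)} for all files below root."""
    totals = {c: [0, 0] for c in chars}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # errors="replace" keeps legacy encodings from aborting the scan
                with open(path, encoding="utf-8", errors="replace") as fh:
                    lines = [ln.rstrip("\n") for ln in fh]
            except OSError:
                continue  # unreadable file: skip it
            # Assumption: lines starting with .\" or '\" are roff comments
            # and are not worth counting.
            body = [ln for ln in lines
                    if not ln.startswith(('.\\"', "'\\\""))]
            for c in chars:
                hits = sum(1 for ln in body if c in ln)
                if hits:
                    totals[c][0] += 1     # one more file containing c
                    totals[c][1] += hits  # lines in that file containing c
    return {c: tuple(v) for c, v in totals.items()}

if __name__ == "__main__":
    import sys
    for char, (nfiles, nlines) in count_quote_lines(sys.argv[1]).items():
        print(f"{char!r}: {nfiles} files, {nlines} lines")
```

Run against a manual page directory given as the first argument, it
prints one line per character with the number of files and lines that
contain it. It only counts; deciding which hits would render wrongly
still means looking at them.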

> so I can't infer very much from
> this. Backticks and apostrophes used as quote characters tend to abut
> whitespace; backticks are unknown in prose,

Indeed, changing backticks to \(ga could probably be achieved by
grepping for them and checking all instances. Still, it would be
a bit tedious: they occur in about 2850 base system manual pages
installed on OpenBSD.

> and apostrophes are invariably within word boundaries in English
> except in dialect registers that are unheard of in man pages.
>
>   Don' go messin' 'roun' wit' the -r and -f flag on rm if'n ya
>   knows what's good fer ya.
>
> This sort of thing is not seen.
>
> I don't doubt that there are a lot of wrongly-encoded "quotes" in
> OpenBSD man pages, however.

Apostrophes are harder to filter than backticks because they appear
in many different ways. Even in code samples, they tend to abut
whitespace. I see them in about 3250 manual page files, on about
62,000 lines, i.e. roughly 20 lines on average per manual page, in
OpenBSD base. Which of these need to be changed? Some filtering
is likely possible, but many will need to be checked by hand.

How long did it take you to fix groff's 60 manual pages?
Care to multiply that by 50? And then by 4 or 5 for the number
of BSD operating systems? And then you haven't even started with
portable software, or GNU/Linux. And developers will come along
and commit new manual page content all the time.

>> The problem is that manual pages are written by software developers,
>> not by typesetters, who are used to typing programming languages
>> and who are used to the fact, from the past, that these five
>> characters do not need escaping.
>
> They did in 2008, and every year before that, for people using -Tutf8.

I know that UTF-8 is significantly older as an invention, but the
reason the fix was only committed around 2008 is likely that around
that time, people started caring more about UTF-8 output from
manual pages.
The change made an output mode of increasing popularity better for
the input that was usual in manual pages. It doesn't mean that
programmers were used to escaping these characters before that, as
you can see from how many pages still don't escape them. Quite to
the contrary, if escaping had been common before that, the change
would not have made much sense.

I still consider it very likely that most programmers are used to
not escaping these characters, today just like between 1990 and 2008,
no matter whether they wrote their first manual page before or after
2008, certainly with the exception of some who had previous exposure
to typography or ventured to study documentation with unusual
diligence.

Yours,
  Ingo
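As a rough illustration of the apostrophe filtering question, here is
a minimal sketch in Python. The helper triage_apostrophes is
hypothetical, and the rule it applies is only an assumption: an
apostrophe with an ASCII letter on both sides is treated as ordinary
prose, and everything else is left for manual review.

```python
import re

# Assumption: an apostrophe flanked by letters (don't, it's) is prose.
INTRA_WORD = re.compile(r"(?<=[A-Za-z])'(?=[A-Za-z])")

def triage_apostrophes(line):
    """Return (prose_count, review_count) for one line of roff source."""
    total = line.count("'")
    prose = len(INTRA_WORD.findall(line))
    # Whatever is not clearly inside a word still needs a human look.
    return prose, total - prose

if __name__ == "__main__":
    for sample in ("don't use `foo' here", "c = 'x';", "it's fine"):
        print(sample, triage_apostrophes(sample))
```

A filter like this can shrink the pile but not remove it. It would
wrongly wave through an apostrophe inside a string literal such as
"a'b", and it sends plural possessives like "users' files" to review,
so a large share of the lines would likely still need checking by hand.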