Hi folks,

This was discussed three years ago and, having heard no objection at the
time, I'm now working on it.
At 2023-03-26T01:15:27-0500, G. Branden Robinson wrote:
[...]
> My net takeaway from this is that it is indeed better to keep
> hyphenation overrides within individual man pages. But maybe the only
> way to know how tedious this really is is to see how much a large,
> practical corpus of man pages, like the Linux man-pages, requires it.
[...]
> One of the themes of my suggested revisions to GNU troff has been to
> provide ways to unwind or reset things that historically haven't been
> available. One of those is environment removal (Savannah #60954).
>
> Another that has occurred to me is hyphenation override removal.
>
> Today, invoking the `hw` request without arguments does nothing. We
> could change it to clear any existing hyphenation overrides.
>
> Or, perhaps better, we could add an 'hwrm' or 'rhw' request; if given
> arguments, it reads each word (ignoring hyphens), matches it against
> the existing list of overrides, and removes the word if found. If
> given no arguments, it removes all overrides. Then, an.tmac (and
> doc.tmac) could call it when hitting `TH` (and `Dd`) macros, tidying
> up the state of the formatter for the next document.
Nota bene: the only hyphenation exception words removable with `rhw`
will be those added with `hw`. Those populated by language-specific
hyphenation pattern exception files like "tmac/hyphenex.{cs,en,pl}" will
be immune to such removal.
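
To make those semantics concrete, here is a minimal standalone model of them. It is only a sketch and not groff code: `exception_entry`, `exception_table`, `add_exception`, and `remove_exceptions` are hypothetical names, and the per-entry `bool` is an assumed replacement for the way the formatter actually distinguishes user-defined words.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the formatter's exception dictionary; this
// is not groff's actual data structure.
struct exception_entry {
  std::string hyphenated; // the word with its hyphenation points
  bool user_defined;      // true if added by `hw`, false if from a file
};

// Keyed by the word with its hyphens stripped.
using exception_table = std::unordered_map<std::string, exception_entry>;

// Strip hyphens so that "foo-bar" and "foobar" name the same entry.
std::string strip_hyphens(const std::string &word)
{
  std::string key;
  for (char c : word)
    if (c != '-')
      key += c;
  return key;
}

// Model of adding an exception word, either via `hw` (user_defined is
// true) or from a language-specific exception file (false).
void add_exception(exception_table &table, const std::string &word,
                   bool user_defined)
{
  table[strip_hyphens(word)] = exception_entry{word, user_defined};
}

// Model of `rhw`: with no words, remove every user-defined entry;
// otherwise remove each named word, but only if it is user-defined.
void remove_exceptions(exception_table &table,
                       const std::vector<std::string> &words)
{
  if (words.empty()) {
    for (auto it = table.begin(); it != table.end(); ) {
      if (it->second.user_defined)
        it = table.erase(it); // erase() returns the next valid iterator
      else
        ++it;
    }
    return;
  }
  for (const std::string &word : words) {
    auto it = table.find(strip_hyphens(word));
    if (it != table.end() && it->second.user_defined)
      table.erase(it);
  }
}
```

In this model, `remove_exceptions(table, {})` corresponds to `rhw` with no arguments, and entries added with `user_defined` set to false survive either form of the call.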
I feel more strongly now that `rhw` is the best name for this request;
it's consistent with `rchar`, `rfschar`, and resembles `rr` and `rm`.
Thoughts? Objections?
Implementation sketch:
diff --git a/src/roff/troff/dictionary.cpp b/src/roff/troff/dictionary.cpp
index 4aebc1f92..34691aa74 100644
--- a/src/roff/troff/dictionary.cpp
+++ b/src/roff/troff/dictionary.cpp
@@ -134,7 +134,8 @@ bool dictionary_iterator::get(symbol *sp, void **vp)
for (; i < dict->size; i++)
if (dict->table[i].v) {
*sp = dict->table[i].s;
- *vp = dict->table[i].v;
+ if (vp != 0 /* nullptr */)
+ *vp = dict->table[i].v;
i++;
return true;
}
diff --git a/src/roff/troff/env.cpp b/src/roff/troff/env.cpp
index 1c9ef9b02..17629cb9a 100644
--- a/src/roff/troff/env.cpp
+++ b/src/roff/troff/env.cpp
@@ -3953,6 +3953,34 @@ static void add_hyphenation_exception_words_request() // .hw
skip_line();
}
+static void remove_hyphenation_exception_words_request() // .rhw
+{
+  if (0 /* nullptr */ == current_language) {
+    error("cannot remove hyphenation exception words when no"
+          " hyphenation language is selected");
+    skip_line();
+    return;
+  }
+  dictionary_iterator iter(current_language->exceptions);
+  symbol entry;
+  if (!has_arg()) {
+    debug("GBR: remove_hyphenation_exception_words_request(): nuking 'em all");
+    while (iter.get(&entry, 0 /* nullptr */)) {
+      assert(!entry.is_null());
+      // The exception word symbol's contents contain a space if it's
+      // _not_ user-defined. Kind of kludgy, but possibly not worth
+      // fixing without also migrating to an STL unordered_map or
+      // similar, and using a `struct` with a string and a `bool` in it
+      // as the values.
+      if (!strchr(entry.contents(), ' ')) {
+        debug("GBR: nuking '%1'", entry.contents());
+        current_language->exceptions.remove(entry.contents());
+      }
+    }
+  }
+  skip_line();
+}
+
static void print_hyphenation_exceptions_request() // .phw
{
if (0 /* nullptr */ == current_language) {
@@ -4695,6 +4723,7 @@ void init_hyphenation_pattern_requests()
init_request("hpfa", append_hyphenation_patterns_from_file_request);
init_request("hw", add_hyphenation_exception_words_request);
init_request("phw", print_hyphenation_exceptions_request);
+ init_request("rhw", remove_hyphenation_exception_words_request);
}
// Local Variables:
Regards,
Branden
