Hi folks,

This was discussed three years ago and, hearing no objection at the
time, I'm working on it.

At 2023-03-26T01:15:27-0500, G. Branden Robinson wrote:
[...]
> My net takeaway from this is that it is indeed better to keep
> hyphenation overrides within individual man pages.  But maybe the only
> way to know how tedious this really is is to see how much a large,
> practical corpus of man pages, like the Linux man-pages, requires it.
[...]
> One of the themes of my suggested revisions to GNU troff has been to
> provide ways to unwind or reset things that historically haven't been
> available.  One of those is environment removal (Savannah #60954).
> 
> Another that has occurred to me is hyphenation override removal.
> 
> Today, invoking the `hw` request without arguments does nothing.  We
> could change it to clear any existing hyphenation overrides.
> 
> Or, perhaps better, we could add an 'hwrm' or 'rhw' request; if given
> arguments, it reads each word (ignoring hyphens), matches it against
> the existing list of overrides, and removes the word if found.  If
> given no arguments, it removes all overrides.  Then, an.tmac (and
> doc.tmac) could call it when hitting `TH` (and `Dd`) macros, tidying
> up the state of the formatter for the next document.

Nota bene: the only hyphenation exception words removable with `rhw`
will be those added with `hw`.  Those populated by language-specific
hyphenation pattern exception files like "tmac/hyphenex.{cs,en,pl}" will
be immune to such removal.
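
To make the intended semantics concrete, here is a minimal standalone
sketch; it is not groff code.  It assumes the alternative representation
mentioned in the patch comment below (a map whose values carry a string
and a `bool`) rather than the space-in-the-symbol convention the patch
actually tests.  The names `exception_entry`, `exception_table`, `add`,
`remove`, and `contains` are all hypothetical.

```cpp
// Hypothetical model of `hw`/`rhw` semantics; none of these names
// exist in groff.
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

struct exception_entry {
  std::string hyphenated; // the word with its hyphenation points
  bool user_defined;      // true if added via `hw`, false if from a file
};

class exception_table {
  std::unordered_map<std::string, exception_entry> table;

  // Words are matched ignoring hyphens, as the proposal describes.
  static std::string strip_hyphens(const std::string &word) {
    std::string key;
    for (char c : word)
      if (c != '-')
        key += c;
    return key;
  }

public:
  void add(const std::string &word, bool user_defined) {
    assert(!word.empty());
    table[strip_hyphens(word)] = exception_entry{word, user_defined};
  }

  // With no words, remove every user-defined entry.  Otherwise remove
  // only the named words, and only if they are user-defined.
  void remove(const std::vector<std::string> &words = {}) {
    if (words.empty()) {
      for (auto it = table.begin(); it != table.end();) {
        if (it->second.user_defined)
          it = table.erase(it); // erase() returns the next valid iterator
        else
          ++it;
      }
      return;
    }
    for (const auto &w : words) {
      auto it = table.find(strip_hyphens(w));
      if (it != table.end() && it->second.user_defined)
        table.erase(it);
    }
  }

  bool contains(const std::string &word) const {
    return table.count(strip_hyphens(word)) != 0;
  }
};
```

In this model, entries added with `user_defined` set to false stand in
for words loaded from a hyphenation exception file; `remove()` never
touches them, with or without arguments.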

I feel more strongly now that `rhw` is the best name for this request;
it's consistent with `rchar`, `rfschar`, and resembles `rr` and `rm`.

Thoughts?  Objections?

Implementation sketch (so far it handles only the no-argument case,
removing all words added with `hw`):

diff --git a/src/roff/troff/dictionary.cpp b/src/roff/troff/dictionary.cpp
index 4aebc1f92..34691aa74 100644
--- a/src/roff/troff/dictionary.cpp
+++ b/src/roff/troff/dictionary.cpp
@@ -134,7 +134,8 @@ bool dictionary_iterator::get(symbol *sp, void **vp)
   for (; i < dict->size; i++)
     if (dict->table[i].v) {
       *sp = dict->table[i].s;
-      *vp = dict->table[i].v;
+      if (vp != 0 /* nullptr */)
+       *vp = dict->table[i].v;
       i++;
       return true;
     }
diff --git a/src/roff/troff/env.cpp b/src/roff/troff/env.cpp
index 1c9ef9b02..17629cb9a 100644
--- a/src/roff/troff/env.cpp
+++ b/src/roff/troff/env.cpp
@@ -3953,6 +3953,34 @@ static void add_hyphenation_exception_words_request() // .hw
   skip_line();
 }
 
+static void remove_hyphenation_exception_words_request() // .rhw
+{
+  if (0 /* nullptr */ == current_language) {
+    error("cannot remove hyphenation exception words when no"
+         " hyphenation language is selected");
+    skip_line();
+    return;
+  }
+  dictionary_iterator iter(current_language->exceptions);
+  symbol entry;
+  if (!has_arg()) {
+    debug("GBR: remove_hyphenation_exception_words_request(): nuking 'em all");
+    while (iter.get(&entry, 0 /* nullptr */)) {
+      assert(!entry.is_null());
+      // The exception word symbol's contents contains a space if it's
+      // _not_ user-defined.  Kind of kludgy, but possibly not worth
+      // fixing without also migrating to an STL unordered_map or
+      // similar, and using a `struct` with a string and a `bool` in it
+      // as the values.
+      if (!strchr(entry.contents(), ' ')) {
+       debug("GBR: nuking '%1'", entry.contents());
+       current_language->exceptions.remove(entry.contents());
+      }
+    }
+  }
+  skip_line();
+}
+
 static void print_hyphenation_exceptions_request() // .phw
 {
   if (0 /* nullptr */ == current_language) {
@@ -4695,6 +4723,7 @@ void init_hyphenation_pattern_requests()
   init_request("hpfa", append_hyphenation_patterns_from_file_request);
   init_request("hw", add_hyphenation_exception_words_request);
   init_request("phw", print_hyphenation_exceptions_request);
+  init_request("rhw", remove_hyphenation_exception_words_request);
 }
 
 // Local Variables:

Regards,
Branden
