Hi,

On 2026-04-10 at 07:52-05:00, G. Branden Robinson wrote:
> At 2026-04-07T17:58:00+09:00, Nguyễn Gia Phong wrote:
> > Hi, I wonder if you've seen this patch.  Somehow <[email protected]>
> > is not forwarding replies to me and I have not been able to access
> > <https://lists.gnu.org> for a while now.
>
> The GNU list archive site is under persistent DDoS attack by LLM intake
> crawlers.
>
> See, for example:
>
> https://lists.gnu.org/archive/html/savannah-hackers-public/2026-04/msg00000.html
>
> and follow-ups.

Thanks!

On 2026-04-10 at 07:52-05:00, G. Branden Robinson wrote:
> At 2026-03-26T16:32:15+0900, Nguyễn Gia Phong wrote:
>> At 2026-03-26T00:19:48-0500, G. Branden Robinson wrote:
>> > At 2026-03-26T13:55:31+0900, Nguyễn Gia Phong wrote:
>> > > On the other hand, \[uNNNN] does not work for MathML output.
>> > 
>> > Wondering generally to the development and user community:
>> > 
>> > ...maybe it should?
>> 
>> I thought it would be safe to make it work,
>> as the current behavior is just outputting an error:
>> 
>> <merror>unknown eqn/troff special char uNNNN</merror>
>
> ...but it works when the implementation supports the relevant XML
> character entities, right?

No, it's groff (eqn) itself that currently outputs that
for any special character not found in the entity_table.
A patch to make any Unicode code point work would look
something like the following (whitespace-insensitive diff
for concision):

   else if (output_format == mathml) {
+    const char *unicode_code_point = valid_unicode_code_sequence(s);
+    if (unicode_code_point != NULL) {
+      printf("<mo>&#x%s;</mo>", unicode_code_point);
+    } else {
       const char *entity = special_to_entity(s);
       if (entity != NULL)
         printf("<mo>%s</mo>", entity);
       else
         printf("<merror>unknown eqn/troff special char %s</merror>", s);
+    }
   }

Now, if eqn instead always output (numeric) character references
instead of (named) entity references, we get the v2 patch mentioned earlier:

   else if (output_format == mathml) {
-    const char *entity = special_to_entity(s);
-    if (entity != NULL)
-      printf("<mo>%s</mo>", entity);
+    const char *unicode_code_point = valid_unicode_code_sequence(s);
+    if (unicode_code_point == NULL)
+      unicode_code_point = glyph_name_to_unicode(s);
+    if (unicode_code_point != NULL)
+      printf("<mo>&#x%s;</mo>", unicode_code_point);
     else
       printf("<merror>unknown eqn/troff special char %s</merror>", s);
   }
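
To make the intended behavior concrete, here is a rough standalone
sketch of the name-to-markup step.  It is not groff code:
single_code_point() and mathml_operator() are hypothetical names, and
single_code_point() is only a simplified stand-in for
valid_unicode_code_sequence().  I assume uppercase hexadecimal digits
and a single code point, so a composite name such as u0041_0301 is
rejected here.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical, simplified stand-in for a "uNNNN" name check (NOT groff's
// valid_unicode_code_sequence()).  Returns the hex digits of a single
// code point, or an empty string if the name is not acceptable.
static std::string single_code_point(const char *s)
{
  if (s == nullptr || s[0] != 'u')
    return "";
  const char *digits = s + 1;
  std::size_t n = std::strlen(digits);
  if (n < 4 || n > 6)
    return "";
  for (std::size_t i = 0; i < n; i++) {
    unsigned char c = static_cast<unsigned char>(digits[i]);
    // Assumption of this sketch: decimal digits and uppercase A-F only.
    if (!(std::isdigit(c) || (c >= 'A' && c <= 'F')))
      return "";
  }
  // Reject padded forms such as u021A6.
  if (n > 4 && digits[0] == '0')
    return "";
  unsigned long v = std::strtoul(digits, nullptr, 16);
  // Reject values beyond U+10FFFF and UTF-16 surrogates.
  if (v > 0x10FFFFUL || (v >= 0xD800UL && v <= 0xDFFFUL))
    return "";
  return digits;
}

// Builds the MathML for one special character name: a numeric character
// reference when the name is a code point, otherwise the <merror> text
// that eqn prints today.
static std::string mathml_operator(const char *s)
{
  std::string cp = single_code_point(s);
  if (!cp.empty())
    return "<mo>&#x" + cp + ";</mo>";
  return std::string("<merror>unknown eqn/troff special char ")
         + s + "</merror>";
}
```

With this sketch, mathml_operator("u21A6") gives <mo>&#x21A6;</mo>,
while a glyph name such as "mapsto" still falls through to the
<merror> branch; the v2 patch closes that gap with an extra lookup
through glyph_name_to_unicode().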

On 2026-03-25 at 20:45:23+0900, Nguyễn Gia Phong wrote:
> I'm using eqn
> to generate my personal website and in Atom web feeds, only four named
> entities (lt gt amp quot) are widely supported, while numerical
> character references (&#...;) always work.

I think this piece of documentation
from the W3C explains it better than I did:
<https://validator.w3.org/feed/docs/error/UndefinedNamedEntity.html>

In other words, the XML spec only predefines entities for < > & ' "
and anything else requires the entity to be declared
in the DOCTYPE declaration.  No conforming implementation
is able to recognize, say, &mapsto; unless there is
a <!ENTITY mapsto "&#x21a6;"> declaration either in the same document
or in a DTD that is linked there.  For XHTML, the following
would work:

<!DOCTYPE html
    PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN"
           "http://www.w3.org/Math/DTD/mathml2/xhtml-math11-f.dtd">

but Atom does not have a DTD so it's impossible
to construct a doctype for atom-math.
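
For output that has already been generated, the same idea can be
applied as a post-processing step.  The following is only a sketch
under my own assumptions: numeric_references() and the three-entry
table are made up for illustration (eqn's real table is much larger),
and names that are not in the table, including the predefined XML
ones, are left untouched.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Illustrative subset only; the names and table are hypothetical.
static const std::map<std::string, std::string> code_points = {
  {"mapsto", "21A6"},
  {"alpha", "03B1"},
  {"infin", "221E"},
};

// Replaces each "&name;" whose name is in the table with the numeric
// character reference "&#xNNNN;".  Anything else is copied verbatim.
static std::string numeric_references(const std::string &in)
{
  std::string out;
  std::size_t i = 0;
  while (i < in.size()) {
    if (in[i] == '&') {
      std::size_t end = in.find(';', i);
      if (end != std::string::npos) {
        std::string name = in.substr(i + 1, end - i - 1);
        auto it = code_points.find(name);
        if (it != code_points.end()) {
          out += "&#x" + it->second + ";";
          i = end + 1;
          continue;
        }
      }
    }
    out += in[i++];
  }
  return out;
}
```

For instance, numeric_references("<mo>&mapsto;</mo>") gives
<mo>&#x21A6;</mo>, which needs no DTD, while "a &lt; b" passes
through unchanged.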

Kind regards,
Phong
