Hi Phong,

At 2026-04-29T11:56:58+0900, Nguyễn Gia Phong wrote:
> On 2026-03-26 at 13:48+09:00, Nguyễn Gia Phong wrote:
> > XML only defines five entities (< > & ' ") out of the box,
> > others need to be declared in the document's DOCTYPE.
> > For web feeds such as RSS and Atom, it is particularly cumbersome
> > to define the math entities as these feeds are supposed
> > to be stand-alone and thus the entity definitions have to be inlined.
> >
> > Therefore, character references are now used
> > instead of entity references, making the MathML output
> > directly embeddable into these feeds.  The entity table
> > is no longer used and thus removed.
> >
> > * src/preproc/eqn/text.cpp: Remove struct map, entity_table,
> >   and special_to_entity.  Include "unicode.h" header file.
> >   (special_char_box::output): Instead of named entity reference,
> >   print XML character reference with Unicode codepoint for MathML.
> >   Add support for Unicode code sequence as an input character.
> >
> > References: https://www.w3.org/TR/REC-xml/#sec-references
> 
> Gentle ping!

Thanks for the reminder.  This message of yours:

https://lists.gnu.org/archive/html/groff/2026-04/msg00011.html

...was sufficient to make me notice a problem.  Thank you for the simple
reproducer!  You've dug more deeply into the nature of the issue than
I have; I've hacked on GNU eqn a bit but less than the formatter or tbl.

But I agree that this:

$ printf '.EQ\napprox\n.EN\n' | ./eqn -TMathML
.do if !dEQ .ds EQ
.do if !dEN .ds EN
.EQ
<math><mtext>\(~=</mtext></math>
.EN

...is a smoking gun of wrongness, and I can reproduce it.  I'm not
surprised to observe it in every groff release from 1.22.3 forward, and
I'd guess it's been a problem with GNU eqn's MathML mode "forever".

I wonder if this issue has (nearly[1]) the same cause as an existing
Savannah ticket.

https://savannah.gnu.org/bugs/?66592

Could you review that and confirm or refute?

If they're the same problem, then if I understand your analysis
correctly, you're probably on the right track: the special character
rewriting table that GNU eqn uses for MathML mode is flat wrong, copying
or reusing the one for troff output.
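
To make sure we mean the same thing by "character reference", here is a
minimal sketch of the idea as I read it.  It is not the code in your
patch or in text.cpp; the names char_ref and special_to_char_ref and the
three-entry table are hypothetical, and the code points are my
assumptions, to be checked against groff_char(7).

```cpp
#include <cstdio>
#include <map>
#include <string>

// Hypothetical helper: format a Unicode code point as an XML numeric
// character reference, e.g. 0x2264 -> "&#x2264;".
std::string char_ref(unsigned long code_point)
{
  char buf[32];
  std::snprintf(buf, sizeof buf, "&#x%lX;", code_point);
  return std::string(buf);
}

// Hypothetical lookup: map a groff special character name to a
// character reference.  The table is illustrative only; the real
// mapping would come from groff's own Unicode tables.
// Returns an empty string for unknown names.
std::string special_to_char_ref(const std::string &name)
{
  static const std::map<std::string, unsigned long> table = {
    { "~=", 0x2248 },  // assumed: ALMOST EQUAL TO
    { "<=", 0x2264 },  // LESS-THAN OR EQUAL TO
    { ">=", 0x2265 },  // GREATER-THAN OR EQUAL TO
  };
  std::map<std::string, unsigned long>::const_iterator it
    = table.find(name);
  if (it == table.end())
    return std::string();
  return char_ref(it->second);
}
```

With something of this shape, the "approx" case above would emit a
numeric reference inside the MathML element rather than the raw \(~=
escape sequence, and no DOCTYPE declaration would be needed.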

Regards,
Branden

[1] The issue in #66592 seems to be more that "left" and "right" get
    discarded and "floor" isn't mapped to a character entity at all,
    maybe because the input parser screws up "left" and "right" handling
    in a MathML-specific way.
