Please have a look at

  https://bugs.ghostscript.com/show_bug.cgi?id=708234

and the thread on the 'luatex' mailing list starting with

  https://tug.org/pipermail/luatex/2025-January/008035.html

which discusses the problem that luatex creates invalid PDF outlines.
While some PDF viewers like `evince` or `okular` are not affected,
ghostscript (as a post-processor) and Acrobat (at least as a viewer)
actually are.

Luigi wrote to me the following in a private e-mail:

> It's a bug in luatex, [...]  Inside [the demo PDF] we have [stuff
> like]
>
> ```
> 44848 0 obj
> << /Names [ (Gregorian\040accidentals\040and\040key\040signatures) 
> [...]
> ```
>
> These strings are correct, but luatex sees '\','0','4','0' instead
> of ' ' because this is the string that texinfo.tex encodes:
>
> ```
> % Escape PDF strings without converting
> \begingroup
>   \directlua{
>     function PDFescstr(str)
>       for c in string.bytes(str) do
>         if c <= 0x20 or c >= 0x80 or c == 0x28 or c == 0x29 or c == 0x5c then
>            tex.sprint(-2,
>              string.format(string.char(0x5c) .. string.char(0x25) .. '03o',
>                            c))
>         else
>           tex.sprint(-2, string.char(c))
>         end
>       end
>     end
>   }
> ```
>
> So 0x20 is translated as '\\040', i.e., 4 bytes instead of ' ',
> 1 byte, and strcmp correctly sorts 4 bytes and this is the bug:
> luatex has to unescape these strings before the sorting.  [...]
>
> Is it necessary to escape the space 0x20?  Perhaps not, so in the
> lua function it could be c<0x20 instead of c<=0x20.  This could be
> an enhancement (and probably it masks the bug [...])

So: can this be fixed on the side of `texinfo.tex` at least for the
space character, circumventing the luatex bug for the most common
case?
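
As a minimal sketch of the effect (plain Python, not luatex's or
texinfo.tex's actual code; the two sample names are made up), the
following compares byte-wise sorting of the raw names with byte-wise
sorting of their `\ddd`-escaped forms, which is roughly what a strcmp
on the escaped strings amounts to.  The `escape_space` switch is a
hypothetical parameter standing in for the `c<0x20` versus `c<=0x20`
choice.

```python
def escape_octal(s: bytes, escape_space: bool = True) -> bytes:
    """Mimic the quoted PDFescstr logic: emit \\ddd for special bytes."""
    out = bytearray()
    for c in s:
        # escape_space=True corresponds to 'c <= 0x20', False to 'c < 0x20'.
        low = c <= 0x20 if escape_space else c < 0x20
        if low or c >= 0x80 or c in (0x28, 0x29, 0x5C):
            out += b"\\%03o" % c
        else:
            out.append(c)
    return bytes(out)


# Two made-up outline names that differ only in ' ' (0x20) vs '!' (0x21).
names = [b"A B", b"A!B"]

# Order of the real (unescaped) strings.
true_order = sorted(names)

# Order obtained when comparing the escaped byte sequences instead.
escaped_order = sorted(names, key=escape_octal)

# Same comparison, but with the space left unescaped.
relaxed_order = sorted(names, key=lambda n: escape_octal(n, escape_space=False))

print(true_order, escaped_order, relaxed_order)
```

With the space escaped, the key for `A B` starts with `A\` (0x5c),
which sorts after `A!` (0x21), so the two entries swap places; with
the space left alone, the escaped order matches the real one for this
pair.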

Luigi also writes:

> Anyway, the pdf spec says that it's possible to use these escape
> sequences:
>
> ```
> \n LINE FEED (0Ah)  (LF)
> \r CARRIAGE RETURN (0Dh)  (CR)
> \t HORIZONTAL TAB (09h) (HT)
> \b BACKSPACE (08h) (BS)
> \f FORM FEED (FF)
> \( LEFT PARENTHESIS (28h)
> \) RIGHT PARENTHESIS (29h)
> \\ REVERSE SOLIDUS (5Ch) (Backslash)
> ```

Using these escape sequences instead of the `\ddd` notation would
avoid the bug for some other characters, too.
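
A rough sketch of such an escaping scheme, written in Python purely
for illustration (the name `escape_short` and the table `SHORT` are
made up here; this is not a patch for `texinfo.tex`): the short
escapes are used where the list above provides one, the space is left
as-is, and the `\ddd` form remains the fallback for other control
bytes and for bytes >= 0x80.

```python
# Short escape sequences from the list above (FORM FEED is byte 0x0C).
SHORT = {
    0x0A: b"\\n",
    0x0D: b"\\r",
    0x09: b"\\t",
    0x08: b"\\b",
    0x0C: b"\\f",
    0x28: b"\\(",
    0x29: b"\\)",
    0x5C: b"\\\\",
}


def escape_short(s: bytes) -> bytes:
    """Escape a PDF literal string, preferring the short sequences."""
    out = bytearray()
    for c in s:
        if c in SHORT:
            out += SHORT[c]
        elif c < 0x20 or c >= 0x80:
            # Fallback: three-digit octal escape; the space is not escaped.
            out += b"\\%03o" % c
        else:
            out.append(c)
    return bytes(out)


print(escape_short(b"a (b)\n"))
```

Whether this sidesteps the sorting problem for these characters
depends on how luatex treats the short escapes when comparing names,
which this sketch does not check.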


    Werner
