Hi Werner,
On Wed, 2025-09-10 at 06:00 +0000, Werner LEMBERG wrote:
> > Can you try again but with "table.serialize" replaced with [...]
>
> Thanks! [It took me a while to find out that I have to use the
> `--luadebug` command-line option so that `debug.getinfo` is defined.]
Whoops, sorry about that.
> The diff is attached (compressed this time, calling `diff` on sorted
> input).
Ok, you should be safe to ignore anything with "rawdata" or "unscaled"
in its name, which just leaves the
"resources.sequences[i].steps[1].coverage" stuff.
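
If it helps, here is a minimal sketch of how you could filter those entries out of the diff before reading it. The function names and the sample lines are made up for illustration; only the two substrings come from what I said above.

```lua
-- Substrings that mark a line as safe to ignore.
local ignorable = { "rawdata", "unscaled" }

-- Hypothetical helper: true if the line mentions any ignorable name.
local function is_ignorable(line)
    for _, needle in ipairs(ignorable) do
        if line:find(needle, 1, true) then  -- plain-text search, no patterns
            return true
        end
    end
    return false
end

-- Hypothetical helper: return only the lines that are worth looking at.
local function filter_lines(lines)
    local kept = {}
    for _, line in ipairs(lines) do
        if not is_ignorable(line) then
            kept[#kept + 1] = line
        end
    end
    return kept
end

-- Made-up sample lines, not taken from the actual diff.
local sample = {
    "< shared.rawdata.foo = 1",
    "> parameters.unscaled_bar = 2",
    "< resources.sequences[3].steps[1].coverage[769] = ...",
}
local kept_lines = filter_lines(sample)
```

With the sample above, only the "coverage" line should survive the filter.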
> I think everything boils down to the question whether LuaTeX by
> default loads a font internally with
>
> ```
> f = fontloader.open("EBGaramond-Regular.otf")
> fonttable = fontloader.to_table(f)
> ```
No, nothing (that I'm aware of) uses the builtin "fontloader" library;
the current parser is 100% Lua.
> then converting `fonttable` to a `tfmdata` structure. I want to
> access `fonttable` before this step happens. How can I do that?
The current Lua font loading code essentially goes directly from the
binary font files to the "tfmdata" table. Specifically, mark-to-base is
handled by lines 1906--1908 of "fontloader-font-dsp.lua".
> > With the caveat that this depends on internal implementation
> > details, and is therefore unsupported and could change at any time,
> > "fonts.handlers.otf.readers.loadfont" is the earliest point that you
> > can modify the font data: [...]
>
> Thanks, but again, this code is manipulating `tfmdata` AFAICS, so
> there is no advantage w.r.t. compactness of the 'mark2base' lookup
> data.
If you look at the function that "fonts.handlers.otf.readers.loadfont"
calls internally ("loadfontdata", "fontloader-font-otr.lua" lines
2240--2306), you can see that the function is parsing the font data
directly ("readulong" and any of the "*cardinal*" functions are for
parsing binary data), so there's really nowhere earlier that you can
hook into.
> I don't think so. In `tfmdata`, the data from the 'mark2base' lookup
> is no longer represented in a compact form. Instead, it is expanded
> so that each accent glyph has all the necessary deltas for all base
> glyphs. While this speeds up the processing of the 'mark2base'
> lookup, it makes manipulation much more complicated.
Internally, the code directly adds the data in the expanded form by
looping over the characters. Since there aren't any helper functions,
you'll have to loop over all the characters yourself. I agree that's
annoying, but even if Hans were to add a helper function for this, the
LaTeX team has stopped importing new font loader code, so you wouldn't
actually be able to use it.
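
For what it's worth, a rough sketch of such a loop might look like the code below. Treat it as unsupported guesswork: "each_coverage_entry" is a hypothetical name, the "gpos_mark2base" type string and the shape of the coverage entries are assumptions on my part, and the mock table only stands in for whatever table holds the real "resources" field. Only the "sequences[i].steps[j].coverage" path comes from your diff.

```lua
-- Hypothetical helper: call visit(sequence, step, char, entry) for every
-- coverage entry of every sequence whose type matches wanted_type.
-- Returns the number of entries visited.
local function each_coverage_entry(resources, wanted_type, visit)
    local count = 0
    for _, sequence in ipairs(resources.sequences or {}) do
        if sequence.type == wanted_type then
            for _, step in ipairs(sequence.steps or {}) do
                for char, entry in pairs(step.coverage or {}) do
                    visit(sequence, step, char, entry)
                    count = count + 1
                end
            end
        end
    end
    return count
end

-- Mock data; the real tables may well be shaped differently (assumption).
local mock_resources = {
    sequences = {
        {
            type  = "gpos_mark2base",
            steps = {
                { coverage = { [0x0300] = { 1, "anchor-a" },
                               [0x0301] = { 1, "anchor-b" } } },
            },
        },
        {
            type  = "gsub_single",
            steps = { { coverage = { [0x0041] = 0x0061 } } },
        },
    },
}

-- Collect the code points that have mark-to-base data.
local seen = {}
local visited = each_coverage_entry(mock_resources, "gpos_mark2base",
    function(_, _, char)
        seen[#seen + 1] = char
    end)
table.sort(seen)  -- pairs() order is unspecified, so sort before use
```

The visitor receives the step table as well, so you can modify the entries in place from inside the callback if you need to.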
Thanks,
-- Max