Hi Branden,

On Fri, Feb 16, 2024 at 10:49:52AM -0600, G. Branden Robinson wrote:
> Hi Alex,

[...]

> Right. I had a similar desire when I first came to groff development.

[...]

> It's not that it's a bad idea, it's that it's hard.
>
> Please excuse an excursion into parser theory, presented by a stumbling
> amateur in the field.
>
> The way a *roff parser works--_any_ *roff parser I know of except
> possibly mandoc(1)'s--makes this nearly intractable.

[...]

> 20/(3 + 1)

[...]

> ref a 3
> ref b 1
> 20 (reduce -> PUSH)
> deref a
> 3 (reduce -> PUSH)
> deref b
> 1 (reduce -> PUSH)
> + (reduce -> PUSH)
> / (reduce -> PUSH)
> pop
> pop
> ref c %accum (notation for a memory location like 0x8000 where our
>               language runtime keeps a copy of the top of the stack for
>               values of interest)
>
> This will produce much the same machine language.
>
> PUSH 20
> PUSH 3
> PUSH 1
> ADD (stack now looks like 20, 4)
> DIV (stack now looks like 5)
> POP
> STORE $8000 (i.e., write 5 to memory at 0x8000)
>
> Okay. _Now_ can you reconstruct the original source from these machine
> instructions? No. The evidence of the use of variables has been
> effaced; we don't even know what their names are.
>
> And so it is with the roff languages. An arbitrary number of macro
> expansions may have taken place.

Hmmm. But maybe approximating by saying "the original equation had
something like '5'" could be decent enough? It's not really what it
was, but at least it means the same thing (hopefully). Or by the time
you're about to produce the trout, do you no longer have anything that
resembles valid roff(7)?

> I'm not saying it is impossible to do what you want (or what I wanted,
> years ago). By going carefully through the parser and adding "hooks" to
> preserve input tokens at key locations, and/or building a list of them
> as they are encountered, and disposing of them only once output is
> flushed, I imagine it could be achieved. But it would be a significant
> effort and demand deep familiarity with the entire GNU troff parser, not
> just the bulk of it in input.cpp.
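
The "hooks" idea quoted above can be illustrated with a toy postfix
compiler. This is only a sketch in Python under stated assumptions: it
has nothing to do with the real GNU troff parser or input.cpp, the
names Instr, compile_rpn, run, recover, and keep_origin are all
hypothetical, and the source expression 20/(a + b) is assumed from the
"ref a 3" / "ref b 1" lines. It shows that the plain instruction stream
loses the variable names, while a stream that carries the originating
token with each instruction can still be mapped back to its source.

```python
# Toy illustration only; not GNU troff's parser or any real groff API.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instr:
    op: str                       # "PUSH", "ADD", or "DIV"
    arg: Optional[int] = None     # operand for PUSH, otherwise None
    origin: Optional[str] = None  # hypothetical hook: the source token


def compile_rpn(tokens, env, keep_origin=False):
    """Translate postfix tokens to stack instructions, expanding variables."""
    ops = {"+": "ADD", "/": "DIV"}
    code = []
    for tok in tokens:
        origin = tok if keep_origin else None
        if tok in ops:
            code.append(Instr(ops[tok], None, origin))
        elif tok in env:
            # Variable expansion: only the value survives in the plain stream.
            code.append(Instr("PUSH", env[tok], origin))
        else:
            code.append(Instr("PUSH", int(tok), origin))
    return code


def run(code):
    """Execute the instructions on a stack and return the final value."""
    stack = []
    for ins in code:
        if ins.op == "PUSH":
            stack.append(ins.arg)
        else:
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(lhs + rhs if ins.op == "ADD" else lhs // rhs)
    return stack.pop()


def recover(code):
    """Rebuild the postfix source; possible only if origins were kept."""
    if any(ins.origin is None for ins in code):
        return None
    return " ".join(ins.origin for ins in code)


env = {"a": 3, "b": 1}
source = ["20", "a", "b", "+", "/"]  # 20/(a + b) in postfix

plain = compile_rpn(source, env)
hooked = compile_rpn(source, env, keep_origin=True)
```

Both streams evaluate to the same value, but only the one compiled with
keep_origin=True lets recover() return the original tokens; for the
plain stream it returns None, which is the "effaced evidence" situation
described above. The cost in a real parser would be carrying that extra
token list until output is flushed.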
>
> My own knowledge of it is far advanced over what it was 5 years ago, but
> I do not feel equal to the task of scaling that mountain yet.
>
> So I think this is unlikely to happen soon.

I have patience. :)
I'll keep pushing you from time to time with NP-Hard and NP-Complete
problems.

> [2] ...the presentation of which in _The Unix Programming Environment_,
>     my labored explanation is but a pale shadow. I wish someone would
>     update that book for modern times; the currents of history have been
>     particularly cruel to its old-school lex and yacc usage, which is
>     some of the most valuable material in it.

Talking of yacc, maybe you could have a look at a bugfix I wrote
recently for some yacc code in shadow-utils. Nobody involved in the
project seems to understand it anymore. :)

<https://github.com/shadow-maint/shadow/pull/952>

Have a lovely night!
Alex

-- 
<https://www.alejandro-colomar.es/>
Looking for a remote C programming job at the moment.