Hi Deri,
At 2024-01-20T21:03:15+0000, Deri wrote:
> On Saturday, 20 January 2024 01:39:21 GMT G. Branden Robinson wrote:
[snip]
> > x X ps: exec [4849204445524920F09F9888 pdfmark2
> >
> > Something pretty close to that works on the deri-gropdf-ng branch
> > today, as I understand it.
>
> I'm afraid this is all wrong (or at least out of date, my private
> branch, which is rebased against a very recent HEAD, does not use
> stringhex as part of the interface with gropdf,
Ahh. A day without wrongness is like Mordor without orcs. 😅
> it only uses it to build register names which need to include unicode
> characters within the name).
Yes. I may have a minor issue with that from a robustness perspective
but it doesn't have anything to do with \X or device control commands;
it's purely a macro programming level matter. When I get some round
tuits I'll raise it in a new thread or a Savannah ticket. And I'll
try to check my facts first. ;-)
> In fact you know all this since you recently wrote:-
Plenty of people know what they don't know, and plenty more don't know
what they don't know, but I would claim that it takes real talent to not
know what you DO know.
> As an example, if this was in a file.mom:-
>
> .HEADING 1 "Гуляйпольщина или Махновщина"
>
> After running through preconv the resultant grout is:-
>
> x X ps:exec [/Dest /pdf:bm24 /Title (8. \[u0413]\[u0443]\[u043B]\[u044F]\
> [u0439]\[u043F]\[u043E]\[u043B]\[u044C]\[u0449]\[u0438]\[u043D]\[u0430] \
> [u0438]\[u043B]\[u0438] \[u041C]\[u0430]\[u0445]\[u043D]\[u043E]\[u0432]\
> [u0449]\[u0438]\[u043D]\[u0430]) /Level 2 /OUT pdfmark
>
> And the entry in the pdf looks like this:-
>
> 99 0 obj << /Dest /pdf:bm24
> /Next 100 0 R
> /Parent 77 0 R
> /Prev 98 0 R
> /Title
> (\376\377\0\70\0\56\0\40\4\23\4\103\4\73\4\117\4\71\4\77\4\76\4\73\4\114\4\111\4\70\4\75\4\60\0\40\4\70\4\73\4\70\0\40\4\34\4\60\4\105\4\75\4\76\4\62\4\111\4\70\4\75\4\60)
> >>
> endobj
>
> The preconv unicodes have been converted to octal bytes with a UTF-16
> BOM on the front,
As a terminology stickler, I would not call these "preconv unicodes",
and IMO UTF-16 should usually be spelled with the endianness included...
But, yes, I take your point.
> and a pdf viewer will show the string with unicode characters in its
> bookmark panel. No stringhex involved, just passing preconv output
> straight to gropdf.
Cool. I perceive that something I want is a unit test for this,
possibly a minimal mom(7) document containing the foregoing heading and
as little else as possible. So I'll work on that while the
\X-copy-mode item percolates on the discussion table a while longer.
(Who, me, mix metaphors?)
> This is exactly the technique I am now using. Whatever preconv
> produces, ends up as a UTF-16 string. You can mix normal text with the
> preconv output, (and groff characters like \[em]), but as soon as any
> character in the string requires unicode the whole string is
> converted.
This seems like a reasonable approach, to keep from having to manage
state. ("Are we in ASCII mode or octal mode?")
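For anyone following along, the conversion described above can be sketched in a few lines of Python. This is only an illustration of the technique, not gropdf's actual code (gropdf is written in Perl); the function names are made up for the example, and the handling of the all-ASCII case (escaping backslashes and parentheses) is my assumption rather than something stated in this thread. Other groff escapes such as \[em] are not handled here.

```python
import re

# Matches groff Unicode special character escapes such as \[u0413].
# (Pattern and helper names are hypothetical, for illustration only.)
_UNICODE_ESCAPE = re.compile(r"\\\[u([0-9A-Fa-f]{4,6})\]")


def decode_groff_escapes(text: str) -> str:
    """Replace each \\[uXXXX] escape with the character it names."""
    return _UNICODE_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


def pdf_title_string(text: str) -> str:
    """Return the body of a PDF literal string for a bookmark title.

    If every character is ASCII, the text is passed through (with PDF
    string delimiters escaped; that part is an assumption).  As soon as
    one character needs Unicode, the WHOLE string is emitted as
    UTF-16BE with a leading byte order mark, one octal escape per byte.
    """
    decoded = decode_groff_escapes(text)
    if decoded.isascii():
        return (decoded.replace("\\", "\\\\")
                       .replace("(", "\\(")
                       .replace(")", "\\)"))
    data = b"\xfe\xff" + decoded.encode("utf-16-be")
    # Unpadded octal is safe here because every byte is escaped, so an
    # escape is never followed directly by a literal digit.
    return "".join(f"\\{byte:o}" for byte in data)


if __name__ == "__main__":
    print(pdf_title_string(r"8. \[u0413]\[u0443]"))
```

Because the decision is made once per string, there is no mode switching to track inside it, which is the point about not having to manage state.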
Regards,
Branden
