Re: Proposed: make \X read its argument in copy mode

G. Branden Robinson Sat, 20 Jan 2024 14:17:39 -0800

Hi Deri,

At 2024-01-20T21:03:15+0000, Deri wrote:
> On Saturday, 20 January 2024 01:39:21 GMT G. Branden Robinson wrote:
[snip]
> > x X ps: exec [4849204445524920F09F9888 pdfmark2
> > 
> > Something pretty close to that works on the deri-gropdf-ng branch
> > today, as I understand it.
> 
> I'm afraid this is all wrong (or at least out of date, my private
> branch, which is rebased against a very recent HEAD, does not use
> stringhex as part of the interface with gropdf,

Ahh.  A day without wrongness is like Mordor without orcs.  😅

> it only uses it to build register names which need to include unicode
> characters within the name).

Yes.  I may have a minor issue with that from a robustness perspective
but it doesn't have anything to do with \X or device control commands;
it's purely a macro programming level matter.  When I get some round
tuits I'll raise it in a new thread or a Savannah ticket.  And I'll
try to check my facts first.  ;-)

> In fact you know all this since you recently wrote:-

Plenty of people know what they don't know, and plenty more don't know
what they don't know, but I would claim that it takes real talent to not
know what you DO know.

> As an example, if this was in a file.mom:-
> 
> .HEADING 1 "Гуляйпольщина или Махновщина"
> 
> After running through preconv the resultant grout is:-
> 
> x X ps:exec [/Dest /pdf:bm24 /Title (8. \[u0413]\[u0443]\[u043B]\[u044F]\
> [u0439]\[u043F]\[u043E]\[u043B]\[u044C]\[u0449]\[u0438]\[u043D]\[u0430] \
> [u0438]\[u043B]\[u0438] \[u041C]\[u0430]\[u0445]\[u043D]\[u043E]\[u0432]\
> [u0449]\[u0438]\[u043D]\[u0430]) /Level 2 /OUT pdfmark
> 
> And the entry in the pdf looks like this:-
> 
> 99 0 obj << /Dest /pdf:bm24
> /Next 100 0 R
> /Parent 77 0 R
> /Prev 98 0 R
> /Title 
> (\376\377\0\70\0\56\0\40\4\23\4\103\4\73\4\117\4\71\4\77\4\76\4\73\4\114\4\111\4\70\4\75\4\60\0\40\4\70\4\73\4\70\0\40\4\34\4\60\4\105\4\75\4\76\4\62\4\111\4\70\4\75\4\60)
> >>
> endobj
> 
> The preconv unicodes have been converted to octal bytes with a UTF-16
> BOM on the front,

As a terminology stickler, I would not call these "preconv unicodes",
and IMO UTF-16 should usually be spelled with the endianness included...
But, yes, I take your point.

> and a pdf viewer will show the string with unicode characters in its
> bookmark panel. No stringhex involved, just passing preconv output
> straight to gropdf.

Cool.  I perceive that something I want is a unit test for this,
possibly a minimal mom(7) document containing the foregoing heading and
as little else as possible.  So I'll work on that while the
\X-copy-mode item percolates on the discussion table a while longer.

(Who, me, mix metaphors?)

> This is exactly the technique I am now using. Whatever preconv
> produces, ends up as a UTF-16 string. You can mix normal text with the
> preconv output, (and groff characters like \[em]), but as soon as any
> character in the string requires unicode the whole string is
> converted.

This seems like a reasonable approach, to keep from having to manage
state.  ("Are we in ASCII mode or octal mode?")
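For anyone following along, here is a minimal Python sketch of that
all-or-nothing conversion as I understand it from the example above.  It
is not gropdf's code; the name to_pdf_string is hypothetical, and I am
assuming the input is ASCII plus single-code-point \[uXXXX] escapes only
(no \[em], no composite \[uXXXX_YYYY] forms).  The unpadded octal output
simply mirrors the /Title string quoted earlier.

```python
import re

# Matches preconv-style escapes for a single code point, e.g. \[u0413].
# Assumption: composite forms such as \[u0041_0300] are out of scope here.
_UNICODE_ESCAPE = re.compile(r"\\\[u([0-9A-Fa-f]{4,6})\]")


def to_pdf_string(text: str) -> str:
    """Return text as a PDF literal string (hypothetical helper, not gropdf code)."""
    if not _UNICODE_ESCAPE.search(text):
        # No \[uXXXX] escape present: keep the string readable and escape
        # only the characters that are special inside a PDF literal string.
        escaped = re.sub(r"([\\()])", r"\\\1", text)
        return "(" + escaped + ")"
    # At least one escape: convert the WHOLE string, so there is never a
    # need to track whether we are in "ASCII mode" or "octal mode".
    decoded = _UNICODE_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)
    data = b"\xfe\xff" + decoded.encode("utf-16-be")  # BOM, then UTF-16BE
    # Every byte becomes a backslash plus unpadded octal digits.
    return "(" + "".join("\\%o" % byte for byte in data) + ")"


if __name__ == "__main__":
    print(to_pdf_string(r"8. \[u0413]"))
    print(to_pdf_string("Intro"))
```

Because every byte is escaped in the converted case, each octal escape is
followed by either another backslash or the closing parenthesis, so the
unpadded digits cannot run into a neighboring character.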

Regards,
Branden
