Hello Werner,
> it's probably something which should be done later.
Well, I'm on it now because I feel more comfortable dealing with the CJK
width and other smaller problems once the logic of decomposition and
combined characters is done right.
> Currently, groff only recognizes a very limited set of
> ligatures (fi, ff, etc.)
Ah, good example! This is already a precedent where groff combines adjacent
input nodes. When the user doesn't want the ligature, he can use \& as
a separator between the two, right? So I imagine that no one will
object if troff combines
x\[u0302]\[u0301]
into
\[u0078_0302_0301]
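The merging rule can be sketched roughly like this (a hypothetical Python
illustration of the idea, not groff's actual C++ code; the function name
combine_glyph_names is made up):

```python
import unicodedata

def combine_glyph_names(names):
    """Merge adjacent \\[uXXXX] glyph names into groff's composite form.

    A name whose code point has a nonzero canonical combining class
    (per UnicodeData) is appended, underscore-separated, to the
    preceding base glyph's name.
    """
    out = []
    for name in names:
        is_simple = name.startswith("u") and "_" not in name
        cp = int(name[1:], 16) if is_simple else None
        if cp is not None and unicodedata.combining(chr(cp)) and out:
            # Combining mark: fold it into the previous glyph name.
            out[-1] += "_" + name[1:]
        else:
            out.append(name)
    return out

# ["u0078", "u0302", "u0301"] -> ["u0078_0302_0301"]
```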
> > In the first case I would put the composition into troff.
>
> OK. In other words, it won't be handled yet.
>
> > In the second case into preconv (i.e. preconv would translate
> > <U+0078><U+0302><U+0301> to \[x u0302 u0301] but would leave alone
> > x\[u0302]\[u0301]).
>
> This would be perfect.
Hmm? Why do you qualify the second approach as "perfect", when the
other one is more in line with the mechanics of how ligatures work?
I just wish to know which of the two is preferable.
> If I understand you correctly, your approach
> will be table-driven, that is, a combining character following a base
> character will automatically be converted to the \[xxx yyy ...] form,
> right?
Yes, sure. The input stream of Unicode characters already tells us, through
the UnicodeData table, which characters are combining and decorate the
preceding base character.
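For the preconv-side variant, the table-driven scan over the raw Unicode
stream could look like this (an illustrative Python sketch only; the real
preconv is C++, and the function name to_groff_escapes is invented):

```python
import unicodedata

def to_groff_escapes(text):
    """Turn a Unicode string into groff \\[uXXXX...] escapes.

    unicodedata.combining() reports the canonical combining class
    from the UnicodeData table; any character with a nonzero class
    decorates the preceding base character, so its code point is
    appended to that base's glyph name.
    """
    parts = []
    for ch in text:
        cp = "%04X" % ord(ch)
        if unicodedata.combining(ch) and parts:
            parts[-1] += "_" + cp   # attach mark to the base glyph
        else:
            parts.append(cp)        # new base character
    return "".join("\\[u%s]" % p for p in parts)

# "x\u0302\u0301" -> "\[u0078_0302_0301]"
```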
Bruno
_______________________________________________
Groff mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/groff