Hello Werner,

> it's probably something which should be done later.

Well, I'm on it now because I feel more comfortable dealing with the CJK
width and other smaller problems once the logic of decomposition and
combined characters is done right.

> Currently, groff only recognizes a very limited set of
> ligatures (fi, ff, etc.)

Ah, good example! This is already a precedent where groff combines
adjacent input nodes. When the user doesn't want the ligature, he can use
\& as a separator between the two, right?

So I imagine that no one will object if troff combines
x\[u0302]\[u0301] into \[u0078_0302_0301].

> > In the first case I would put the composition into troff.
>
> OK. In other words, it won't be handled yet.
>
> > In the second case into preconv (i.e. preconv would translate
> > <U+0078><U+0302><U+0301> to \[x u0302 u0301] but would leave alone
> > x\[u0302]\[u0301]).
>
> This would be perfect.

Hmm? Why do you qualify the second approach as "perfect", when the other
one is more in line with the mechanics of how ligatures work? I just wish
to know which of the two is preferable.

> If I understand you correctly, your approach
> will be table-driven, that is, a combining character following a base
> character will automatically be converted to the \[xxx yyy ...] form,
> right?

Yes, sure. The input stream of Unicode characters already tells us,
through the UnicodeData table, which characters are combining and
decorate the preceding base character.

Bruno
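
A minimal sketch of the table-driven grouping described above, written in Python rather than in groff's own sources. It is only an illustration under stated assumptions, not the code of troff or preconv: the function names `group_combining` and `to_troff` are hypothetical, the output uses the \[u0078_0302_0301] notation from this message, and "combining" is taken to mean a nonzero canonical combining class in UnicodeData (combining marks with class 0 would need an extra check on the general category).

```python
import unicodedata


def group_combining(text):
    """Split text into clusters: a base character plus the combining
    characters that immediately follow it (hypothetical helper)."""
    clusters = []
    for ch in text:
        # unicodedata.combining() returns the canonical combining class
        # from UnicodeData; it is nonzero for marks such as U+0301, U+0302.
        if unicodedata.combining(ch) and clusters:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def to_troff(text):
    """Rewrite each base+combining cluster as one \\[uXXXX_YYYY...] name.
    Characters without following combining marks pass through unchanged."""
    out = []
    for cluster in group_combining(text):
        if len(cluster) == 1:
            out.append(cluster)
        else:
            codes = "_".join(f"{ord(c):04X}" for c in cluster)
            out.append("\\[u" + codes + "]")
    return "".join(out)


if __name__ == "__main__":
    print(to_troff("x\u0302\u0301"))
```

This sketch corresponds to the preconv-style variant, since it works on raw Unicode input. The troff-side variant would apply the same grouping to adjacent input nodes, with \& available as a separator in the same way it is for ligatures.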