On Thu, Jul 07, 2005 at 09:40:34PM +0200, Bruno Haible wrote:
> Andries Brouwer wrote:
> > If there is a pipeline, then earlier
> > stages in the pipeline already need the character set.
> > So, conversion may have to be done before the input reaches groff.
>
> Btw, if a program in the pipeline, before groff, actually needs the
> character set, it will be able to infer it from the "coding:" marker.
> Whereas in the past, without a marker, it cannot know whether it's processing
> something in KOI8-R or ISO-8859-5.
>
> > And that also brings up a different point. If I have a file
> > that has topline -*- coding: EUC-JP -*- and I feed it to
> > a program like iconv, must that program change the topline?
>
> The "gpreconv" filter must be idempotent:
>   gpreconv | gpreconv == gpreconv.
> Whether it achieves this by converting the input to UTF-8 and changing the
> marker to "coding: utf-8", or whether it converts the input to ASCII with
> lots of \[...] or \N[...] escape sequences and leaves the marker in place,
> is an unimportant detail.

So - we now get a new converter, not iconv, but a special-purpose
gpreconv filter. It knows that it is converting things that will
later be fed to groff. Pity. Where is my beautiful Unix?

This converter may change the sequence of symbols in the file,
not only the representation of these symbols. Ach.

It is not at all an unimportant detail whether it changes to utf-8
or ascii with escape sequences. My own preprocessors halfway that
pipeline do not know about utf-8, and do not know about these escape
sequences either. Still I am told that compatibility mode should work.
Does gpreconv also know whether groff will be called with the -C option?

I foresee a complicated mess. Is it not far simpler to document that
groff must be called with a file coded in ASCII or Latin-1 or UTF-8?

Andries
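
To make the idempotence requirement quoted above concrete, here is a minimal Python sketch of one of the two strategies Bruno mentions: convert to UTF-8 and rewrite the marker. It is not the actual gpreconv; the function name `preconv_sketch`, the marker regex, and the choice to pass marker-less input through unchanged are all assumptions made for illustration.

```python
import re

# Assumed marker syntax: "-*- coding: NAME -*-" somewhere on the first line.
MARKER = re.compile(r"-\*-\s*coding:\s*([\w.-]+)\s*-\*-")


def preconv_sketch(data: bytes) -> bytes:
    """Hypothetical filter: recode to UTF-8 and update the first-line marker.

    Input without a marker is returned untouched, because the sketch has
    no reliable way to guess its encoding (e.g. KOI8-R vs ISO-8859-5).
    """
    # The marker itself is ASCII, so a lossy ASCII decode is enough to find it.
    first_line = data.split(b"\n", 1)[0].decode("ascii", errors="replace")
    match = MARKER.search(first_line)
    if match is None:
        return data

    # Raises LookupError if Python has no codec under the declared name.
    text = data.decode(match.group(1))

    # Rewrite only the marker on the first line, then emit UTF-8.
    head, sep, tail = text.partition("\n")
    head = MARKER.sub("-*- coding: utf-8 -*-", head, count=1)
    return (head + sep + tail).encode("utf-8")
```

Running the filter twice gives the same bytes as running it once, because the second pass sees "coding: utf-8", decodes as UTF-8, and writes the same marker back. The sketch also shows the objection raised in the reply: the output is UTF-8, so any preprocessor later in the pipeline that does not understand UTF-8 is still stuck.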