Tim Landscheidt wrote:
> I don't know what the result
> of `quotearg ("äöü")' should look like and what it should
> depend on.

It depends on the output destination of the quoted string.

If it is for output on stderr, as in bison/src/parse-gram.y:193

  %printer { fputs (quotearg_style (c_quoting_style, $$), stderr); }

then you can most likely emit "äöü" with the same multibyte characters.

If it is for inclusion in a Java program, in comments, you also don't
need to do any particular processing of multibyte characters.

If it is for use as a literal string in a Java program, then the
interpretation of the source code depends on the -encoding option
passed to the Java compiler (see [1]). If you emit "äöü" directly into
the source code, the developer needs to add an -encoding option; this is
normally not welcome. To avoid this, the notation \unnnn can be used in
strings for UTF-16 code units, excluding LF and CR (\u000A and \u000D
are invalid inside strings, because the compiler translates Unicode
escapes before it parses the literal, so they act as line terminators).

So, the algorithm is:

  - Determine the encoding of the string's origin (if it's from a file
    name or a tty, you can assume locale_charset() is the right guess;
    if it's from a file, use a command-line argument to specify its
    encoding).

  - Convert the multibyte string to UTF-16 (either through module
    'striconv' or through hand-written code in the same style as
    lib/unicodeio.c [just in the reverse direction]).

  - Replace LF with \n, CR with \r, and all other UTF-16 code units
    outside the range U+0020..U+007E with \unnnn.

Bruno

[1] http://download.oracle.com/javase/1,5.0/docs/tooldocs/solaris/javac.html
-- 
In memoriam Louis Philippe d'Orléans <http://en.wikipedia.org/wiki/Louis_Philippe_II,_Duke_of_Orléans>
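
For illustration, the third step of the algorithm above could look
roughly like the following. This is only a sketch: it assumes the string
has already been converted to UTF-16 code units (steps one and two are
not shown), and java_escape_utf16 is a hypothetical name, not an
existing gnulib function. It also escapes the double quote and the
backslash, which the list above does not mention but which a Java
string literal requires.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of gnulib): write the Java string-literal
   form of the N UTF-16 code units at SRC into DST, which has room for
   DSTSIZE bytes.  Returns the number of bytes written, not counting the
   terminating NUL, or (size_t) -1 if DST is too small.  */
size_t
java_escape_utf16 (const uint16_t *src, size_t n, char *dst, size_t dstsize)
{
  size_t len = 0;

  if (dstsize == 0)
    return (size_t) -1;

  for (size_t i = 0; i < n; i++)
    {
      uint16_t c = src[i];
      char buf[7];              /* "\uXXXX" plus NUL */
      int k;

      if (c == 0x000A)
        k = snprintf (buf, sizeof buf, "\\n");
      else if (c == 0x000D)
        k = snprintf (buf, sizeof buf, "\\r");
      else if (c == '"' || c == '\\')
        /* Addition to the algorithm in the text: these two must be
           escaped inside a Java string literal.  */
        k = snprintf (buf, sizeof buf, "\\%c", (char) c);
      else if (c >= 0x0020 && c <= 0x007E)
        k = snprintf (buf, sizeof buf, "%c", (char) c);
      else
        /* Everything else, including each half of a surrogate pair,
           becomes one \unnnn escape per code unit.  */
        k = snprintf (buf, sizeof buf, "\\u%04X", (unsigned int) c);

      if (len + (size_t) k + 1 > dstsize)
        return (size_t) -1;
      memcpy (dst + len, buf, (size_t) k);
      len += (size_t) k;
    }

  dst[len] = '\0';
  return len;
}
```

With "äöü" converted to the code units U+00E4 U+00F6 U+00FC, this
produces \u00E4\u00F6\u00FC, which javac reads the same way regardless
of its -encoding setting.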