On Mon 30 Mar 2020 at 07:26:06 (-0400), Greg Wooledge wrote:
> On Fri, Mar 27, 2020 at 11:15:12PM -0500, David Wright wrote:
> > (BTW I'm not sure about Reco's use of \uc899. Does \u mean that
> > c899 is in utf-8, or should it be followed by a Unicode codepoint,
> > as in U+c899? If the latter, then \uc899 is way off my charts.)
>
> It's a notation used in some programming tools/environments to denote
> a Unicode code point.
>
> E.g. bash's printf and $'...' accept \unnnn or \Unnnnnnnn to denote
> Unicode code points using either 4 or 8 hex digits.
>
> $ printf '\u00f1\n'
> ñ

Of course! With well over a decade of using Unicode natively, I had
forgotten that people would want to do that. And refreshing bash's
prompt escape syntax last December (\D{}, \u etc) completely wiped
it from my mind.

So one has to be careful not to translate

Content-Disposition: attachment; filename*=utf-8''%C8%99urubelni%C8%9B%C4%83_empty%2Etxt

into Unicode codepoints like \uc899 because it's utf-8, not utf-32.

Cheers,
David.
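
The utf-8 versus code point distinction above can be checked with a
minimal sketch. Python and all the variable names here are assumptions
for illustration; nothing in the thread uses them. It percent-decodes
the filename value to bytes, decodes those bytes as UTF-8, and lists the
code points of the non-ASCII characters in \u notation.

```python
from urllib.parse import unquote_to_bytes
import unicodedata

# Value of filename* from the header above, after the utf-8'' prefix.
encoded = "%C8%99urubelni%C8%9B%C4%83_empty%2Etxt"

raw = unquote_to_bytes(encoded)   # percent-decoding yields UTF-8 *bytes*
name = raw.decode("utf-8")        # decoding the bytes yields the characters

# Code points of the non-ASCII characters, written in \u notation.
escapes = [f"\\u{ord(ch):04x}" for ch in name if ord(ch) > 0x7F]

print(ascii(name))    # filename with non-ASCII characters shown as \u escapes
print(escapes)        # ['\\u0219', '\\u021b', '\\u0103']

# The mistaken reading: treating the byte pair C8 99 as code point U+C899.
wrong = chr(0xC899)
print(unicodedata.name(wrong))    # a Hangul syllable, not the intended letter
```

So %C8%99 is the two-byte UTF-8 encoding of U+0219, and the matching
escape would be \u0219 rather than \uc899.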