Paul Eggert wrote:
> my earlier patch
> neglected the possibility that mbrtowc can return 0

I wouldn't see this as a bug: You can assume that mbrtowc returns 0 if and
only if the multibyte sequence is a NUL byte - but you had chosen srcend in
such a way that this would not happen in the loop.

> and it incorrectly assumed
> wide control characters always have a single-byte representation.

Oops, you're right. My mistake as well.

The new patch looks good. This will catch (and replace with '?') U+2028 and
U+2029 on glibc systems. On macOS, it will not do this, because
iswcntrl(0x2028) and iswcntrl(0x2029) are both 0 on this system; this is
consistent with the fact that the 'Terminal' program displays these characters
as simple spaces. So, no need to override iswcntrl on macOS.

Bruno


2018-07-27  Bruno Haible  <br...@clisp.org>

	iswcntrl: Mention minor problem on macOS.
	* doc/posix-functions/iswcntrl.texi: Mention oddity on macOS.

diff --git a/doc/posix-functions/iswcntrl.texi b/doc/posix-functions/iswcntrl.texi
index 99eaa0e..44dd034 100644
--- a/doc/posix-functions/iswcntrl.texi
+++ b/doc/posix-functions/iswcntrl.texi
@@ -25,4 +25,8 @@ Portability problems not fixed by Gnulib:
 @item
 On AIX and Windows platforms, @code{wchar_t} is a 16-bit type and therefore cannot
 accommodate all Unicode characters.
+@item
+This function returns 0 for U+2028 (LINE SEPARATOR) and
+U+2029 (PARAGRAPH SEPARATOR) on some platforms:
+Mac OS X 10.13.
 @end itemize