I'm having a problem changing the character set on the fly in Cygwin 3.1.4. It seems to work with Cygwin applications, but not with Win32 applications.
I have a Korn shell script:

    #!/bin/ksh
    OLD_LANG="$LANG"
    OLD_LC_ALL="$LC_ALL"
    echo "locale on entry"
    locale
    echo ""
    export LANG="en_US.CP1252"
    export LC_ALL=en_US.CP1252
    echo "locale changed to"
    locale
    echo ""
    # Default is to run the Win32 program. Input any argument other than 'WIN32'
    # to run '/bin/echo'.
    case $# in
    0 )
        echo "Running WIN32 pgm"
        ksh -c 'cygtest.exe ZÇ'
        ;;
    1 )
        echo "Running Cygwin 'echo'"
        ksh -c '/bin/echo ZÇ'
        ;;
    2 )
        echo "Running WIN32 pgm"
        ksh -c 'cygtest.exe ZÇ'
        echo ""
        echo "Running Cygwin 'echo'"
        ksh -c '/bin/echo ZÇ'
        ;;
    * )
        ;;
    esac
    LC_ALL="$OLD_LC_ALL"
    LANG="$OLD_LANG"

and a Win32 application (attached file cygtest.cpp).

I used gdb to see what was happening in child_info_spawn::worker(), when a Win32 program is started using:

    rc = CreateProcessW (runpath,         /* image name w/ full path */
                         cmd.wcs (wcmd),  /* what was passed to exec */
                         sa,              /* process security attrs */
                         sa,              /* thread security attrs */
                         TRUE,            /* inherit handles */
                         c_flags,
                         envblock,        /* environment */
                         NULL,
                         &si,
                         &pi);

Specifically, 'cmd.wcs(wcmd)' invokes:

    wchar_t *wcs (wchar_t *wbuf, size_t n)
    {
      if (n == 1)
        wbuf[0] = L'\0';
      else
        sys_mbstowcs (wbuf, n, buf);
      return wbuf;
    }

and sys_mbstowcs():

    size_t __reg3
    sys_mbstowcs (wchar_t * dst, size_t dlen, const char *src, size_t nms)
    {
      mbtowc_p f_mbtowc = __MBTOWC;
      if (f_mbtowc == __ascii_mbtowc)
        {
          f_mbtowc = __utf8_mbtowc;   <<<<< this is ALWAYS done, no matter what charset is in use.
        }
      return sys_cp_mbstowcs (f_mbtowc, dst, dlen, src, nms);
    }

Since the CP1252 is an 8-bit single-byte character set with characters >= 0x80, the '0xc7' character is always translated as '0xc7 0xf0', with the '0xf0' byte indicating an invalid character in the string.

This doesn't seem to happen when e.g. '/bin/echo' is run, although I haven't stepped into the code to see what's happening.

I do not think this is a Cygwin bug, but since the User's Guide says the locale and charset can be changed on the fly, I don't know what's going awry. Any suggestions?
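To make the mismatch concrete, here is a minimal standalone sketch of why the byte 0xc7 trips a UTF-8 decoder even though it is a perfectly good CP1252 character. This is not Cygwin code: the function names is_valid_utf8 and cp1252_to_codepoint are hypothetical, the UTF-8 check is deliberately simplified (it looks only at lead-byte ranges and continuation bytes), and the CP1252 mapping covers only ASCII and the 0xA0..0xFF range, which coincides with Latin-1.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of Cygwin): simplified structural UTF-8 check.
// It validates lead-byte ranges and continuation bytes only; it does not
// reject every overlong or surrogate encoding a strict decoder would.
bool is_valid_utf8(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;                 // continuation bytes expected
        if (c < 0x80) {
            extra = 0;                         // plain ASCII
        } else if (c >= 0xC2 && c <= 0xDF) {
            extra = 1;                         // 2-byte sequence lead
        } else if (c >= 0xE0 && c <= 0xEF) {
            extra = 2;                         // 3-byte sequence lead
        } else if (c >= 0xF0 && c <= 0xF4) {
            extra = 3;                         // 4-byte sequence lead
        } else {
            return false;                      // stray continuation or bad lead
        }
        if (extra > s.size() - i - 1) {
            return false;                      // sequence truncated
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char t = static_cast<unsigned char>(s[i + k]);
            if (t < 0x80 || t > 0xBF) {
                return false;                  // not a continuation byte
            }
        }
        i += extra + 1;
    }
    return true;
}

// Hypothetical helper: map one CP1252 byte to a Unicode code point.
// ASCII and 0xA0..0xFF map to the same value as in Latin-1. The 0x80..0x9F
// block needs a lookup table that this sketch omits, so U+FFFD is returned.
std::uint32_t cp1252_to_codepoint(unsigned char b) {
    if (b < 0x80 || b >= 0xA0) {
        return b;
    }
    return 0xFFFD;
}
```

With these definitions, is_valid_utf8("Z\xC7") is false, because 0xc7 is a two-byte lead with no continuation byte after it, while is_valid_utf8("Z\xC3\x87") (the UTF-8 encoding of the same text) is true. cp1252_to_codepoint(0xC7) gives 0xC7, i.e. U+00C7, which is the wide character I would expect the Win32 program to receive.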
If you need more information, I'm happy to provide it.

Mike Shay

Here's the source for the Win32 program. I built it with Visual Studio 2015, to get something running quickly.
[Attachment: cygtest.cpp]