The stdio output functions in msvcrt have two bugs when the output goes
to a Windows console.
Windows consoles come with two encodings: GetACP() and GetOEMCP(). For
Japanese, both have the same value (932). However, for German and French
Windows installations, and for English ones outside the US,
GetACP() = 1252 and GetOEMCP() = 850. (US English installations
typically have GetOEMCP() = 437.)
For many years, output of non-ASCII characters to consoles was a PITA:
While the program had to produce output in GetACP() encoding when
writing to files, it had to produce output in GetOEMCP() encoding when
writing to a console. The majority of programs did not do this: they
produced output in GetACP() encoding always, and thus non-ASCII
characters got garbled in consoles.
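To make the mismatch concrete, here is a minimal sketch of what a
program would have had to do for a few German letters when GetACP() is
1252 and GetOEMCP() is 850. The function names and the tiny table are
hypothetical and for illustration only; a real converter has to cover
the complete code pages.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only: maps the CP1252 bytes of
   four German letters to the corresponding CP850 bytes.  ASCII bytes are
   the same in both code pages.  Other non-ASCII bytes are passed through
   unchanged here, which a real converter must not do.  */
static unsigned char
cp1252_to_cp850_sample (unsigned char c)
{
  switch (c)
    {
    case 0xE4: return 0x84; /* a with diaeresis */
    case 0xF6: return 0x94; /* o with diaeresis */
    case 0xFC: return 0x81; /* u with diaeresis */
    case 0xDF: return 0xE1; /* sharp s */
    default:   return c;
    }
}

/* Converts the NUL-terminated string S in place.  */
static void
convert_sample (char *s)
{
  assert (s != NULL);
  for (; *s != '\0'; s++)
    *s = (char) cp1252_to_cp850_sample ((unsigned char) *s);
}
```

A program that skips this step and sends the CP1252 byte 0xF6 to a
CP850 console does not get an o with diaeresis, because CP850 expects
0x94 for that letter.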
After many, many years, Microsoft finally added a workaround in the
C runtime library (msvcrt and ucrt). When a program writes a string
to a console, the runtime library tests whether the output goes
to a console, and if yes, it does a conversion from GetACP() encoding
to GetOEMCP() encoding on the fly, in two steps: from GetACP() to UTF-16
via MultiByteToWideChar, then to GetOEMCP() via WideCharToMultiByte.
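The two-step conversion described above can be sketched as follows.
This is not the runtime library's actual code, only an illustration of
the two API calls involved. The helper name acp_to_oemcp, the fixed
256-element intermediate buffer, and the pass-through branch for
non-Windows platforms are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#if defined _WIN32 && !defined __CYGWIN__
# include <windows.h>
#endif

/* Hypothetical helper: converts the SRCLEN bytes at SRC from GetACP()
   encoding to GetOEMCP() encoding and stores the result in DST, which
   has room for DSTSIZE bytes.  Returns the number of bytes stored, or
   -1 upon failure.  */
static int
acp_to_oemcp (const char *src, int srclen, char *dst, int dstsize)
{
  assert (src != NULL && dst != NULL);
  if (srclen <= 0 || srclen > 256)
    return -1;
#if defined _WIN32 && !defined __CYGWIN__
  /* Step 1: GetACP() encoding -> UTF-16.  */
  wchar_t wbuf[256];
  int wlen = MultiByteToWideChar (GetACP (), 0, src, srclen, wbuf, 256);
  if (wlen == 0)
    return -1;
  /* Step 2: UTF-16 -> GetOEMCP() encoding.  */
  int n = WideCharToMultiByte (GetOEMCP (), 0, wbuf, wlen,
                               dst, dstsize, NULL, NULL);
  return n > 0 ? n : -1;
#else
  /* Assumption of this sketch: on other platforms there is no such
     distinction, so the bytes are copied unchanged.  */
  if (srclen > dstsize)
    return -1;
  memcpy (dst, src, (size_t) srclen);
  return srclen;
#endif
}
```

Note that step 1 needs complete characters as input: if SRC ends in
the middle of a double-byte character, MultiByteToWideChar cannot
convert that last byte meaningfully.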
This workaround works fine in ucrt. But in msvcrt it has two bugs.
Both occur only when all of the following conditions hold:
  - The output goes to a console. (No bug when the output goes to a file.)
  - The stream's mode is _O_TEXT. (This is the default for stdout
    and stderr. No bug when the stream's mode is _O_BINARY.)
  - setlocale() has been called before. (No bug if setlocale() is not
    called, that is, when the locale remains the "C" locale.)
  - The chosen locale has a double-byte encoding, such as CP932.
    (No bug for single-byte locale encodings, such as CP1252.)
  - The console's codepage matches the locale's encoding, for
    example, after 'chcp 932' was executed.
Bug 1:
When the application outputs double-byte characters one byte at
a time, using the functions fputc() or putc(), the console shows
JIS X 0201 (ASCII and Katakana) characters instead of CP932 (ASCII,
Katakana, Hiragana, Kanji) characters.
How to reproduce:
1. Use Windows 10 or 11. Switch it to Japanese as the main language.
2. Use the attached program. In the dev environment:
     $ gcc -Wall foo.c
3. In a cmd.exe console:
     $ chcp 932
     $ .\a
   Look at the output of parts C and D.
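Why byte-at-a-time output is special: in CP932 a character is either
a single byte or a lead byte followed by a trail byte, so a single
fputc() call with a lead byte hands the runtime only half a character.
The following sketch shows how the bytes of a CP932 string group into
characters. The function names are hypothetical; the lead byte ranges
0x81..0x9F and 0xE0..0xFC are those of CP932.

```c
#include <assert.h>
#include <stddef.h>

/* Returns nonzero if C is the first byte of a double-byte character
   in CP932.  */
static int
cp932_is_lead_byte (unsigned char c)
{
  return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Hypothetical helper: counts the characters (not bytes) in the
   NUL-terminated CP932 string S.  A lead byte at the very end of the
   string, without a trail byte, is counted as one character.  */
static size_t
cp932_count_chars (const char *s)
{
  size_t count = 0;
  assert (s != NULL);
  while (*s != '\0')
    {
      if (cp932_is_lead_byte ((unsigned char) *s) && s[1] != '\0')
        s += 2;
      else
        s += 1;
      count++;
    }
  return count;
}
```

Applied to the Japanese test string of the attached program, this
yields 10 characters for 18 bytes: eight double-byte characters, a
colon, and a newline.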
Bug 2:
When the application outputs a string that starts with a non-ASCII
character, using the function fwrite(), the console shows no output,
and the stream's error indicator gets set.
How to reproduce:
1. Use Windows 10 or 11. Switch it to Japanese as the main language.
2. Use the attached program. In the dev environment:
     $ gcc -Wall foo.c
3. In a cmd.exe console:
     $ chcp 932
     $ .\a
   Look at the output of parts E and F.
I don't plan to add workarounds for these bugs to Gnulib, because
* Normal applications don't write strings one byte at a time, for
speed.
* Normal applications use fwrite() for binary I/O and fputs() or
[v][f]printf or similar for text I/O.
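For an application that holds text in a buffer with an explicit length,
following this convention could look like the sketch below. The helper
name write_text is hypothetical. The only evidence that this avoids
Bug 2 is that fputs() produced correct output in parts A and B of the
attached program, so treat it as an untested suggestion rather than a
verified workaround.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: writes the N bytes at BUF as text, through
   fputs() instead of fwrite().  BUF must not contain NUL bytes.
   Returns 0 upon success, -1 upon failure.  */
static int
write_text (const char *buf, size_t n, FILE *stream)
{
  assert (buf != NULL && stream != NULL);
  /* fputs() needs a NUL-terminated string, so make a terminated copy.  */
  char *copy = malloc (n + 1);
  if (copy == NULL)
    return -1;
  memcpy (copy, buf, n);
  copy[n] = '\0';
  int ret = (fputs (copy, stream) == EOF ? -1 : 0);
  free (copy);
  return ret;
}
```

A call such as write_text (text, strlen (text), stdout) then replaces
fwrite (text, 1, strlen (text), stdout) for text output, while fwrite()
remains in use for binary data.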
If anyone wants these bugs fixed, they will have to build their
application against ucrt instead of msvcrt. The MSYS2 project
contains tools and libraries for mingw+ucrt. (Btw, building with
ucrt instead of msvcrt also has the benefit of supporting the
UTF-8 locales of Windows. [1][2])
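To check which runtime a given build targets, a compile-time test along
the following lines may help. It rests on the assumption that the C
runtime headers define the macro _UCRT when the build targets ucrt;
please verify this against your toolchain's headers. The helper name
report_c_runtime is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: prints to OUT which kind of C runtime this
   program was compiled against.  Returns the number of characters
   printed, or a negative value upon error.
   Assumption: the runtime headers define _UCRT when targeting ucrt.  */
static int
report_c_runtime (FILE *out)
{
  assert (out != NULL);
#if defined _WIN32 && !defined __CYGWIN__
# if defined _UCRT
  return fprintf (out, "native Windows, ucrt\n");
# else
  return fprintf (out, "native Windows, not ucrt (e.g. msvcrt)\n");
# endif
#else
  return fprintf (out, "not native Windows\n");
#endif
}
```

Calling report_c_runtime (stderr) at program startup makes it visible
whether the two bugs described here can occur in that binary at all.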
[1] https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale
"Starting in Windows 10 version 1803 (10.0.17134.0), the Universal C
Runtime supports using a UTF-8 code page."
[2] https://lists.gnu.org/archive/html/bug-gnulib/2024-12/msg00159.html
2025-09-16 Bruno Haible <[email protected]>
Document msvcrt (native Windows) bugs regarding console output.
* doc/posix-functions/fputc.texi: Document a bug found in msvcrt.
* doc/posix-functions/putc.texi: Likewise.
* doc/posix-functions/fwrite.texi: Document another bug found in msvcrt.
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <locale.h>
#include <fcntl.h>
/* These rpl_* functions disable possible gcc optimizations.  */

int
rpl_printf (const char *format, ...)
{
  int retval;
  va_list args;
  va_start (args, format);
  retval = vfprintf (stdout, format, args);
  va_end (args);
  return retval;
}

int
rpl_fprintf (FILE *stream, const char *format, ...)
{
  int retval;
  va_list args;
  va_start (args, format);
  retval = vfprintf (stream, format, args);
  va_end (args);
  return retval;
}

int
rpl_vprintf (const char *format, va_list args)
{
  return vfprintf (stdout, format, args);
}

int
rpl_vfprintf (FILE *stream, const char *format, va_list args)
{
  return vfprintf (stream, format, args);
}

int
rpl_putchar (int c)
{
  return fputc (c, stdout);
}

int
rpl_fputc (int c, FILE *stream)
{
  return fputc (c, stream);
}

int
rpl_fputs (const char *string, FILE *stream)
{
  return fputs (string, stream);
}

int
rpl_puts (const char *string)
{
  return puts (string);
}

size_t
rpl_fwrite (const void *ptr, size_t s, size_t n, FILE *stream)
{
  return fwrite (ptr, s, n, stream);
}

int
main (int argc, char *argv[])
{
  // When the output is redirected to a file, all outputs are correct.
  // When the program is compiled with ucrt (as opposed to msvcrt),
  // all outputs are correct.

  // When the mode is set to _O_BINARY, all outputs are correct.
  //_setmode (1, _O_BINARY);

  const char *text;
  if (1)
    {
      // When setlocale is not called, all outputs are correct.
      setlocale (LC_ALL, "Japanese_Japan.932");
      text = "\203\111\203\166\203\126\203\207\203\223\202\306\210\370\220\224\072\n";
    }
  else
    // When a single-byte locale (e.g. CP1252) is used, all outputs are correct.
    {
      setlocale (LC_ALL, "German_Germany.1252");
      text = "B\366se B\374bchen tun Bu\337e\n";
    }

  //   __USE_MINGW_ANSI_STDIO=0   __USE_MINGW_ANSI_STDIO=1

  puts ("A");
  clearerr (stdout);
  //   correct                    correct
  fputs (text, stdout);
  fflush (stdout);
  if (ferror (stdout)) puts ("ferror() -> true!");

  puts ("B");
  clearerr (stdout);
  //   correct                    correct
  rpl_fputs (text, stdout);
  fflush (stdout);
  if (ferror (stdout)) puts ("ferror() -> true!");

  puts ("C");
  clearerr (stdout);
  //   garbled                    garbled
  for (const char *s = text; *s; s++)
    fputc (*s, stdout);
  fflush (stdout);
  if (ferror (stdout)) puts ("ferror() -> true!");

  puts ("D");
  clearerr (stdout);
  //   garbled                    garbled
  for (const char *s = text; *s; s++)
    rpl_fputc (*s, stdout);
  fflush (stdout);
  if (ferror (stdout)) puts ("ferror() -> true!");

  puts ("E");
  clearerr (stdout);
  //   no output                  no output
  fwrite (text, 1, strlen (text), stdout);
  fflush (stdout);
  if (ferror (stdout)) puts ("ferror() -> true!");

  puts ("F");
  clearerr (stdout);
  //   no output                  no output
  rpl_fwrite (text, 1, strlen (text), stdout);
  fflush (stdout);
  if (ferror (stdout)) puts ("ferror() -> true!");

  puts ("G");
  clearerr (stdout);
  //   correct                    garbled
  fprintf (stdout, "%s%s%s", "", text, "");
  fflush (stdout);
  if (ferror (stdout)) puts ("ferror() -> true!");

  puts ("H");
  clearerr (stdout);
  //   correct                    garbled
  rpl_fprintf (stdout, "%s%s%s", "", text, "");
  fflush (stdout);
  if (ferror (stdout)) puts ("ferror() -> true!");

  // Whenever the result is "garbled" or "no output",
  // the stream's error indicator is set, i.e. ferror() returns true.

  exit (EXIT_SUCCESS);
}
From 901563ae363e4816b9b7ecdb154910e18b6052ca Mon Sep 17 00:00:00 2001
From: Bruno Haible <[email protected]>
Date: Tue, 16 Sep 2025 17:08:44 +0200
Subject: [PATCH] Document msvcrt (native Windows) bugs regarding console
output.
* doc/posix-functions/fputc.texi: Document a bug found in msvcrt.
* doc/posix-functions/putc.texi: Likewise.
* doc/posix-functions/fwrite.texi: Document another bug found in msvcrt.
---
ChangeLog | 7 +++++++
doc/posix-functions/fputc.texi | 6 ++++++
doc/posix-functions/fwrite.texi | 6 ++++++
doc/posix-functions/putc.texi | 6 ++++++
4 files changed, 25 insertions(+)
diff --git a/ChangeLog b/ChangeLog
index 73b7ff269c..d6a4df6cd0 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2025-09-16 Bruno Haible <[email protected]>
+
+ Document msvcrt (native Windows) bugs regarding console output.
+ * doc/posix-functions/fputc.texi: Document a bug found in msvcrt.
+ * doc/posix-functions/putc.texi: Likewise.
+ * doc/posix-functions/fwrite.texi: Document another bug found in msvcrt.
+
2025-09-16 Bruno Haible <[email protected]>
strerror_r: Ensure a trailing NUL when truncating.
diff --git a/doc/posix-functions/fputc.texi b/doc/posix-functions/fputc.texi
index de80da596c..892243ae87 100644
--- a/doc/posix-functions/fputc.texi
+++ b/doc/posix-functions/fputc.texi
@@ -32,6 +32,12 @@
On Windows platforms (excluding Cygwin), this function does not set @code{errno}
upon failure.
@item
+This function fails and produces garbled output
+when invoked twice, for outputting a non-ASCII character in double-byte encoding,
+corresponding to the locale, on some platforms:
+mingw in combination with msvcrt,
+when the output goes to a Windows console.
+@item
On some platforms, this function does not set @code{errno} or the
stream error indicator on attempts to write to a read-only stream:
Cygwin 1.7.9.
diff --git a/doc/posix-functions/fwrite.texi b/doc/posix-functions/fwrite.texi
index 5cd99e5940..3922a9f1db 100644
--- a/doc/posix-functions/fwrite.texi
+++ b/doc/posix-functions/fwrite.texi
@@ -32,6 +32,12 @@
On Windows platforms (excluding Cygwin), this function does not set @code{errno}
upon failure.
@item
+This function fails and produces no output
+when the argument string starts with a non-ASCII character in double-byte encoding,
+corresponding to the locale, on some platforms:
+mingw in combination with msvcrt,
+when the output goes to a Windows console.
+@item
On some platforms, this function does not set @code{errno} or the
stream error indicator on attempts to write to a read-only stream:
Cygwin 1.7.9.
diff --git a/doc/posix-functions/putc.texi b/doc/posix-functions/putc.texi
index aec5b7e7d8..e956601631 100644
--- a/doc/posix-functions/putc.texi
+++ b/doc/posix-functions/putc.texi
@@ -32,6 +32,12 @@
On Windows platforms (excluding Cygwin), this function does not set @code{errno}
upon failure.
@item
+This function fails and produces garbled output
+when invoked twice, for outputting a non-ASCII character in double-byte encoding,
+corresponding to the locale, on some platforms:
+mingw in combination with msvcrt,
+when the output goes to a Windows console.
+@item
On some platforms, this function does not set @code{errno} or the
stream error indicator on attempts to write to a read-only stream:
Cygwin 1.7.9.
--
2.50.1