https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115064

            Bug ID: 115064
           Summary: Possible problem with money_put
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: lcarreon at bigpond dot net.au
  Target Milestone: ---

I have been experimenting with std::money_put and discovered the following:

#include <locale>
#include <iostream>
#include <iomanip>
#include <sstream>
#include <format>

int main()
{
  std::locale loc{std::locale{"de_DE.utf8"}};
  std::cout.imbue(loc);

  auto money = 123456789.0L;

  {
    const auto& money_put =
      std::use_facet<std::money_put<char>>(std::cout.getloc());
    std::cout << "\"" << std::showbase << std::setw(20) << std::right;
    money_put.put(std::cout, false, std::cout, '*', money);
    std::cout << "\"" << std::endl;
  }

  {
    std::ostringstream str;
    str.imbue(loc);
    const auto& money_put = std::use_facet<std::money_put<char>>(str.getloc());
    str << std::showbase << std::setw(20) << std::right;
    money_put.put(str, false, str, '*', money);
    str << std::endl;
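    for (const auto& ch : str.str())
    {
      // Cast so each octet is printed as an unsigned value (e.g. e2, not a
      // negative number) regardless of whether plain char is signed.
      std::cout << std::format("{:02x} ", static_cast<unsigned char>(ch));
    }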
    std::cout << std::endl;
  }

  return 0;
}

I compiled the above program with g++ 14.0.1 in C++20 mode. When I run the
program in a terminal whose LANG is set to en_AU.utf8, I get the following
result:

"****1.234.567,89*€"
2a 2a 2a 2a 31 2e 32 33 34 2e 35 36 37 2c 38 39 2a e2 82 ac 0a

The first line above is the problematic one because it shows only 18 characters
inside the double quotes instead of 20. The second line explains why: the euro
symbol is encoded as three octets (e2 82 ac), and each octet counts toward the
field width set by std::setw(20), so the padding comes out two characters short.

I revised the above program using wchar_t and std::wcout as follows:

#include <locale>
#include <iostream>
#include <iomanip>
#include <sstream>
#include <format>

int main()
{
  std::locale loc{std::locale{"de_DE.utf8"}};
  std::wcout.imbue(loc);

  auto money = 123456789.0L;

  {
    const auto& money_put =
      std::use_facet<std::money_put<wchar_t>>(std::wcout.getloc());
    std::wcout << L"\"" << std::showbase << std::setw(20) << std::right;
    money_put.put(std::wcout, false, std::wcout, L'*', money);
    std::wcout << L"\"" << std::endl;
  }

  {
    std::wostringstream str;
    str.imbue(loc);
    const auto& money_put =
      std::use_facet<std::money_put<wchar_t>>(str.getloc());
    str << std::showbase << std::setw(20) << std::right;
    money_put.put(str, false, str, L'*', money);
    str << std::endl;
    for (const auto& ch : str.str())
    {
      std::wcout << std::format(L"{:02x} ", static_cast<unsigned>(ch));
    }
    std::wcout << std::endl;
  }

  return 0;
}

Executing the above program on the same terminal, I get the following:

"******1.234.567,89*EUR"
2a 2a 2a 2a 2a 2a 31 2e 32 33 34 2e 35 36 37 2c 38 39 2a 20ac 0a

The first line above now has 22 characters instead of 20. The second line
shows the wide characters that money_put wrote to the stream. It seems the
euro symbol (20ac) is displayed by the terminal as the three letters "EUR"
instead of the single glyph "€".

I understand that the terminal is set for UTF-8, which explains the correct
display of the euro symbol in char mode. But why is "EUR" displayed in wchar_t
mode?
