https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118105

            Bug ID: 118105
           Summary: std::regex_traits::transform_primary is not correct
                    and might be unimplementable
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: redi at gcc dot gnu.org
            Blocks: 102445
  Target Milestone: ---

Our code says:

       * Effects: if typeid(use_facet<collate<_Ch_type> >) ==
       * typeid(collate_byname<_Ch_type>) and the form of the sort key
       * returned by collate_byname<_Ch_type>::transform(__first, __last)
       * is known and can be converted into a primary sort key
       * then returns that key, otherwise returns an empty string.
       *
       * @todo Implement this function correctly.
       */
      template<typename _Fwd_iter>
        string_type
        transform_primary(_Fwd_iter __first, _Fwd_iter __last) const
        {
          // TODO : this is not entirely correct.
          // This function requires extra support from the platform.
          //
          // Read http://gcc.gnu.org/ml/libstdc++/2013-09/msg00117.html and
          // http://www.open-std.org/Jtc1/sc22/wg21/docs/papers/2003/n1429.htm
          // for details.
          typedef std::ctype<char_type> __ctype_type;
          const __ctype_type& __fctyp(use_facet<__ctype_type>(_M_locale));
          _GLIBCXX_STD_C::vector<char_type> __s(__first, __last);
          __fctyp.tolower(__s.data(), __s.data() + __s.size());
          return this->transform(__s.data(), __s.data() + __s.size());
        }


N1429 says:

Note also that there is no portable way to implement transform_primary in terms
of std::locale, since even if the sort key format returned by
std::collate_byname<>::transform is known and can be converted into a primary
sort key, the user can still install their own custom std::collate
implementation into the locale object used, and that can use any sort key
format they see fit. The transform_primary member function is therefore more of
use to custom traits classes, and should throw an exception if it cannot be
implemented for a particular locale. Unfortunately this significantly reduces
the usefulness of POSIX style equivalence classes within regular expressions,
but that cannot be fixed without modifying the std::collate facet. Note that
primary sort keys can not be obtained by converting to all lower case and then
obtaining a regular sort key: primary keys take into account only the primary
character shape, case, accentation and locale specific tailoring are not taken
into account, so for example the characters "AÀÁÂÃÄÅaàáâãäå" should all produce
the same primary sort key. 
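
To illustrate the last point of that quote, here is a minimal sketch of the "lowercase, then take a regular sort key" approach that our code currently uses. It is not from N1429. The helper name lowercase_sort_key is hypothetical, and the sketch assumes the classic "C" locale and a UTF-8 encoded narrow string for the accented character.

```cpp
#include <locale>
#include <string>
#include <vector>

// Hypothetical helper (not a libstdc++ name): lowercase the input with the
// locale's ctype facet, then return the regular sort key from its collate
// facet. This mirrors the structure of the current transform_primary body.
inline std::string
lowercase_sort_key(const std::string& in,
                   const std::locale& loc = std::locale::classic())
{
  const std::ctype<char>& ct = std::use_facet<std::ctype<char>>(loc);
  const std::collate<char>& coll = std::use_facet<std::collate<char>>(loc);

  std::vector<char> s(in.begin(), in.end());
  ct.tolower(s.data(), s.data() + s.size());
  return coll.transform(s.data(), s.data() + s.size());
}
```

With this helper, "A" and "a" map to the same key, but "a" and "\xC3\xA0" (UTF-8 for à) do not, because lowercasing folds case only and the regular sort key still distinguishes accents. A real primary sort key would have to treat them as equivalent.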

This should probably not have been added to the standard!

Anyway, we should at least do the typeid check that the standard requires, so
that we know whether collate::transform can be used.
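
A rough sketch of what that check could look like, written as a standalone helper rather than as a patch. The names uses_collate_byname and custom_collate are made up for this example; an actual fix would sit inside regex_traits and query _M_locale.

```cpp
#include <locale>
#include <typeinfo>

// Hypothetical helper: true only if the collate facet installed in `loc`
// has exactly the dynamic type collate_byname<CharT>, which is the
// condition the standard's Effects clause tests.
template<typename CharT>
bool
uses_collate_byname(const std::locale& loc)
{
  const std::collate<CharT>& fac = std::use_facet<std::collate<CharT>>(loc);
  // collate is polymorphic, so typeid on the reference yields the dynamic type.
  return typeid(fac) == typeid(std::collate_byname<CharT>);
}

// Hypothetical user-defined facet, standing in for the "custom std::collate
// implementation" that N1429 warns about. Its sort key format is unknown.
struct custom_collate : std::collate<char>
{ };
```

A locale built with `new std::collate_byname<char>("C")` passes the check, while one built with `new custom_collate` does not, so transform_primary would return an empty string for the latter. Passing the check is only the first half of the requirement: the sort key format still has to be known and convertible to a primary key.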


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102445
[Bug 102445] [meta-bug] std::regex issues
