https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71500
Michael Duggan <mwd at md5i dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |mwd at md5i dot com
--- Comment #1 from Michael Duggan <mwd at md5i dot com> ---
I can confirm this issue exists in Debian's libstdc++-6-dev (6.1.1-5) package.
I've done some tracing, and here is what I have been able to determine:
(All of the below refers to functions in bits/regex_compiler.tcc.)
When std::__detail::_BracketMatcher<std::__cxx11::regex_traits<char>, true,
false>::_M_apply is called, _M_char_set contains just {'a'}, and _M_range_set
contains {{first='A', second='F'}}.
Before a character is looked up in _M_char_set, it is lower-cased (because
__icase is true). This is why both 'A' and 'a' succeed.
When looking up 'F', the character is not found in the _M_char_set, so the
_M_range_set is checked. I don't know what the purpose of
_M_translator._M_transform(__ch) is, but since __collate is false, it does
nothing, leaving the character (__s) as 'F'. It then checks that 'F' is
between 'A' and 'F', which is true. Success.
When looking up 'f', the character is not found in the _M_char_set, so the
_M_range_set is checked. 'f' is not found to be between 'A' and 'F', so the
match fails.
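The lookup order traced above can be modelled with a small standalone sketch.
This is not the libstdc++ code: ToyBracketMatcher, its members and
traced_state() are hypothetical names, and the struct only mimics the behaviour
described in this comment (the single-character lookup lower-cases the
candidate, the range lookup compares it unchanged).

```cpp
#include <cctype>
#include <set>
#include <utility>
#include <vector>

// Toy stand-in for the matcher state described above (hypothetical type,
// not std::__detail::_BracketMatcher).
struct ToyBracketMatcher {
    std::set<char> char_set;                       // models _M_char_set
    std::vector<std::pair<char, char>> range_set;  // models _M_range_set
    bool icase = true;                             // models __icase

    bool apply(char ch) const {
        // Single characters: the candidate is lower-cased before the lookup
        // when icase is set.
        char key = ch;
        if (icase)
            key = static_cast<char>(
                std::tolower(static_cast<unsigned char>(ch)));
        if (char_set.count(key) != 0)
            return true;

        // Ranges: the candidate is compared unchanged, mirroring the no-op
        // transform seen when __collate is false.
        for (const auto& r : range_set)
            if (r.first <= ch && ch <= r.second)
                return true;
        return false;
    }
};

// Builds the state observed in the trace: {'a'} and {{'A', 'F'}}.
inline ToyBracketMatcher traced_state() {
    ToyBracketMatcher m;
    m.char_set = {'a'};
    m.range_set = {{'A', 'F'}};
    return m;
}
```

With this state, apply() accepts 'A', 'a' and 'F' but rejects 'f', which is the
asymmetry reported here.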
When the regex is case insensitive, I believe the following has to happen.
Since it is mostly futile to lower-case a range ([T-f], for example), I think a
candidate char should probably be lower-cased and checked against the range
set, and if that fails, upper-cased and checked against the range set again.
That said, any solution that works would be good.
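A minimal sketch of the suggestion above, assuming plain char ranges and the
C locale. The names Range, in_any_range and icase_range_match are made up for
illustration and are not part of libstdc++; an actual fix might look quite
different.

```cpp
#include <cctype>
#include <utility>
#include <vector>

using Range = std::pair<char, char>;  // inclusive {first, second}

// True if ch lies inside at least one of the ranges.
inline bool in_any_range(const std::vector<Range>& ranges, char ch) {
    for (const auto& r : ranges)
        if (r.first <= ch && ch <= r.second)
            return true;
    return false;
}

// Case-insensitive range test: try the lower-cased candidate first, then
// fall back to the upper-cased candidate. The ranges themselves are left
// untouched.
inline bool icase_range_match(const std::vector<Range>& ranges, char ch) {
    const unsigned char u = static_cast<unsigned char>(ch);
    const char lower = static_cast<char>(std::tolower(u));
    if (in_any_range(ranges, lower))
        return true;
    const char upper = static_cast<char>(std::toupper(u));
    return in_any_range(ranges, upper);
}
```

With the range {'A', 'F'}, both 'F' and 'f' are accepted and 'g' is rejected.
With {'T', 'f'}, 't' is accepted through the upper-case fallback.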
(Note: The calls to _M_apply in the test case will happen when building the
_BracketMatcher's _M_cache, not when the actual regex_match happens.)