https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71500
--- Comment #8 from Tim Shen <timshen at gcc dot gnu.org> ---
(In reply to Michael Duggan from comment #7)
> Hmm... Okay.  For the sake of argument, I am going to make the
> following claims:

Yeah, thanks for the arguments, we should at least get this clear.

> a) Ignoring case _requires_ a locale.  Without a locale, how do you
>    determine the case of a character anyway?  Especially if the
>    character is not a char?  Even std::toupper and std::towupper are
>    bound to the "C" locale.
>
>    As such, you can't say you're not to use the locale when collate is
>    off.  Otherwise icase wouldn't work ever unless collate was
>    specified.

Not necessarily. My interpretation of "icase" is that it decides whether regex
should call traits_type::translate or traits_type::translate_nocase when
comparing two characters. No locale is directly involved - a user-defined
regex traits type may ignore the locale and inject its own weird stuff into
translate_nocase. It's just that, not surprisingly, std::regex_traits<T>
"happens to" use the locale to implement translate_nocase.

> b) You can still use regex_traits::transform in an icase scenario.  The
>    rule for [A-B], matching X should be:
>
>      transform(a) <= transform(x) <= transform(b)
>
>    In the icase scenario, I would posit that the result of this _or_
>
>      transform(a) <= transform(toggle_case(x)) <= transform(b)
>
>    is correct.

+1. I looked at the Boost source; it seems that "collate" is interpreted as
whether regex should call traits_type::transform when doing range matching:
https://github.com/boostorg/regex/blob/9059bfb5c6287b0c579bfa4be5160b44c8cc2957/include/boost/regex/v4/perl_matcher.hpp#L198

I don't like the way the documentation puts it, but the definition above seems
accurate and clear. transform/collate seems to be an unrelated concern.
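To make the "icase only selects the traits member" reading concrete, here is a minimal sketch. WeirdTraits and chars_match are hypothetical names made up for illustration, not libstdc++ internals; the sketch only assumes the documented std::regex_traits members translate and translate_nocase.

```cpp
#include <regex>

// Hypothetical user-defined traits: case folding is hand-written for ASCII
// and additionally treats '_' and '-' as the same character. The imbued
// locale is never consulted by translate_nocase.
struct WeirdTraits : std::regex_traits<char> {
    char translate_nocase(char c) const {
        if (c >= 'A' && c <= 'Z') return static_cast<char>(c - 'A' + 'a');
        if (c == '_') return '-';
        return c;
    }
};

// Hypothetical helper: "icase" only decides which traits member is called
// before the two characters are compared.
template <class Traits>
bool chars_match(const Traits& traits, char a, char b, bool icase) {
    if (icase)
        return traits.translate_nocase(a) == traits.translate_nocase(b);
    return traits.translate(a) == traits.translate(b);
}
```

With WeirdTraits, '_' and '-' compare equal under icase even though no locale would call them case variants; with std::regex_traits<char> they do not.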
Assuming the above two claims hold, my point is that the way I can think of to
implement toggle_case:

    char_type
    basic_regex<...>::toggle_case(char_type c)
    {
      const auto& ctype =
        std::use_facet<std::ctype<char_type>>(this->_M_traits.getloc());
      auto lower = ctype.tolower(c);
      if (lower == c)
        return ctype.toupper(c);
      return lower;
    }

uses the locale's definition of uppercase/lowercase, not translate_nocase's
definition. This result might surprise a picky user.

There is a correct way to implement this:

    if (icase)
      c = traits.translate_nocase(c);
    else
      c = traits.translate(c);
    for (pair<char_type, char_type> range : regex.range_set)
      for (char_type e = range.first; ; ++e)
        {
          // Translate a copy so the loop variable itself is not modified.
          char_type t = icase ? traits.translate_nocase(e)
                              : traits.translate(e);
          if (collate)
            {
              // Shorthand: the real transform() takes an iterator range.
              if (traits.transform(t) == traits.transform(c))
                return true;
            }
          else if (t == c)
            return true;
          // Inclusive upper bound, tested here so that range.second + 1
          // is never computed.
          if (e == range.second)
            break;
        }
    return false;

But it's terribly inefficient and not practical at all.
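For reference, the brute-force approach can be written out as a self-contained function. This is only a sketch under stated assumptions: brute_force_range_match and its parameters are hypothetical names, the ranges are passed in as a plain vector instead of coming from a compiled regex, and char_type is fixed to char. It calls only the documented regex_traits members, including the iterator-range form of transform.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns true if c falls in any of the inclusive
// ranges, comparing via the traits object and nothing else.
template <class Traits>
bool brute_force_range_match(const Traits& traits,
                             const std::vector<std::pair<char, char>>& ranges,
                             char c, bool icase, bool collate) {
    // icase selects translate_nocase, otherwise translate.
    auto fold = [&](char ch) {
        return icase ? traits.translate_nocase(ch) : traits.translate(ch);
    };
    const char target = fold(c);
    const std::string target_str(1, target);

    for (const auto& range : ranges) {
        // Walk every character of the range; an int counter avoids
        // overflow when the upper bound is the largest char value.
        for (int i = static_cast<unsigned char>(range.first);
             i <= static_cast<unsigned char>(range.second); ++i) {
            const char e = fold(static_cast<char>(i));
            if (collate) {
                // transform() takes an iterator range, so wrap each
                // character in a one-element string.
                const std::string e_str(1, e);
                if (traits.transform(e_str.begin(), e_str.end()) ==
                    traits.transform(target_str.begin(), target_str.end()))
                    return true;
            } else if (e == target) {
                return true;
            }
        }
    }
    return false;
}
```

Every lookup costs one translate call (plus one transform call when collate is set) per character in every range, which is why this is impractical for wide character types.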