https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63776
--- Comment #8 from Tim Shen <timshen at gcc dot gnu.org> ---
I'm not sure how you call boost::regex in your code; here's what I did:

// g++ b.cc -lboost_regex -licuuc
#include <boost/regex/icu.hpp>
#include <boost/regex.hpp>
#include <iostream>
#include <string>

using namespace boost;

int main() {
  std::locale loc("en_US.UTF-8");  // constructed but not otherwise used
  std::string s(u8"Ī");
  u32regex re = make_u32regex("[[:alpha:]]");
  // char iterators are treated as UTF-8 and decoded by u32regex_match
  std::cout << u32regex_match(s.data(), s.data() + s.size(), re) << "\n";
  return 0;
}

If this is how we do UTF-8 matching with Boost, then I don't think
std::regex_match and boost::u32regex_match (notice that it's not
boost::regex_match) have the same semantics.

A user who calls boost::u32regex_match explicitly tells the library: "I want
a Unicode match here; here's my regex object, of type u32regex; please do the
decoding and matching for me." u32regex is actually
boost::basic_regex< ::UChar32, icu_regex_traits>, with a library-defined
regex_traits. u32regex_match, in turn, takes no user-defined regex_traits type,
only a u32regex.

I don't think std::regex_match<BiIter, Alloc, char, RegexTraits> should be
responsible for decoding a char string into a wchar_t string and then calling
std::regex_match<AnotherBiIter, AnotherAlloc, wchar_t,
std::regex_traits<wchar_t>>, which would leave the user-defined RegexTraits
potentially unused. Instead, the user can manually decode the UTF-8 string (I'm
sad we don't have a standard char iterator adaptor that converts a UTF-8 char
iterator to a char32_t iterator) and call std::regex_match<..., wchar_t, ...>.

This is my understanding, so it's certainly possible that I'm missing
something. Thoughts?