https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82366
Karol Zwolak <karolzwolak7 at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |karolzwolak7 at gmail dot com

--- Comment #13 from Karol Zwolak <karolzwolak7 at gmail dot com> ---

I recently encountered a very similar crash when using `std::regex`. I'm not sure whether this is a GCC/libstdc++ bug or just a consequence of inconsistent compilation settings, but here is what I found.

The crash originates from code like this in `locale_classes.tcc` (these links point to GCC 13, but the logic is similar in earlier versions):

https://github.com/gcc-mirror/gcc/blob/97454afb368f79783e99eafee009c88aa4e16845/libstdc++-v3/include/bits/locale_classes.tcc#L94-L95
https://github.com/gcc-mirror/gcc/blob/97454afb368f79783e99eafee009c88aa4e16845/libstdc++-v3/include/bits/locale_classes.tcc#L139-L140
https://github.com/gcc-mirror/gcc/blob/97454afb368f79783e99eafee009c88aa4e16845/libstdc++-v3/include/bits/locale_classes.tcc#L200-L203

Relevant excerpt:

```cpp
template<typename _Facet>
  inline const _Facet*
  __try_use_facet(const locale& __loc) _GLIBCXX_NOTHROW
  {
    const size_t __i = _Facet::id._M_id();
    const locale::facet** __facets = __loc._M_impl->_M_facets;
    if (__i >= __loc._M_impl->_M_facets_size || !__facets[__i])
      return 0;
    return static_cast<const _Facet*>(__facets[__i]);
  }

template<typename _Facet>
  inline const _Facet&
  use_facet(const locale& __loc)
  {
    if (const _Facet* __f = std::__try_use_facet<_Facet>(__loc))
      return *__f;
    __throw_bad_cast();
  }
```

In my case, the crash occurred because `use_facet` failed with `std::bad_cast` due to a mismatch in the facet ID values between the binary and a shared library.

Root Cause:

* Both the main binary and a shared library use components (such as `std::regex`) that depend on `std::locale` facets.
* Facets are initialized once, and each has a globally unique ID (`_M_id`).
* If these components do not resolve the facet ID to the same symbol, they can end up with different copies of it (`std::ctype<char>::id`, for example).
* In my case, the binary had its own copy of `std::ctype<char>::id` in the `.bss` section (symbol type `B`), while the library referenced the symbol dynamically (`U`, undefined).
* As a result, when the library tried to access the facet, it got an ID (`_M_id`) that didn’t match the facet table, leading to a `std::bad_cast`.

Diagnosis:

You can check for inconsistent symbols like this:

```
nm -C yourlib.so | grep 'ctype<char>::id'
nm -C your_binary | grep 'ctype<char>::id'
```

Or, if your crash is in `collate<char>`:

```
nm -C yourlib.so | grep 'collate<char>::id'
nm -C your_binary | grep 'collate<char>::id'
```

If the symbol appears with `B` in one and `U` in the other, you may have this mismatch. It’s also possible that both define the symbol, which can still lead to divergent facet IDs.

You can also verify at runtime:

```cpp
// Needs <cstdio> and <locale>. Prefer printf while debugging this, since
// std::cout may itself use facets under the hood.
// _M_id() returns size_t, hence %zu.
printf("ctype id: %zu\n", std::ctype<char>::id._M_id());
```

This value must be identical across all binaries and libraries. If it differs, facet identity is broken.

Possible fix:

To avoid this mismatch:

1. Ensure all components (binaries and shared libraries) that use `std::regex`, `std::locale`, or locale facets are compiled with `-fPIC`.
2. Alternatively, statically link libstdc++ (if feasible), though this may introduce other complications.

In my case, compiling the binary with `-fPIC` resolved the crash by ensuring consistent symbol resolution. Without `-fPIC`, the binary can end up with its own copy of static data symbols like `std::ctype<char>::id`, while shared libraries expect to resolve those symbols dynamically via libstdc++.
This discrepancy causes the facet ID values (from `id._M_id()`) to differ between components, leading to failures like `std::bad_cast` when `std::use_facet` is used.

It’s worth emphasizing that all components in my case were using the same (dynamically linked) copy of libstdc++. The issue wasn’t caused by multiple libstdc++ instances, but rather by symbol duplication and inconsistency in initialization across non-PIC vs PIC-compiled code.

This workaround doesn’t address the root cause (the inconsistent facet ID assignments), but it ensures that all parts of the system share the same symbol and initialization state at runtime, effectively avoiding the crash.

Note: I wasn’t able to create a small standalone reproducer where facet IDs differ. The issue only manifested in a larger setup with real binaries and libraries. So while this solution is effective in practice, the underlying mismatch may be subtle and depend on specific linker behavior or initialization order.
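
For anyone who wants to repeat the runtime check from the Diagnosis section in each component, here is a minimal sketch of a helper. It is an illustration only: `FacetIdReport` and `report_ctype_id` are hypothetical names, not part of libstdc++. `_M_id()` is a libstdc++ internal, so that call is guarded with `__GLIBCXX__`. The helper also prints the address of `std::ctype<char>::id`, on the assumption that two components that see different addresses are looking at different copies of the symbol.

```cpp
#include <cstddef>
#include <cstdio>
#include <locale>

// Hypothetical diagnostic record; the names are illustrative and do not
// come from libstdc++ or from this bug report.
struct FacetIdReport {
    const void* address;    // where this component resolved std::ctype<char>::id
    std::size_t value;      // facet table index from _M_id() (0 if unavailable)
    bool        installed;  // does the classic locale report the facet as present?
};

// `static` gives every translation unit its own copy of the helper, so a call
// made from a shared library is not interposed by the executable's definition.
static FacetIdReport report_ctype_id(const char* component) {
    FacetIdReport r;
    r.address = static_cast<const void*>(&std::ctype<char>::id);
    r.value = 0;
#if defined(__GLIBCXX__)
    // _M_id() is a libstdc++ implementation detail; debugging use only.
    r.value = std::ctype<char>::id._M_id();
#endif
    r.installed = std::has_facet<std::ctype<char> >(std::locale::classic());
    // printf rather than std::cout, since iostreams use facets themselves.
    std::printf("[%s] &ctype<char>::id=%p _M_id()=%zu has_facet=%d\n",
                component, r.address, r.value, r.installed ? 1 : 0);
    return r;
}
```

To use it, compile the helper into the executable and into each shared library, then call it once from `main()` and once from an exported function of each library, with a different `component` tag each time. If the printed address or `_M_id()` value differs between components, that would be consistent with the duplicated-symbol situation described above. Identical output does not rule out other causes.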