On Mar 3, 2023, Jonathan Wakely <[email protected]> wrote:
> On Fri, 3 Mar 2023 at 09:33, Jonathan Wakely <[email protected]> wrote:
>> Jakub previously suggested doing this for PR 61841, which was a similar
>> problem with pthread_create:
>>
>> __asm ("" : : "r" (&pthread_create)); would not be optimized away.
>>
>>
>> That would avoid the multiple copies.
Not really. There would be multiple copies of the code that loads
pthread_create's address. And we don't really need the address; a
single never-executed call would do. I've explored these possibilities
a bit, and here's what I've come up with: a private static member
function, emitted in units that instantiate the thread template
ctor, whose address is passed to _M_start_thread. Since it's never
actually called, we don't really need the hacks in some of the
alternatives, which I left in place mainly for your enjoyment.
They all work equally well and are just as efficient per-instantiation
at runtime, with slightly different space and loading overheads, but
the last one, the one that is enabled, is my favorite: only PLT
relocations, which we'd likely get anyway, no full-address resolution,
and, because of the type cast, call sequences that are as short as
possible while still producing a relocation with a strong reference
that pulls the symbol in when linking.
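
The enabled variant can be sketched outside libstdc++ roughly as
follows. This is only an illustration, assuming a POSIX system with
<pthread.h>; the names thread_deps_never_run, start_thread_sketch and
construct_thread_sketch are hypothetical stand-ins for the member
function, _M_start_thread and the template ctor, not the actual
library code.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical stand-in for the never-run dependency function.
// The casts to void(*)() let each call be emitted without setting up
// any arguments; the calls exist only so that the object file carries
// strong references to both symbols. Executing this function would be
// undefined behavior, so nothing may ever call it.
static void thread_deps_never_run()
{
  reinterpret_cast<void (*)()>(&pthread_create)();
  reinterpret_cast<void (*)()>(&pthread_join)();
}

// Hypothetical stand-in for _M_start_thread: it receives the address
// of the dependency function but never calls through it.
bool start_thread_sketch(void (*depend)())
{
  assert(depend != nullptr);
  return true;
}

// Hypothetical stand-in for the std::thread template constructor:
// taking the function's address is what forces it to be emitted in
// this translation unit.
bool construct_thread_sketch()
{
  return start_thread_sketch(&thread_deps_never_run);
}
```

Calling construct_thread_sketch() only passes the address along; the
two casted calls stay in the object file as undefined strong symbols
for the linker to resolve.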
As a bonus, I put in (at the last minute, after my test runs) something
to keep even LTO happy: asm statements that prevent depend from being
optimized out in _M_start_thread. In non-LTO builds, the impact should
be virtually zero.
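
As a rough, stand-alone sketch of that asm trick (assuming GCC or a
compatible compiler with extended asm; dependency_never_run,
start_sketch, calls_seen and demo are hypothetical names used only for
illustration):

```cpp
#include <cassert>

// Hypothetical counter, only so the sketch has an observable effect.
static int calls_seen = 0;

// Hypothetical placeholder; in the patch this is where the strong
// references to pthread_create and pthread_join live.
static void dependency_never_run() {}

__attribute__((noinline))
void start_sketch(void (*depend)())
{
  // An empty asm with the pointer as an input operand. An asm with no
  // output operands is implicitly volatile, so it is not deleted, and
  // the compiler has to make depend available in a register or in
  // memory, which keeps the otherwise unused parameter live.
  asm ("" : : "rm" (depend));
  ++calls_seen;
}

// Calls start_sketch once and reports whether the call was counted.
bool demo()
{
  int before = calls_seen;
  start_sketch(&dependency_never_run);
  assert(calls_seen == before + 1);
  return calls_seen == before + 1;
}
```

The asm emits no instructions of its own; at most the compiler has to
place the pointer value in a register or a stack slot.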
How does this look? (minus the #if 0/#elif 0/.../#else)
link pthread_join from std::thread ctor
Like pthread_create, pthread_join may fail to be statically linked in
absent strong uses, so add to user code strong references to both when
std::thread objects are created.
for libstdc++-v3/ChangeLog
* include/bits/std_thread.h (thread::_M_thread_deps_never_run): New
static inline member function.
(std::thread template ctor): Pass it to _M_start_thread.
* src/c++11/thread.cc (thread::_M_start_thread): Name depend
parameter, force it live on entry.
---
libstdc++-v3/include/bits/std_thread.h | 51 ++++++++++++++++++++++++++++----
libstdc++-v3/src/c++11/thread.cc | 10 +++++-
2 files changed, 52 insertions(+), 9 deletions(-)
diff --git a/libstdc++-v3/include/bits/std_thread.h b/libstdc++-v3/include/bits/std_thread.h
index adbd3928ff783..3ffd2a823a698 100644
--- a/libstdc++-v3/include/bits/std_thread.h
+++ b/libstdc++-v3/include/bits/std_thread.h
@@ -132,6 +132,49 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
thread() noexcept = default;
#ifdef _GLIBCXX_HAS_GTHREADS
+ private:
+ // This adds to user code that creates std::thread objects (because
+ // it is called by the template ctor below) strong references to
+ // pthread_create and pthread_join, which ensures they are both
+ // linked in even during static linking. We can't depend on
+ // gthread calls to bring them in, because those may use weak
+ // references.
+ static void
+ _M_thread_deps_never_run() {
+#ifdef GTHR_ACTIVE_PROXY
+#if 0
+ static auto const __attribute__ ((__used__)) _M_create = pthread_create;
+ static auto const __attribute__ ((__used__)) _M_join = pthread_join;
+#elif 0
+ pthread_t thr;
+ pthread_create (&thr, nullptr, nullptr, nullptr);
+ pthread_join (thr, nullptr);
+#elif 0
+ asm goto ("" : : : : _M_never_run);
+ if (0)
+ {
+ _M_never_run:
+ pthread_t thr;
+ pthread_create (&thr, nullptr, nullptr, nullptr);
+ pthread_join (thr, nullptr);
+ }
+#elif 0
+ bool _M_skip_always = false;
+ asm ("" : "+rm" (_M_skip_always));
+ if (__builtin_expect (_M_skip_always, false))
+ {
+ pthread_t thr;
+ pthread_create (&thr, nullptr, nullptr, nullptr);
+ pthread_join (thr, nullptr);
+ }
+#else
+ reinterpret_cast<void (*)(void)>(&pthread_create)();
+ reinterpret_cast<void (*)(void)>(&pthread_join)();
+#endif
+#endif
+ }
+
+ public:
template<typename _Callable, typename... _Args,
typename = _Require<__not_same<_Callable>>>
explicit
@@ -142,18 +185,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
"std::thread arguments must be invocable after conversion to rvalues"
);
-#ifdef GTHR_ACTIVE_PROXY
- // Create a reference to pthread_create, not just the gthr weak symbol.
- auto __depend = reinterpret_cast<void(*)()>(&pthread_create);
-#else
- auto __depend = nullptr;
-#endif
using _Wrapper = _Call_wrapper<_Callable, _Args...>;
// Create a call wrapper with DECAY_COPY(__f) as its target object
// and DECAY_COPY(__args)... as its bound argument entities.
_M_start_thread(_State_ptr(new _State_impl<_Wrapper>(
std::forward<_Callable>(__f), std::forward<_Args>(__args)...)),
- __depend);
+ _M_thread_deps_never_run);
}
#endif // _GLIBCXX_HAS_GTHREADS
diff --git a/libstdc++-v3/src/c++11/thread.cc b/libstdc++-v3/src/c++11/thread.cc
index 2d5ffaf678e97..c91f7b02e1f3f 100644
--- a/libstdc++-v3/src/c++11/thread.cc
+++ b/libstdc++-v3/src/c++11/thread.cc
@@ -154,8 +154,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
}
void
- thread::_M_start_thread(_State_ptr state, void (*)())
+ thread::_M_start_thread(_State_ptr state, void (*depend)())
{
+ // Make sure it's not optimized out, not even with LTO.
+ asm ("" : : "rm" (depend));
+
if (!__gthread_active_p())
{
#if __cpp_exceptions
@@ -190,8 +193,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
}
void
- thread::_M_start_thread(__shared_base_type __b, void (*)())
+ thread::_M_start_thread(__shared_base_type __b, void (*depend)())
{
+ // Make sure it's not optimized out, not even with LTO.
+ asm ("" : : "rm" (depend));
+
auto ptr = __b.get();
// Create a reference cycle that will be broken in the new thread.
ptr->_M_this_ptr = std::move(__b);
--
Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/
Free Software Activist GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts. Ask me about <https://stallmansupport.org>