Alexey Serbin has uploaded this change for review. (
http://gerrit.cloudera.org:8080/23229
Change subject: [mini-cluster] explicitly shut down Messenger
......................................................................
[mini-cluster] explicitly shut down Messenger
Since many components keep a reference to Messenger objects through
shared pointers, it's not trivial to track their lifecycle given the
resulting 'shared ownership' semantics. Because of that, it makes
sense to explicitly shut down ExternalMinicluster's messenger in the
ExternalMiniCluster::Shutdown() method to prevent unexpected activity
while winding down the test harness.
Below is a redacted snippet from the log produced by the
MasterReplicationAndRpcSizeLimitTest.TabletReports scenario, running
master_replication-itest built with TSAN. The TSAN detected a data
race between the main thread that was destructing static objects
in the global namespace after exiting the 'main()' function and RPC
connection negotiation task still running on a Messenger reactor thread.
Since all Kudu servers' messengers are explicitly shut down in
KuduServer::Shutdown() and ServerBase::ShutdownImpl() that must
had been called already because the main thread was out of the 'main()'
function, and MasterReplicationAndRpcSizeLimitTest::client_ KuduClient
instance isn't active in the TabletReports scenario and TestWorkload
has completed successfully with no pending RPCs running, the only
Messenger instance that might be running such an activity seemed to be
ExternalMiniCluster::messenger_. I'm not sure what exact RPC
it was running, but I guess it might be ExternalDaemon::CheckForLeaks()
that's run only in TSAN builds.
==================
WARNING: ThreadSanitizer: data race (pid=16235)
Write of size 8 at 0x7b0800003cc8 by main thread:
#0 operator delete(void*) .../tsan_new_delete.cpp:126
(master_replication-itest+0x345c79)
...
#9 std::__1::set<kudu::rpc::RpcFeatureFlag, ...>::~set() ...
(libkrpc.so+0x133129)
#10 cxa_at_exit_wrapper(void*) .../tsan_interceptors_posix.cpp:394
(master_replication-itest+0x2c59ff)
Previous read of size 8 at 0x7b0800003cc8 by thread T361:
#1 std::__1::__tree_const_iterator<kudu::rpc::RpcFeatureFlag,
...>::operator++() .../__tree:929:11 (libkrpc.so+0x133df2)
...
#5 kudu::rpc::ClientNegotiation::SendNegotiate()
.../client_negotiation.cc:327:20 (libkrpc.so+0x12b3b6)
...
#9 kudu::rpc::ReactorThread::StartConnectionNegotiation()...
.../reactor.cc:631:3 (libkrpc.so+0x19f46c)
...
#17 kudu::ThreadPool::CreateThread()... .../threadpool.cc:815:48
(libkudu_util.so+0x461921)
...
As if synchronized via sleep:
...
#2 kudu::SleepFor(kudu::MonoDelta const&) ...
#3 kudu::Subprocess::KillAndWait(int) .../subprocess.cc:692:7
(libkudu_util.so+0x434720)
#4 kudu::clock::MiniChronyd::Stop() .../mini_chronyd.cc:251:16
(libmini_chronyd.so+0x15170)
#5 kudu::clock::MiniChronyd::~MiniChronyd() .../mini_chronyd.cc:166:5
(libmini_chronyd.so+0x14e6e)
...
#17 kudu::cluster::ExternalMiniCluster::~ExternalMiniCluster()
.../external_mini_cluster.cc:173:45 (libmini_cluster.so+0x846b9)
...
#22
kudu::master::MasterReplicationAndRpcSizeLimitTest_TabletReports_Test::~MasterReplicationAndRpcSizeLimitTest_TabletReports_Test()
.../master_replication-itest.cc:536:1 (master_replication-itest+0x351169)
#23 testing::Test::DeleteSelf_() ...
...
#31 testing::UnitTest::Run() .../gtest.cc:5444:10
(libgtest.so.1.12.1+0x58dcc)
#32 RUN_ALL_TESTS() .../gtest.h:2293:73
(master_replication-itest+0x36896b)
#33 main .../test_main.cc:156:10 (master_replication-itest+0x3675ac)
Thread T361 'client-negotiat' (tid=22269, finished) created by thread T329
at:
...
#3 kudu::ThreadPool::CreateThread(...) ...
#4 kudu::ThreadPool::DoSubmit(...) ...
#5 kudu::ThreadPool::Submit(...) ...
#6 kudu::rpc::ReactorThread::StartConnectionNegotiation() ...
#7 kudu::rpc::ReactorThread::FindOrStartConnection(...) ...
#8 kudu::rpc::ReactorThread::AssignOutboundCall(...) ...
...
#13 kudu::rpc::ReactorThread::InvokePendingCb(...) ...
#14 ev_run ...
#15 ev::loop_ref::run(int) ...
#16 kudu::rpc::ReactorThread::RunThread() .../reactor.cc:510:9
(libkrpc.so+0x19b1b5)
...
SUMMARY: ThreadSanitizer: data race .../tsan_new_delete.cpp:126 in operator
delete(void*)
==================
Change-Id: I4712afe598ab63efaf56729d47a07ce4247a80db
---
M src/kudu/mini-cluster/external_mini_cluster.cc
1 file changed, 10 insertions(+), 0 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/29/23229/1
--
To view, visit http://gerrit.cloudera.org:8080/23229
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I4712afe598ab63efaf56729d47a07ce4247a80db
Gerrit-Change-Number: 23229
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <[email protected]>