This is an automated email from the ASF dual-hosted git repository. dlmarion pushed a commit to branch 2.1 in repository https://gitbox.apache.org/repos/asf/accumulo.git
The following commit(s) were added to refs/heads/2.1 by this push: new 33894e6997 Fixed race condition in TServerUtils exposed by TServerUtilsTest (#4413) 33894e6997 is described below commit 33894e69979afc70efca448ea31fb29ac73288f3 Author: Dave Marion <dlmar...@apache.org> AuthorDate: Fri Mar 22 08:14:21 2024 -0400 Fixed race condition in TServerUtils exposed by TServerUtilsTest (#4413) There are several tests in TServerUtilsTest that have the form: ``` TServer server = null; try { ServerAddress address = startServer(); server = address.getServer(); } finally { if (server != null) { server.stop(); } } ``` The TServerUtilsTest.startServer method calls TServerUtils.startServer which ends up creating a Thread that calls TServer.serve(). When TServer is a TThreadPoolServer, the serve method calls preServe first, which sets the internal boolean variable `stopped_` to false, and then calls execute which will loop while `stopped_` is false. In the case where the Thread created by TServerUtils.startServer is not started right away, then it's possible that the test method will call stop (setting `stopped_` to true) before the TServer.serve method calls preServe (setting `stopped_` back to false) resulting in the Thread being in an endless loop. This can be seen by running TServerTestUtils, where the output contains many lines like: ``` [server.TThreadPoolServer] WARN : Transport error occurred during acceptance of message org.apache.thrift.transport.TTransportException: No underlying server socket. at org.apache.thrift.transport.TServerSocket.accept(TServerSocket.java:113) ~[libthrift-0.17.0.jar:0.17.0] at org.apache.thrift.transport.TServerSocket.accept(TServerSocket.java:31) ~[libthrift-0.17.0.jar:0.17.0] at org.apache.thrift.server.TThreadPoolServer.execute(TThreadPoolServer.java:162) ~[libthrift-0.17.0.jar:0.17.0] at org.apache.thrift.server.TThreadPoolServer.serve(TThreadPoolServer.java:148) ~[libthrift-0.17.0.jar:0.17.0] at org.apache.accumulo.server.rpc.TServerUtils.lambda$startTServer$9(TServerUtils.java:654) ~[classes/:?] at org.apache.accumulo.core.trace.TraceWrappedRunnable.run(TraceWrappedRunnable.java:52) [accumulo-core-2.1.3-SNAPSHOT.jar:2.1.3-SNAPSHOT] at java.base/java.lang.Thread.run(Thread.java:840) [?:?] ``` Because the surefire configuration reuses JVM forks these Threads persist for the duration of the unit tests in server base and pollute every test output file after TServerUtilsTest is executed. The surefire forkCount is set to `1C`, so the volume of output in the logs is dependent on the number of forks. Co-authored-by: Keith Turner <ktur...@apache.org> --- .../main/java/org/apache/accumulo/server/rpc/TServerUtils.java | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/server/base/src/main/java/org/apache/accumulo/server/rpc/TServerUtils.java b/server/base/src/main/java/org/apache/accumulo/server/rpc/TServerUtils.java index 8ae472cf8b..9f37990926 100644 --- a/server/base/src/main/java/org/apache/accumulo/server/rpc/TServerUtils.java +++ b/server/base/src/main/java/org/apache/accumulo/server/rpc/TServerUtils.java @@ -48,6 +48,7 @@ import org.apache.accumulo.core.rpc.UGIAssumingTransportFactory; import org.apache.accumulo.core.util.Halt; import org.apache.accumulo.core.util.HostAndPort; import org.apache.accumulo.core.util.Pair; +import org.apache.accumulo.core.util.UtilWaitThread; import org.apache.accumulo.core.util.threads.ThreadPools; import org.apache.accumulo.core.util.threads.Threads; import org.apache.accumulo.server.ServerContext; @@ -71,6 +72,7 @@ import org.apache.thrift.transport.TTransportFactory; import org.slf4j.Logger; import org.slf4j.LoggerFactory; +import com.google.common.base.Preconditions; import com.google.common.primitives.Ints; /** @@ -657,6 +659,13 @@ public class TServerUtils { } }).start(); + while (!finalServer.isServing()) { + // Wait for the thread to start and for the TServer to start + // serving events + UtilWaitThread.sleep(10); + Preconditions.checkState(!finalServer.getShouldStop()); + } + // check for the special "bind to everything address" if (serverAddress.address.getHost().equals("0.0.0.0")) { // can't get the address from the bind, so we'll do our best to invent our hostname