uschindler commented on PR #12535: URL: https://github.com/apache/lucene/pull/12535#issuecomment-1704953411
Hi, I checked the code again: I think the timeout was added to make sure the server ends if one of the clients does not start at all or fails to start. The server is written fine, it waits for exactly `n` clients to start. If one of the client will crush before connecting the server will hang forever, so there should really be a timeout! If 30 seconds is too short, I think we have to dig into the code and figure out why the client did not start. Maybe there's startup logging missing or we do not print it out. So in my opinion, we should better figure out why the client did not start. Maybe becuase something ran out of file handles? So my ideas: - +1 to remove the SO_REUSEADDR - -1 to remove timeout completely as it will cause the test server to hang if any of the clients does not successfully start We should maybe change the Lucene Unit Test that uses the server class to better log the stdout/stderr of clients and have some more sanity checks. The 30 seconds timeout should be fine. If we figure out that in reality the clients are too slow to start, we can raise the timeout. But actually it is very unlikely that the very first client takes so long. As the server just accepts `n` clients it waits 30 seconds *per* client. Once the firts client started it is also unlikly that later ones take aditional 30 seconds. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org