Alberto,

Are there cases in which one or two timeouts are followed by a successful retry? Or does one timeout *always* end with more timeouts and, ultimately, an IO error?

If timeouts can sometimes be followed by successful retries, and retrying is the current default behavior, then I agree that introducing a setting that effectively eliminates retries should be the developer's choice. In that case, I suggest that the option should not be a low-level choice of "handle the metadata in a way that eliminates retries" but should be higher level, like "when attempting to connect, try only once, instead of retrying (the default behavior)."

-Dave

On Thu, Sep 17, 2020 at 7:42 AM Alberto Bustamante Reyes
<alberto.bustamante.re...@est.tech> wrote:

> Hi geode-dev,
>
> I have a question about the C++ client.
>
> Some months ago we merged GEODE-8231 to solve a problem we observed in
> which the native client kept trying to connect to a stopped server. The
> GEODE-8231 solution consists of removing the client metadata when an "IO
> error in handshake" exception is received. This fix solved most of our
> problems, but we have observed that sometimes, when a server is stopped,
> the errors received in the client are not the same, and the "IO error in
> handshake" takes up to a minute to appear. During that time, the client
> is still trying to connect to the offline server.
>
> As the error received during that time is "timeout in handshake", we have
> tested modifying the solution of GEODE-8231 to make the client remove the
> metadata once a timeout error is received (here is a draft with the code:
> https://github.com/apache/geode-native/pull/651). With this change in
> place, the behavior is OK.
>
> But I would like to check your opinion about this change, because it means
> that a single timeout will cause the removal of the client metadata,
> which may not be the best solution. I thought about different
> alternatives:
>
> - Wait until a given number of timeouts in a row have been received from
>   the same server before removing the metadata
> - Make this "remove-metadata-after-timeout" behavior optional, so that it
>   can be configured if needed
>
> As this will misalign the behavior of the Java and C++ clients, making
> this an optional configuration would be more appropriate, to keep the
> default C++ client behavior the same as the Java client.
>
> BR/
>
> Alberto B.
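
For illustration only, the two alternatives in the quoted message (a threshold of consecutive timeouts, and making the behavior optional) could be combined roughly as below. This is a minimal sketch, not code from geode-native or from the draft PR: the class `TimeoutTracker`, its methods, and the convention that a threshold of 0 disables the feature are all hypothetical names and assumptions.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical helper, NOT part of geode-native: counts consecutive
// handshake timeouts per server and tells the caller when the metadata
// for that server should be removed.
class TimeoutTracker {
 public:
  // Assumption: threshold == 0 means "never remove metadata on timeout",
  // which keeps the existing default behavior; threshold == 1 removes
  // the metadata on the first timeout.
  explicit TimeoutTracker(std::size_t threshold) : threshold_(threshold) {}

  // Record a handshake timeout for `server`. Returns true when the
  // configured number of timeouts in a row has been reached.
  bool onTimeout(const std::string& server) {
    if (threshold_ == 0) return false;  // feature disabled
    std::size_t& count = consecutive_[server];
    ++count;
    if (count >= threshold_) {
      consecutive_.erase(server);  // start counting afresh next time
      return true;
    }
    return false;
  }

  // A successful connection breaks the run of timeouts for `server`.
  void onSuccess(const std::string& server) { consecutive_.erase(server); }

 private:
  std::size_t threshold_;
  std::unordered_map<std::string, std::size_t> consecutive_;
};
```

With a threshold of 2, for example, a single timeout followed by a successful retry would leave the metadata untouched, which is the case Dave asks about above. How such a threshold would actually be exposed in the client configuration is left open here.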