If a straight-up change solves a constant headache, as you suggest,
Alberto, and as Blake concurs, that sounds like the way to go.
Why introduce a new option or property if the user will always prefer one
behavior over the other? (And from a docs perspective, who needs another
optional property, anyway?)

On Thu, Sep 17, 2020 at 10:32 AM Blake Bender <bbl...@vmware.com> wrote:

> Given that attempts to retrieve metadata after the C++ cache is closed are
> a constant headache for Geode Native development, I am generally in favor
> of anything that potentially reduces the number of times/places this
> happens.  If we've failed the handshake, it's very unlikely things will
> correct themselves without outside intervention, so this fix is probably
> goodness.  I'd go ahead and submit a PR when you think it's solid.
>
> Thanks,
>
> Blake
>
>
> On 9/17/20, 9:36 AM, "Dave Barnes" <dbar...@apache.org> wrote:
>
>     Alberto,
>     Are there cases in which one or two timeouts are followed by a
>     successful retry? Or does one timeout *always* end with more timeouts
>     and, ultimately, an IO error?
>     If timeouts can sometimes be followed by successful retries, and
>     re-trying is the current default behavior, then I agree that
>     introducing a setting that effectively eliminates re-tries should be
>     the developer's choice.
>     In that case, I suggest that the option should not be a low-level
>     choice of "handle the metadata in a way that eliminates retries" but
>     should be higher level, like "when attempting to connect, try only
>     once, instead of re-trying (the default behavior)."
>     -Dave
>
>     On Thu, Sep 17, 2020 at 7:42 AM Alberto Bustamante Reyes
>     <alberto.bustamante.re...@est.tech> wrote:
>
>     > Hi geode-dev,
>     >
>     > I have a question about the c++ client.
>     >
>     > Some months ago we merged GEODE-8231 to solve a problem we observed
>     > in which the native client kept trying to connect to a stopped
>     > server. The GEODE-8231 solution consists of removing the client
>     > metadata when an "IO error in handshake" exception is received. This
>     > fix solved most of our problems, but we have observed that sometimes,
>     > when a server is stopped, the client receives different errors and
>     > the "IO error in handshake" takes up to a minute to appear. During
>     > that time, the client keeps trying to connect to the offline server.
>     >
>     > As the error received during that time is "timeout in handshake", we
>     > have tested modifying the GEODE-8231 solution so that the client
>     > removes the metadata once a timeout error is received (here is a
>     > draft with the code:
>     > https://github.com/apache/geode-native/pull/651). With this change
>     > in place, the behavior is OK.
>     >
>     >
>     > But I would like to get your opinion about this change, because it
>     > means that a single timeout will cause the removal of the client
>     > metadata, which may not be the best solution. I thought about
>     > different alternatives:
>     >
>     > - Wait until a given number of timeouts in a row have been received
>     >   from the same server before removing the metadata
>     > - Make this "remove-metadata-after-timeout" behavior optional, so
>     >   that it can be configured if needed
>     >
>     > Since this change would make the C++ client behave differently from
>     > the Java client, making it an optional configuration seems more
>     > appropriate, so that the default C++ client behavior stays the same
>     > as the Java client's.
>     >
>     > BR/
>     >
>     > Alberto B.
>     >
>
>
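
For reference, the first alternative in Alberto's list (waiting for a given number of timeouts in a row from the same server before removing the metadata) could be sketched roughly as below. This is only an illustration under stated assumptions: HandshakeTimeoutTracker, recordTimeout and recordSuccess are hypothetical names, not part of geode-native, and the sketch does not reflect how the draft PR is implemented. The caller would still be responsible for the actual metadata removal.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical helper (not a geode-native class): counts consecutive
// "timeout in handshake" errors per server endpoint.
class HandshakeTimeoutTracker {
 public:
  // A threshold of 1 reproduces "remove metadata on the first timeout".
  // A threshold of 0 is treated as 1.
  explicit HandshakeTimeoutTracker(std::size_t threshold)
      : threshold_(threshold == 0 ? 1 : threshold) {}

  // Records one timeout for `endpoint`. Returns true once the number of
  // consecutive timeouts reaches the threshold, meaning the caller should
  // now remove the cached metadata for that server.
  bool recordTimeout(const std::string& endpoint) {
    const std::size_t count = ++consecutive_[endpoint];
    if (count >= threshold_) {
      consecutive_.erase(endpoint);  // start fresh after the removal
      return true;
    }
    return false;
  }

  // A successful handshake breaks the run of timeouts for `endpoint`.
  void recordSuccess(const std::string& endpoint) {
    consecutive_.erase(endpoint);
  }

 private:
  std::size_t threshold_;
  std::unordered_map<std::string, std::size_t> consecutive_;
};
```

With a threshold of 1 this behaves like the change proposed in the draft, so the second alternative (making the behavior optional) could in principle be expressed as a configurable threshold; that mapping is an assumption of this sketch, not something proposed in the thread.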
