Re: [DISCUSS] Improvements on client function execution API

Jacob Barrett Wed, 11 Sep 2019 23:15:40 -0700

+1

I echo Dan’s comments as well.


Thanks for tackling this.

-jake


> On Sep 11, 2019, at 2:36 PM, Dan Smith <[email protected]> wrote:
> 
> +1 - Ok, I think I've come around to option (a). We can go head and add a
> new execute(timeout, TimeUnit) method to the java API that is blocking. We
> can leave the existing execute() method alone, except for documenting what
> it is doing.
> 
> I would like implement execute(timeout,  TimeUnit) on the server side as
> well. Since this Execution class is shared between both client and server
> APIs, it would be unfortunate to have a method on Execution that simply
> doesn't work on the server side.
> 
> -Dan
> 
> 
>> On Thu, Sep 5, 2019 at 9:25 AM Alberto Gomez <[email protected]> wrote:
>> 
>> Hi all,
>> 
>> First of all, thanks a lot Dan and Jacob for your feedback.
>> 
>> As we are getting close to the deadline I am adding here some conclusions
>> and a refined proposal in order to get some more feedback and if possible
>> some voting on the two alternatives proposed (or any other in between if
>> you feel any of them is lacking something).
>> 
>> I also add some draft code to try to clarify a bit the more complex of the
>> alternatives.
>> 
>> 
>> Proposal summary (needs a decision on which option to implement):
>> 
>> -------------------------------------------------------------------------------------------
>> 
>> In order to make the API more coherent two alternatives are proposed:
>> 
>> a) Remove the timeout from the ResultCollector::getResult() / document
>> that the timeout has no effect, taking into account that
>> Execution::execute() is always blocking.
>> Additionally we could add the timeout parameter to the
>> Execution::execute() method of the Java API in order to align it with the
>> native client APIs. This timeout would not be the read timeout on the
>> socket but a timeout for the execution of the operation.
>> 
>> b) Change the implementation of the Execution::execute() method without
>> timeout to be non-blocking on both the Java and native APIs. This change
>> has backward compatibility implications, would probably bring some
>> performance decrease and could pose some difficulties in the implementation
>> on the C++ side (in the  handling of timed out operations that hold
>> resources).
>> 
>> 
>> The first option (a) is less risky and does not have impacts regarding
>> backward compatibility and performance.
>> 
>> The second one (b) is the preferred alternative in terms of the expected
>> behavior from the users of the API. This option is more complex to
>> implement and as mentioned above has performance and backward compatibility
>> issues not easy to be solved.
>> 
>> Following is a draft version of the implementation of b) on the Java
>> client:
>> 
>> https://github.com/Nordix/geode/commit/507a795e34c6083c129bda7e976b9223d1a893da
>> 
>> Following is a draft version of the implementation of b) on the C++ native
>> client:
>> 
>> https://github.com/apache/geode-native/commit/a03a56f229bb8d75ee71044cf6196df07f43150d
>> 
>> Note that the above implementation of b) in the C++ client implies that
>> the Execution object returned by the FunctionService cannot be destroyed
>> until the thread executing the function asynchronously has finished. If the
>> function times out, the Execution object must be kept until the thread
>> finishes.
>> 
>> 
>> Other considerations
>> -------------------------
>> 
>> * Currently, in the function execution Java client there is not a
>> possibility to set a timeout for the execution of functions. The closest to
>> this is the read timeout that may be set globally for function executions
>> but this is not really a timeout for operations.
>> 
>> * Even if the API for function execution is the same on clients and
>> servers, the implementation is different between them. On the clients, the
>> execute() methods are blocking while on the servers it is non-blocking and
>> the invoker of the function blocks on the getResult() method of the
>> ResultCollector returned by the execute() method.
>> Even if having both blocking and non-blocking implementation of execute()
>> in both clients and servers sounds desirable from the point of view of
>> orthogonality, this  could bring complications in terms of backward
>> compatibility. Besides, a need for a blocking version of function execution
>> has not been found.
>> 
>> -Alberto G.
>> 
>> On 29/8/19 16:48, Alberto Gomez wrote:
>> 
>> Sorry, some corrections on my comments after revisiting the native
>> client code.
>> 
>> When I said that the timeout used in the execution() method (by means of
>> a system property) was to set a read timeout on the socket, I was only
>> talking about the Java client. In the case of the native clients, the
>> timeout set in the execute() method is not translated into a socket
>> timeout but it is the time to wait for the operation to complete, i.e.,
>> to get all the results back.
>> 
>> Things being so, I would change my proposal to:
>> 
>> - Change the implementation of execute() on both Java and native clients
>> to be non-blocking (having the blocking/non-blocking behavior
>> configurable in the release this is introduced and leaving only the
>> non-blocking behavior in the next release).
>> 
>> - Either remove the execute() with timeout methods in the native clients
>> (with a deprecation release) or implement the execute(timeout) method in
>> the Java client to be blocking (to work as the native client does
>> today). In case the method times out, the connection will not be closed.
>> If the operation times out due to the socket timeout (system property),
>> then the connection will be closed as it is now done in the Java client.
>> 
>> - Do not implement the blocking execute(timeout) method on the server
>> and leave the current execute() implementation on the server as it is
>> (non-blocking)
>> 
>> Does this make sense?
>> 
>> -Alberto
>> 
>> On 29/8/19 12:56, Alberto Gómez wrote:
>> 
>> 
>> Hi Dan,
>> 
>> Discussing these matters by e-mail is getting tricky.
>> 
>> Let's see if I understand you correctly and also if I am being clear
>> enough.
>> 
>> Please, see my comments inline.
>> 
>> On 29/8/19 0:49, Dan Smith wrote:
>> 
>> 
>> Sorry for the slow response, I've been trying to decide what I think
>> is the
>> right approach here.
>> 
>> For (1) - conceptually, I don't have a problem with having both blocking
>> and non blocking methods on Execution. So adding blocking versions of
>> execute() with a timeout seems ok. But I do think if we add them we
>> need to
>> implement them on both the client and the server to behave the same way.
>> That shouldn't be too hard on the server since execute(timeout) can just
>> call getResult(timeout) internally.
>> 
>> 
>> 
>> We have to take into account that, currently, the timeout in execute()
>> is not the same thing as the timeout in getResult().
>> 
>> On the one hand, the timeout set in execute() (via System property in
>> the Java client, and with a parameter in the native client) sets a
>> readtimeout on the socket which just means that if nothing is read
>> from the socket after sending the request to the server for the given
>> timeout, the corresponding exception will be thrown. It looks to me
>> more like a protection against possible communication failures rather
>> than a mechanism to decide if results took too long to be provided. So
>> I would not link the presence of the timeout parameter in the method
>> to the nature of the method (blocking or non-blocking). I think we
>> could have this read timeout set and at the same time keep that method
>> as non-blocking.
>> 
>> On the other hand, the timeout set in getResult() is a timeout to wait
>> for all the results to be received from the moment the method is invoked.
>> 
>> Therefore, I would not implement the blocking version of execute() on
>> the server by calling getResult() at the end.
>> 
>> Apart from that, I doubt if it would make sense to set this
>> readTimeout in the execute() methods from servers given that the
>> communication is very different to the one done with clients. I also
>> doubt that anyone would be interested in the blocking version of
>> execute() on the server.
>> 
>> My proposal was to add the readtimeout to the execute() methods in the
>> Java client in order to align the Java client and the native client.
>> This change would be independent to the decision we make regarding the
>> change of execute() to non-blocking. To achieve this alignment,
>> alternatively, we could remove the timeout parameter in execute() from
>> the native clients and have it as a global property for the client to
>> be set by whatever mechanism available as it is done in the Java
>> client today.
>> 
>> Were you proposing that the execute() methods with timeout were
>> blocking and the ones without timeout non-blocking? Not sure if this
>> is something you meant.
>> 
>> 
>> 
>> 
>> For (2) - Although I think the original authors of this API probably did
>> intend for execute() to be non-blocking, the fact is that it does
>> block on
>> the client and most users are probably calling execute from a client.
>> So I
>> do agree we probably shouldn't change the behavior at this point.
>> Perhaps
>> we can just clearly document the current behavior of execute() as
>> part of
>> adding these new methods. Going forward we can add new methods to
>> Execution
>> that are clearly non-blocking (submit?, invoke?) and implement them
>> consistently on *both* the client in the server, but that doesn't
>> have to
>> be in the scope of this proposal.
>> 
>> 
>> 
>> The problem I see with adding new non-blocking methods (new/submit...)
>> is that it would be a solution for the current users of the client
>> regarding backwards compatibility. But, on the server side, we would
>> have to move the current logic of execute() which is non-blocking to
>> the new methods and change the current execute() behavior to blocking.
>> We would not impact the users of the client but we would impact the
>> users of the server.
>> 
>> Again, I would propose to aim at:
>> 
>> a) either leave execute() on the client as blocking
>> 
>> b) or change execute() on the client to be non-blocking but on the
>> Geode release this is introduced, have it configurable. The default
>> behavior would be blocking (deprecated behavior) but could be set to
>> non-blocking with a system property. On the next release, the blocking
>> behavior would be removed.
>> 
>> - Alberto G.
>> 
>> 
>> 
>> -Dan
>> 
>> On Fri, Aug 23, 2019 at 4:28 AM Alberto Gomez <[email protected]
>>> <mailto:[email protected]>
>> wrote:
>> 
>> 
>> 
>> Hi Jake,
>> 
>> Please, see my answers below.
>> 
>> On 22/8/19 21:16, Jacob Barrett wrote:
>> 
>> 
>> On Aug 21, 2019, at 8:49 AM, Alberto Gomez <[email protected]
>>> <mailto:[email protected]>
>> 
>> 
>> wrote:
>> 
>> 
>> 2. Timeout in ResultCollector::getResult() and Execution::execute()
>> 
>> 
>> blocking
>> 
>> 
>> Regarding the timeout in the ResultCollector::getResult() method
>> 
>> 
>> problem and the blocking/non-blocking confusion for
>> Execution::execute()
>> two alternatives are considered:
>> 
>> 
>> a) Remove the possibility of setting a timeout on the
>> 
>> 
>> ResultCollector::getResult() method on the client side as with the
>> current
>> client implementation it is useless. This could be done by removing the
>> method with the timeout parameter from the public API.
>> 
>> 
>> It would be advisable to make explicit in the documentation that the
>> 
>> 
>> getResult() method does not wait for results to arrive as that
>> should have
>> already been done in the Execution::execute() invocation.
>> 
>> 
>> This alternative is very simple and would keep things pretty much as
>> 
>> 
>> they are today.
>> 
>> 
>> To be honest I think approach would go against what a user “thinks” is
>> 
>> 
>> going on. Given that there hasn’t been a timeout on execute I
>> assumed it
>> was asynchronous and that the getResult blocked until timeout or
>> results
>> arrived. Typically these two calls were done back to back.
>> 
>> You are right if you look at the Java client. But if you look at the
>> native clients, the timeout is there, both in the C++ and the C# cases
>> which would indicate that it is a blocking call.
>> 
>> See
>> 
>> 
>> https://geode.apache.org/releases/latest/cppdocs/a00725.html#aa918a5e193745950e12ca4feb9c5d776
>> 
>> and
>> 
>> 
>> https://geode.apache.org/releases/latest/dotnetdocs/a00882.html#ae0a814049482ca424f89c13ab1099c3d
>> 
>> 
>> 
>> b) Transform the Execution::execute() method on the client side
>> into a
>> 
>> 
>> non-blocking method.
>> 
>> 
>> This alternative is more complex and requires changes in all the
>> 
>> 
>> clients. Apart from that it has implications on the public client
>> API it
>> requires moving the exceptions offered currently by the
>> Execution::execute() method to the ResultCollector::getResult() and new
>> threads will have to be managed.
>> 
>> 
>> I think this is more in line with what users expect is going on
>> based on
>> 
>> 
>> the current API, I know I have. If were are going to make any change I
>> think this is the one. I don’t think the behavior change is a
>> problem since
>> it's what is expected to be happening anyway.
>> 
>> 
>> An outline of a possible implementation for option b) would be:
>> 
>>   *   Instead of invoking the ServerRegionProxy::executeFunction()
>> 
>> 
>> directly as it is done today, create a Future that invokes this
>> method and
>> returns the resultCollector passed as parameter.
>> 
>> 
>> Do you really think we need to introduce Futures here into the API? I
>> 
>> 
>> feel like the ResultCollector acts as the Future. I don’t think any
>> change
>> needs to be made to the API in this regard. The ResultCollector
>> implementation would just simply block as implied by the api for the
>> timeout period. I would change the method to have units though and
>> deprecate the method without units.
>> 
>> I did not mean to introduce Futures in the API. My idea was to use Java
>> Futures internally so that the ResultCollector returned by the
>> getResult() method would wrap the Java Future with the ResultCollector
>> that would actually hold the result.
>> 
>> An alternative would be to leave the logic of blocking to each
>> implementation ResultCollector. In the case of the
>> DefaultResultCollector, we could use a CountDownLatch that would be
>> decremented when endResults() is called and that would make getResult()
>> block by using CountDown.await(...).
>> 
>> The advantage of using Futures internally is that the blocking logic
>> would not have to be implemented on every ResultCollector
>> implementation.
>> 
>> 
>> 
>> -Jake
>> 
>> 
>> 
>> 
>> - Alberto
>> 
>> 
>>

Re: [DISCUSS] Improvements on client function execution API

Reply via email to