[ 
https://issues.apache.org/jira/browse/IGNITE-28395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexandr Shapkin reassigned IGNITE-28395:
-----------------------------------------

    Assignee: Denis Chudov

> Lease updater accumulates concurrent in-flight invocations causing constant 
> CAS failures
> ----------------------------------------------------------------------------------------
>
>                 Key: IGNITE-28395
>                 URL: https://issues.apache.org/jira/browse/IGNITE-28395
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Denis Chudov
>            Assignee: Denis Chudov
>            Priority: Major
>              Labels: ignite-3
>
> The Lease Updater fires an invoke to Meta storage asynchronously every 500 ms 
> without waiting for the previous one to complete. This produces multiple 
> concurrent invocations with the same expected lease state: only one wins the 
> CAS, and the rest fail with
> {code:java}
> Lease update invocation failed because of outdated lease data on this 
> node{code}
> As a result, roughly once per minute the lease expires before renewal.
> Simply reading fresh data from storage before each invoke does not help: 
> previous invocations are already in-flight and will complete after the read, 
> making the freshly-read state outdated by the time the new invoke reaches 
> storage.
> *Fix*
> Track the in-flight invoke as a future. On each tick, if the previous future 
> is not complete, block with `future.get(timeout)` before reading from the 
> lease tracker and firing the next invoke. This guarantees at most one 
> in-flight invoke at any time and that the lease state is read only after the 
> previous update has landed. The timeout should be around leaseInterval/2: 
> after that, the leases will most likely expire anyway.
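
The gating described above could look roughly like the following sketch. It is not the actual Ignite code: the class name `SingleFlightUpdater`, the `Supplier` standing in for the Meta storage invoke, and the decision to skip a tick when the wait times out are all assumptions made for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical sketch: keeps at most one invoke in flight at a time. */
final class SingleFlightUpdater {
    /** Assumed to be leaseInterval / 2, as the description suggests. */
    private final long waitTimeoutMillis;

    /** Stand-in for "read the lease state and fire the invoke" (not a real Ignite API). */
    private final Supplier<CompletableFuture<Boolean>> invoke;

    /** The previous invoke; starts out completed so the first tick fires immediately. */
    private CompletableFuture<Boolean> inFlight = CompletableFuture.completedFuture(true);

    SingleFlightUpdater(long leaseIntervalMillis, Supplier<CompletableFuture<Boolean>> invoke) {
        this.waitTimeoutMillis = leaseIntervalMillis / 2;
        this.invoke = invoke;
    }

    /** One updater tick. Returns true if a new invoke was fired. */
    synchronized boolean tick() {
        if (!inFlight.isDone()) {
            try {
                // Block until the previous invoke lands, but no longer than the timeout.
                inFlight.get(waitTimeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // Assumption: skip this tick instead of adding a second in-flight invoke.
                return false;
            } catch (ExecutionException e) {
                // The previous invoke failed; fall through and fire a new one.
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        // Only now is the lease state read and the next invoke fired.
        inFlight = invoke.get();
        return true;
    }
}
```

Calling `tick()` from the existing 500 ms scheduler would keep the cadence unchanged when invokes are fast, and serialize them when they are slow.
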
> Also, there may be a lag between future completion and the lease map update 
> in the lease tracker, so the lease map may still be stale. To avoid this, a 
> successful invoke can return the leases it has written. If the invoke fails, 
> the map from the lease tracker should be used.
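
As a rough illustration of the last point, the choice of lease source for the next invoke might be sketched as below. The names `LeaseSource`, `InvokeResult` and `nextExpectedLeases` are hypothetical, and leases are simplified to a map from a group name to an expiration timestamp. The real Ignite types differ.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch: picks the lease state to use as "expected" in the next invoke. */
final class LeaseSource {
    /** Assumed result shape: a success flag plus the leases the invoke wrote. */
    static final class InvokeResult {
        final boolean success;
        final Map<String, Long> writtenLeases;

        InvokeResult(boolean success, Map<String, Long> writtenLeases) {
            this.success = success;
            this.writtenLeases = writtenLeases;
        }
    }

    /**
     * Returns the leases written by the previous invoke if it completed successfully;
     * otherwise falls back to the (possibly stale) map from the lease tracker.
     */
    static Map<String, Long> nextExpectedLeases(
            CompletableFuture<InvokeResult> previous,
            Map<String, Long> trackerLeases) {
        if (previous.isDone() && !previous.isCompletedExceptionally()) {
            InvokeResult result = previous.join(); // does not block: the future is done
            if (result.success) {
                return result.writtenLeases;
            }
        }
        // Invoke failed, threw, was cancelled, or is still pending: use the tracker's view.
        return trackerLeases;
    }
}
```
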



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
