pjfanning opened a new pull request, #2785: URL: https://github.com/apache/pekko/pull/2785
Part of #2730

akka/akka-core#31864

`ClusterShardingHealthCheck` would fail indefinitely if a new entity type added in a rolling update hadn't registered with the coordinator yet, stalling the rollout. The fix disables the health check (returns `true`) after a configurable duration once the cluster member is `Up`.

## Changes

- **`ClusterShardingHealthCheckSettings`**: Added `disableAfter: FiniteDuration` parameter; deprecated 2-arg constructor kept for binary compatibility
- **`ClusterShardingHealthCheck`**: Tracks `startedTimestamp` when member first becomes `Up`; bypasses shard region checks once `disabled-after` has elapsed
- **`reference.conf`**: New `disabled-after = 10s` under `pekko.cluster.sharding.healthcheck`
- **Tests**: Updated existing tests to 3-arg constructor; added `"always pass after disabled-after"` test using a real cluster provider to verify the clock only starts post-`MemberStatus.Up`
- **Docs**: Added note explaining the rolling-update rationale

## Behaviour

```
disabled-after = 10s  # default; set to infinite to restore old behaviour
```

The clock starts only after `selfMember.status` leaves `Joining`/`Removed`. Checks before that are unaffected. Once the deadline is exceeded, the check short-circuits to `true` without contacting the shard region.
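The deadline behaviour described above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `DeadlineBypassCheck`, `nowMillis`, `memberIsUp`, and `regionsRegistered` are hypothetical names, the real check is asynchronous and reads the member status from the cluster, and only `disableAfter` and `startedTimestamp` are taken from the description.

```scala
import scala.concurrent.duration._

// Hypothetical stand-in for illustration only; not the Pekko ClusterShardingHealthCheck class.
// `nowMillis` is an injected clock (an assumption) so the deadline can be exercised in tests.
final class DeadlineBypassCheck(disableAfter: FiniteDuration, nowMillis: () => Long) {

  // Recorded the first time the member is observed as Up; None until then.
  @volatile private var startedTimestamp: Option[Long] = None

  def apply(memberIsUp: Boolean, regionsRegistered: () => Boolean): Boolean = {
    // The clock starts only once the member is Up, never earlier.
    if (memberIsUp && startedTimestamp.isEmpty) startedTimestamp = Some(nowMillis())

    val deadlinePassed =
      startedTimestamp.exists(started => nowMillis() - started > disableAfter.toMillis)

    // After the deadline the result is true and the region check is not evaluated at all.
    deadlinePassed || regionsRegistered()
  }
}
```

Injecting the clock keeps the sketch deterministic; a check made before the member is `Up` always falls through to the region check, however much wall-clock time has passed.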
