Internode speculative retry is on by default with p99.
The client-side retry varies by driver / client.
> On Oct 17, 2021, at 1:59 PM, S G wrote:
>
> "The harder thing to solve is a bad coordinator node slowing down all reads
> coordinated by that node"
> I think this is the root of the
Also, for the percentile based speculative retry, how big of a time-period
is used to calculate the percentile?
If it is only a few seconds, then the latency will increase very quickly
when server performance degrades.
But if it is up to a few minutes (or it is configurable), then its
percentile wil
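
To make the window-size question concrete, here is a minimal sketch of how the length of the sample window changes a p99 threshold. It is an illustration only and makes no claim about how Cassandra computes its percentile. The names WindowedThreshold and simulate, the fixed-size sample window, and the latency values (10 ms healthy, 200 ms degraded) are all assumptions made up for this example.

```python
from collections import deque


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty sequence (integer pct, 1..100)."""
    ordered = sorted(samples)
    # Ceiling division in integer arithmetic avoids floating-point rounding.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


class WindowedThreshold:
    """Hypothetical tracker: keeps the last `window_size` latency samples."""

    def __init__(self, window_size, pct=99):
        self.samples = deque(maxlen=window_size)
        self.pct = pct

    def record(self, latency_ms):
        self.samples.append(latency_ms)

    def threshold(self):
        return percentile(self.samples, self.pct)


def simulate(window_size, healthy=1000, degraded=5):
    """Feed healthy samples, then a few degraded ones; return the threshold."""
    tracker = WindowedThreshold(window_size)
    for _ in range(healthy):
        tracker.record(10)    # assumed healthy latency, in ms
    for _ in range(degraded):
        tracker.record(200)   # assumed degraded latency, in ms
    return tracker.threshold()


short_window = simulate(window_size=5)     # window holds only degraded samples
long_window = simulate(window_size=1000)   # degraded samples are under 1% of the window
```

In this toy model the short window's threshold follows the degraded latency almost immediately, while the long window's threshold stays at the healthy value until slow samples make up more than 1% of the window.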
"The harder thing to solve is a bad coordinator node slowing down all reads
coordinated by that node"
I think this is the root of the problem. Since all nodes act as
coordinator nodes, it is guaranteed that if any one node slows down (high GC,
segment merging, etc.), it will slow down 1/N queries in
Some random notes, not necessarily going to help you, but:
- You probably have vnodes enabled, which means one bad node is PROBABLY a
replica of almost every other node, so the fanout here is worse than it
should be, and
- You probably have speculative retry on the table set to a percentile. As
the
Hello,
We have frequently seen that a single bad node running slow can affect the
latencies of the entire cluster (especially for queries where the slow node
was acting as a coordinator).
Is there any suggestion to avoid this behavior?
Like something on the client side to not query that bad nod