I think your problem is likely not the "stuck" hints, but the write
requests in them.
Those write requests ended up in the hint file because they have
failed before. They are likely to fail again when they are retried
if the failure was caused by the write requests themselves rather
than by network issues or by nodes temporarily overloaded by other
queries.
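To check whether the same hint file really is being retried over and over, a rough sketch like the one below can count the "partially" messages per file and endpoint. It assumes the log format is exactly as quoted in the message below; the function name and regular expression are my own illustration and not part of Cassandra.

```python
import re
from collections import Counter

# Pattern for the "Finished hinted handoff ... partially" messages quoted
# in this thread. The exact log format is an assumption taken from those lines.
PARTIAL_RE = re.compile(
    r"Finished hinted handoff of file (?P<file>\S+\.hints) "
    r"to endpoint (?P<endpoint>\S+): \S+, partially"
)


def count_partial_dispatches(log_text):
    """Count partial dispatches per (hints file, endpoint) pair."""
    # Collapse all whitespace so that entries wrapped across several
    # lines (as in an email) are still matched.
    flat = " ".join(log_text.split())
    return Counter(
        (m.group("file"), m.group("endpoint"))
        for m in PARTIAL_RE.finditer(flat)
    )


if __name__ == "__main__":
    import sys

    # Usage: python count_partial.py < system.log
    counts = count_partial_dispatches(sys.stdin.read())
    for (hints_file, endpoint), n in counts.most_common():
        print(n, hints_file, endpoint)
```

A file that shows up with a high count against a single endpoint is a candidate for the kind of hint that keeps failing for reasons specific to the writes it contains.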
On 15/11/2021 10:43, Paul Chandler wrote:
Hi all
We keep having a problem with hint files on one of our Cassandra nodes
(v 3.11.6): the following messages keep being repeated for the same
file.
INFO [HintsDispatcher:25] 2021-11-02 08:55:29,830 HintsDispatchExecutor.java:289 - Finished hinted handoff of file 72a18469-b7d2-499a-aed3-fd4e2cda9678-1635838529279-1.hints to endpoint /10.29.49.210: 72a18469-b7d2-499a-aed3-fd4e2cda9678, partially
INFO [HintsDispatcher:24] 2021-11-02 08:55:39,812 HintsDispatchExecutor.java:289 - Finished hinted handoff of file 72a18469-b7d2-499a-aed3-fd4e2cda9678-1635838529279-1.hints to endpoint /10.29.49.210: 72a18469-b7d2-499a-aed3-fd4e2cda9678, partially
INFO [HintsDispatcher:25] 2021-11-02 08:55:49,822 HintsDispatchExecutor.java:289 - Finished hinted handoff of file 72a18469-b7d2-499a-aed3-fd4e2cda9678-1635838529279-1.hints to endpoint /10.29.49.210: 72a18469-b7d2-499a-aed3-fd4e2cda9678, partially
On the receiving node (cassandra0) we see the CPU shoot up; this is
how we notice we have a problem.
This has happened several times with different files, and we find the
only way to stop it is to delete the offending hint files.
The cluster can be a bit overloaded, and this is what is causing the
hint files to be generated in the first place; we are working to get
that stopped. However, the question I don’t know the answer to is: what
is causing this “partially” hint processing, and how can we stop it
happening?
Thanks
Paul