Re: timeouts on counter tables

2017-09-04 Thread Rudi Bruchez
I'm going to try different options. Do any of you have experience with tweaking one of these conf parameters to improve read throughput, especially for counter tables? 1/ using SSD: trickle_fsync: true trickle_fsync_interval_in_kb: 1024 2/ concurrent_compactors to the number of

Re: timeouts on counter tables

2017-09-04 Thread Rudi Bruchez
It can happen on any of the nodes. We can have a large number of pending tasks on ReadStage and CounterMutationStage. We'll try to increase concurrent_counter_writes to see how it changes things. Likely. I believe counter mutations are a tad more expensive than a normal mutation. If you're doing a lo

Re: timeouts on counter tables

2017-09-04 Thread kurt greaves
Likely. I believe counter mutations are a tad more expensive than a normal mutation. If you're doing a lot of counter updates that probably doesn't help. Regardless, a high number of pending reads/mutations is generally not good and indicates that the node is overloaded. Are you just seeing this on th

Re: timeouts on counter tables

2017-09-03 Thread Rudi Bruchez
On 30/08/2017 at 05:33, Erick Ramirez wrote: Is it possible at all that you may have a data hotspot if it's not hardware-related? It does not seem so. The partition key seems well distributed and the queries update different keys. We have dropped counter_mutation messages in the log: CO

Re: timeouts on counter tables

2017-09-03 Thread Rudi Bruchez
On 28/08/2017 at 03:30, kurt greaves wrote: If every node is a replica it sounds like you've got hardware issues. Have you compared iostat to the "normal" nodes? I assume there is nothing different in the logs on this one node? Also sanity check, you are using DCAwareRoundRobinPolicy? Tha

Re: timeouts on counter tables

2017-08-29 Thread Erick Ramirez
Is it possible at all that you may have a data hotspot if it's not hardware-related? On Mon, Aug 28, 2017 at 11:30 AM, kurt greaves wrote: > If every node is a replica it sounds like you've got hardware issues. Have you compared iostat to the "normal" nodes? I assume there is nothing differe

Re: timeouts on counter tables

2017-08-27 Thread kurt greaves
If every node is a replica it sounds like you've got hardware issues. Have you compared iostat to the "normal" nodes? I assume there is nothing different in the logs on this one node? Also sanity check, you are using DCAwareRoundRobinPolicy?

Re: timeouts on counter tables

2017-08-27 Thread Rudi Bruchez
On 28/08/2017 at 00:11, kurt greaves wrote: What is your RF? Also, as a side note RAID 1 shouldn't be necessary if you have >1 RF and would give you worse performance 2 + 1 on a backup single node. Consistency ONE. You're right about RAID 1; if the disk perf is the problem, that might be a

Re: timeouts on counter tables

2017-08-27 Thread kurt greaves
What is your RF? Also, as a side note, RAID 1 shouldn't be necessary if you have RF > 1, and it would give you worse performance.

timeouts on counter tables

2017-08-27 Thread Rudi Bruchez
Hello, On a 3-node cluster (each node: 48 procs, 32 GB RAM, SSD), I'm seeing timeouts on counter table UPDATEs. One node is specifically slow, generating timeouts. IO bound. iotop shows consistently about 300 Mb/s reads, and writes are around 100 KB/s, fluctuating. The keys seem well distributed. The a