Hi,

Please find the results below. The numbers are scary here.
[root@WA-CASSDB2 bin]# ./nodetool compactionstats
pending tasks: 137
   compaction type        keyspace                table        completed          total   unit  progress
        Compaction          system                hints       5762711108   837522028005  bytes     0.69%
        Compaction  walletkeyspace  user_txn_history_v2        101477894     4722068388  bytes     2.15%
        Compaction  walletkeyspace  user_txn_history_v2       1511866634   753221762663  bytes     0.20%
        Compaction  walletkeyspace  user_txn_history_v2       3664734135    18605501268  bytes    19.70%
Active compaction remaining time : *26h32m28s*
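
As a stopgap, I am considering lifting the compaction throttle so the
backlog can drain faster (0 means unthrottled; we would set it back to a
sane limit once the pending tasks come down):

  nodetool getcompactionthroughput
  nodetool setcompactionthroughput 0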
On 11 May 2017 at 23:15, Oskar Kjellin <[email protected]> wrote:
> What does nodetool compactionstats show?
>
> I meant compaction throttling. nodetool getcompactionthroughput
>
>
> On 11 May 2017, at 19:41, varun saluja <[email protected]> wrote:
>
> Hi Oskar,
>
> Thanks for response.
>
> Yes, I could see a lot of compaction threads. We are loading around
> 400 GB of data per node onto a 3-node Cassandra cluster.
> Throttling was set to write around 7k TPS per node. The job ran fine for 2
> days, and then we started getting mutation drops, longer GCs and very high
> load on the system.
>
> System log reports:
> Enqueuing flush of compactions_in_progress: 1156 (0%) on-heap, 1132 (0%) off-heap
>
> The job was stopped 12 hours ago, but these failures can still be seen.
> Could you please let me know how I should proceed? If possible, please
> suggest some parameters for high write-intensive jobs.
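>
> For reference, these are the cassandra.yaml knobs I am planning to look at
> (the values below are only illustrative, not our current settings):
>
>   # combined throttle for all compactions, in MB/s (0 = unthrottled)
>   compaction_throughput_mb_per_sec: 64
>   # number of compaction threads allowed to run in parallel
>   concurrent_compactors: 4
>   # parallel memtable flush writers, relevant for write-heavy loads
>   memtable_flush_writers: 4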
>
>
> Regards,
> Varun Saluja
>
>
> On 11 May 2017 at 23:01, Oskar Kjellin <[email protected]> wrote:
>
>> Do you have a lot of compactions going on? It sounds like you might've
>> built up a huge backlog. Is your throttling configured properly?
>>
>> > On 11 May 2017, at 18:50, varun saluja <[email protected]> wrote:
>> >
>> > Hi Experts,
>> >
>> > Seeking your help on a production issue. We were running a high
>> > write-intensive job on our 3-node Cassandra cluster, v2.1.7.
>> >
>> > TPS on the nodes was high. The job ran for more than 2 days, and
>> > thereafter the load average on one of the nodes increased to a very
>> > high number, around 29.
>> >
>> > System log reports:
>> >
>> > INFO [ScheduledTasks:1] 2017-05-11 22:11:04,466 MessagingService.java:888 - 839 MUTATION messages dropped in last 5000ms
>> > INFO [ScheduledTasks:1] 2017-05-11 22:11:04,466 MessagingService.java:888 - 2 READ messages dropped in last 5000ms
>> > INFO [ScheduledTasks:1] 2017-05-11 22:11:04,466 MessagingService.java:888 - 1 REQUEST_RESPONSE messages dropped in last 5000ms
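>> >
>> > I can share nodetool tpstats output as well, which should show the
>> > cumulative dropped counts per message type alongside the thread pools:
>> >
>> >   nodetool tpstats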
>> >
>> > The job was stopped due to heavy load, but still, 12 hours later, we
>> > can see mutation-drop messages and sudden increases in load average.
>> >
>> > Are these hinted-handoff mutations? Can we stop them?
>> > Strangely, this behaviour is seen on only 2 of the nodes; node 1 does
>> > not show any load or any such activity.
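>> >
>> > If these are indeed hints, I am thinking of disabling handoff and
>> > dropping the stored hints, accepting that a repair would be needed
>> > later:
>> >
>> >   nodetool disablehandoff
>> >   nodetool truncatehints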
>> >
>> > Due to heavy load and GC, there are intermittent gossip failures among
>> > the nodes. Can someone please help?
>> >
>> > PS: The load job was stopped on the cluster. Everything ran fine for a
>> > few hours, and later the issue started again with mutation message
>> > drops.
>> >
>> > Thanks and Regards,
>> > Varun Saluja
>> >
>>
>
>