This table was actually on leveled compaction before; I just changed it to size-tiered yesterday while researching this.
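For reference, the change was roughly this (from memory; the thresholds shown are just the STCS defaults):

ALTER TABLE keyspace.table1
    WITH compaction = { 'class' : 'SizeTieredCompactionStrategy',
                        'min_threshold' : 4, 'max_threshold' : 32 };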
On Thu, Oct 3, 2019 at 4:31 PM Patrick Lee <patrickclee0...@gmail.com> wrote:

> It's not really time series data, and it's not updated very often; it does
> get some updates, but they're pretty infrequent. This thing should be super
> fast: on average it's around 1 to 2 ms at p99 currently, but if they double
> or triple the traffic on that table, latencies climb to 20 ms to 50 ms. The
> only odd thing I see is that there are constant read repairs following the
> same traffic pattern as the reads, which shows up as constant writes on the
> table (from the read repairs). After a read repair, or just normal full
> repairs (all full repairs through Reaper, never ran any incremental repair),
> I would expect it not to have any mismatches. The other 5 tables they use on
> the cluster handle the same level of traffic, all very simple selects by
> partition key that return a single record.
>
> On Thu, Oct 3, 2019 at 4:21 PM John Belliveau <belliveau.j...@gmail.com> wrote:
>
>> Hi Patrick,
>>
>> Is this time series data? If so, I have run into issues with repair on
>> time series data using the SizeTieredCompactionStrategy. I have had
>> better luck using the TimeWindowCompactionStrategy.
>>
>> John
>>
>> *From:* Patrick Lee <patrickclee0...@gmail.com>
>> *Sent:* Thursday, October 3, 2019 5:14 PM
>> *To:* user@cassandra.apache.org
>> *Subject:* Constant blocking read repair for such a tiny table
>>
>> I have a cluster running 3.11.4 (it was upgraded a while back from 2.1.16).
>> What I see is a steady rate of read repair, constantly around 10%, on only
>> this one table. Repairs have been run (several times, actually). The table
>> does not get a lot of writes, so after a repair, or even after a read
>> repair, I would expect it to be fine. The reason I'm having to dig into
>> this so much is that under a much larger traffic load than their normal
>> traffic, latencies are higher than the app team wants.
>>
>> I mean, this thing is tiny: it's a 12x12 cluster, but this one table is
>> about 1 GB per node on disk.
>>
>> The application team is doing reads at LOCAL_QUORUM, and I can simulate
>> this on the cluster by running a query at QUORUM and/or LOCAL_QUORUM; every
>> time I run the query, the trace comes back with a DigestMismatchException,
>> no matter how many times I run it. That record hasn't been updated by the
>> application for several months.
>>
>> Repairs are scheduled and run every 7 days via Reaper, and in the past week
>> this table has been repaired at least 3 times. Every time there are
>> mismatches and data streams back and forth, yet there is still a constant
>> rate of read repairs.
>>
>> Curious if anyone has recommendations on what to look into further, or has
>> experienced anything like this?
>>
>> This node has been up for 24 hours. This is the netstats output for read
>> repairs:
>>
>> Mode: NORMAL
>> Not sending any streams.
>> Read Repair Statistics:
>> Attempted: 7481
>> Mismatch (Blocking): 11425375
>> Mismatch (Background): 17
>> Pool Name          Active  Pending  Completed  Dropped
>> Large messages        n/a        0       1232        0
>> Small messages        n/a        0  395903678        0
>> Gossip messages       n/a        0     603746        0
>>
>> Example of the schema... some modifications have been made to reduce
>> read_repair and speculative_retry while troubleshooting:
>>
>> CREATE TABLE keyspace.table1 (
>>     item bigint,
>>     price int,
>>     start_date timestamp,
>>     end_date timestamp,
>>     created_date timestamp,
>>     cost decimal,
>>     list decimal,
>>     item_id int,
>>     modified_date timestamp,
>>     status int,
>>     PRIMARY KEY ((item, price), start_date, end_date)
>> ) WITH CLUSTERING ORDER BY (start_date ASC, end_date ASC)
>>     AND read_repair_chance = 0.0
>>     AND dclocal_read_repair_chance = 0.0
>>     AND gc_grace_seconds = 864000
>>     AND bloom_filter_fp_chance = 0.01
>>     AND caching = { 'keys' : 'ALL', 'rows_per_partition' : 'NONE' }
>>     AND comment = ''
>>     AND compaction = { 'class' : 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
>>         'max_threshold' : 32, 'min_threshold' : 4 }
>>     AND compression = { 'chunk_length_in_kb' : 4,
>>         'class' : 'org.apache.cassandra.io.compress.LZ4Compressor' }
>>     AND default_time_to_live = 0
>>     AND speculative_retry = 'NONE'
>>     AND min_index_interval = 128
>>     AND max_index_interval = 2048
>>     AND crc_check_chance = 1.0
>>     AND cdc = false
>>     AND memtable_flush_period_in_ms = 0;
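For completeness, this is roughly how I reproduce the mismatch from cqlsh (the key values below are placeholders, not the real record):

CONSISTENCY LOCAL_QUORUM;
TRACING ON;
SELECT * FROM keyspace.table1 WHERE item = 12345 AND price = 100;

Every run of that query shows a DigestMismatchException and a blocking read repair in the trace.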