Re: High Bloom filter false ratio

2016-02-23 Thread Jeff Jirsa
"user@cassandra.apache.org" Date: Tuesday, February 23, 2016 at 12:37 AM To: "user@cassandra.apache.org" Subject: Re: High Bloom filter false ratio Looks like that sstablemetadata is available in 2.2, we are on 2.0.x. Do you know anything that will work on 2.0.x? On Tue, Feb 23, 2016

RE: High Bloom filter false ratio

2016-02-23 Thread SEAN_R_DURITY
I see the sstablemetadata tool as far back as 1.2.19 (in tools/bin). Sean Durity From: Anishek Agarwal [mailto:anis...@gmail.com] Sent: Tuesday, February 23, 2016 3:37 AM To: user@cassandra.apache.org Subject: Re: High Bloom filter false ratio Looks like that sstablemetadata is available in 2.2

Re: High Bloom filter false ratio

2016-02-23 Thread Anishek Agarwal
>> list of sstables that you could feed to forceUserDefinedCompaction to join together to eliminate leftover waste. Your long ParNew times may be fixable by increasing the new gen size of your heap – the general guidance in cassandra-env.sh is out of date,

Re: High Bloom filter false ratio

2016-02-23 Thread Anishek Agarwal
> by increasing the new gen size of your heap – the general guidance in cassandra-env.sh is out of date, you may want to reference CASSANDRA-8150 for “newer” advice ( http://issues.apache.org/jira/browse/CASSANDRA-8150 ) - Jeff From: Anishek Agarwal Reply-To:

Re: High Bloom filter false ratio

2016-02-22 Thread Jeff Jirsa
-8150 ) - Jeff From: Anishek Agarwal Reply-To: "user@cassandra.apache.org" Date: Monday, February 22, 2016 at 8:33 PM To: "user@cassandra.apache.org" Subject: Re: High Bloom filter false ratio Hey Jeff, Thanks for the clarification, I did not explain

Re: High Bloom filter false ratio

2016-02-22 Thread Anishek Agarwal
> Reply-To: "user@cassandra.apache.org" Date: Sunday, February 21, 2016 at 11:13 PM To: "user@cassandra.apache.org" Subject: Re: High Bloom filter false ratio Hey guys, Just did some more digging ... looks like DTCS is not removing old data completely,

Re: High Bloom filter false ratio

2016-02-22 Thread Jeff Jirsa
"user@cassandra.apache.org" Date: Sunday, February 21, 2016 at 11:13 PM To: "user@cassandra.apache.org" Subject: Re: High Bloom filter false ratio Hey guys, Just did some more digging ... looks like DTCS is not removing old data completely, I used sstable2json for one such table

Re: High Bloom filter false ratio

2016-02-22 Thread Christopher Bradford
Does every record in the SSTable have a "d" column? On Mon, Feb 22, 2016 at 2:14 AM Anishek Agarwal wrote: > Hey guys, Just did some more digging ... looks like DTCS is not removing old data completely, I used sstable2json for one such table and saw old data there. we have a value of 30

Re: High Bloom filter false ratio

2016-02-21 Thread Anishek Agarwal
Hey guys, Just did some more digging ... looks like DTCS is not removing old data completely. I used sstable2json for one such table and saw old data there. We have a value of 30 for max_sstable_age_days for the table. One of the columns showed data as: ["2015-12-10 11\\:03+0530:", "56690ea2", 14

Re: High Bloom filter false ratio

2016-02-21 Thread Anishek Agarwal
We are using DTCS and have a 30-day window for them before they are cleaned up. I don't think with DTCS we can do anything about table sizing. Please do let me know if there are other ideas. On Sat, Feb 20, 2016 at 12:51 AM, Jaydeep Chovatia < chovatia.jayd...@gmail.com> wrote: > To me following thre

Re: High Bloom filter false ratio

2016-02-19 Thread Jaydeep Chovatia
To me, the following three look on the higher side: SSTable count: 1289. To reduce the SSTable count, check whether you are compacting or not (if using STCS). Is it possible to change this to LCS? Number of keys (estimate): 345137664 (345M partition keys). I don't have any suggestion about reducing this unl

Re: High Bloom filter false ratio

2016-02-19 Thread Chris Lohfink
> > SSTable count: 1289 That's seriously wrong, and pretty horrific if this table is using size-tiered compaction. Is compaction not keeping up, or hung? That may be what's affecting your BF FP ratio as well. On Thu, Feb 18, 2016 at 9:52 PM, Anishek Agarwal wrote: > Hey all, > > @Jaydeep here is the cf

Re: High Bloom filter false ratio

2016-02-18 Thread Anishek Agarwal
Hey all, @Jaydeep here is the cfstats output from one node. Read Count: 1721134722 Read Latency: 0.04268825050756254 ms. Write Count: 56743880 Write Latency: 0.014650376727851532 ms. Pending Tasks: 0 Table: user_stay_points SSTable count: 1289 Space used (live), bytes: 122141272262 Space

Re: High Bloom filter false ratio

2016-02-18 Thread Jaydeep Chovatia
How many partition keys exist for the table which shows this problem (or provide nodetool cfstats for that table)? On Thu, Feb 18, 2016 at 11:38 AM, daemeon reiydelle wrote: > The bloom filter buckets the values in a small number of buckets. I have been surprised by how many cases I see with

Re: High Bloom filter false ratio

2016-02-18 Thread daemeon reiydelle
The bloom filter buckets the values in a small number of buckets. I have been surprised by how many cases I see with large cardinality where a few values populate a given bloom leaf, resulting in high false positives, and a surprising impact on latencies! Are you seeing 2:1 ranges between mean and

Re: High Bloom filter false ratio

2016-02-18 Thread Tyler Hobbs
You can try slightly lowering the bloom_filter_fp_chance on your table. Otherwise, it's possible that you're repeatedly querying one or two partitions that always trigger a bloom filter false positive. You could try manually tracing a few queries on this table (for non-existent partitions) to see
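
For a rough sense of what lowering bloom_filter_fp_chance costs, the textbook Bloom filter formulas can be sketched as below. This is only an illustration under stated assumptions: it uses the standard idealized model (independent, uniform hashes), not Cassandra's actual sizing code, and the function names are hypothetical. The key count is the "Number of keys (estimate)" figure quoted earlier in the thread.

```python
import math

def bits_per_key(fp_chance: float) -> float:
    """Bits per key an ideal Bloom filter needs for a target false-positive chance."""
    return -math.log(fp_chance) / (math.log(2) ** 2)

def optimal_hash_count(bits: float) -> int:
    """Number of hash functions that minimizes false positives for a bits-per-key budget."""
    return max(1, round(bits * math.log(2)))

def expected_fp_rate(bits: float, hashes: int) -> float:
    """Theoretical false-positive rate for a given bits-per-key and hash count."""
    return (1.0 - math.exp(-hashes / bits)) ** hashes

def filter_size_bytes(num_keys: int, fp_chance: float) -> int:
    """Approximate total filter size in bytes (idealized model, not Cassandra's)."""
    return math.ceil(num_keys * bits_per_key(fp_chance) / 8)

if __name__ == "__main__":
    num_keys = 345_137_664  # "Number of keys (estimate)" from the cfstats output in this thread
    for fp in (0.1, 0.01, 0.001):  # illustrative targets only, not the poster's settings
        bits = bits_per_key(fp)
        k = optimal_hash_count(bits)
        size_mib = filter_size_bytes(num_keys, fp) / 2**20
        print(f"fp_chance={fp}: {bits:.2f} bits/key, {k} hashes, ~{size_mib:.0f} MiB")
```

If the observed false ratio on a node is far above what this model predicts for the configured fp chance, that points toward the other explanations raised in the thread (repeated queries for the same absent partitions, or a very high SSTable count) rather than the filter setting itself.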