On Tue, Nov 2, 2010 at 1:28 AM, Daniel Doubleday
wrote:
> Hi all
>
> Had some time yesterday to dig a little deeper. Maybe this saves some time for
> someone who made the same mistake, so ...
>
> After trying to reproduce the problem in unit tests with the same data which
> led nowhere because every
Hi all
Had some time yesterday to dig a little deeper. Maybe this saves some time for
someone who made the same mistake, so ...
After trying to reproduce the problem in unit tests with the same data which
led nowhere because every single result was almost exactly what the math
promised and incident
Hi Ryan
I took a sample of one sstable (just flushed, not compacted).
I compared 2 samples of sstables. One that is showing fine false positive
ratios and the problem one.
And yes both look the same to me. Both have the expected 15 buckets per row and
the cardinality of the bitsets are the sa
Ah of course - question makes total sense.
But no: this is not the case: I am not constantly asking the same
question since the tree is deep enough. Most data nodes are level 5 from
the root. So the parents getting queried will be different most of the time.
Since the parent nodes are created
Do you have a key "a/b" then? What columns does it have?
On Wed, Oct 27, 2010 at 9:14 AM, Daniel Doubleday
wrote:
> Hm -
>
> not sure if I understand the random question. We are using RP. But I wouldn't
> know why that should matter.
> I thought that the bloom filter hash function should evenly
Hm -
not sure if I understand the random question. We are using RP. But I wouldn't
know why that should matter.
I thought that the bloom filter hash function should evenly distribute no
matter what keys come in.
Keys are '/' separated strings (aka paths :-))
I do bulk inserts like: (1000 rows
This is not expected, no. How random are your queries? If you have a
couple outlier rows causing the false positives that are being queried
over and over then that could just be the luck of the draw.
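To make the "luck of the draw" point concrete, here is a minimal sketch. It assumes a toy Bloom filter built on SHA-256 double hashing, which is not Cassandra's implementation, and the key names and sizes are made up for illustration. The idea it shows: with many distinct absent keys the observed false positive ratio tracks the theoretical one, but if the same few absent keys are queried over and over, the ratio is decided by whether those particular keys happen to collide.

```python
import hashlib
import math


class BloomFilter:
    """Toy Bloom filter (illustrative only, not Cassandra's code)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Double hashing: derive num_hashes positions from two 64-bit values.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for p in self._positions(key):
            self.bits[p] = 1

    def might_contain(self, key):
        return all(self.bits[p] for p in self._positions(key))


def expected_fp_rate(bits_per_key, num_hashes):
    # Standard approximation: (1 - e^(-k / (m/n)))^k
    return (1.0 - math.exp(-num_hashes / bits_per_key)) ** num_hashes


def observed_fp_ratio(bf, absent_queries):
    # Every query is for a key that was never added, so each hit is a false positive.
    hits = sum(1 for q in absent_queries if bf.might_contain(q))
    return hits / len(absent_queries)


if __name__ == "__main__":
    # Deliberately small filter (4 bits per key) so false positives are easy to see.
    bf = BloomFilter(num_bits=4000, num_hashes=3)
    for i in range(1000):
        bf.add(f"present/{i}")

    absent = [f"absent/{i}" for i in range(20000)]
    print("expected:", expected_fp_rate(4, 3))
    print("uniform queries:", observed_fp_ratio(bf, absent))

    # Same absent key asked repeatedly: the ratio depends on that one key only.
    unlucky = next(k for k in absent if bf.might_contain(k))
    print("one unlucky key, repeated:", observed_fp_ratio(bf, [unlucky] * 1000))
```

So a high ratio on its own does not prove the filter is broken. Comparing the ratio across a wide spread of distinct missing keys is the more telling check.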
On Wed, Oct 27, 2010 at 5:24 AM, Daniel Doubleday
wrote:
> Hi people
>
> We are currently movin
Hi people
We are currently moving our second use case from MySQL to Cassandra. While
importing the data (ongoing) I noticed that the BloomFilterFalseRatio seems to
be pretty high compared to another CF which is in use in production right now.
It's a hierarchical data model and I cannot avoid t