Re: Compaction Filter in Cassandra

2016-03-19 Thread Dikang Gu
Hi Eric,

Thanks for sharing the information!

We also mainly want to use it for trimming data, either by time or by the
number of columns in a row. We haven't started the work yet; do you mind
sharing some patches? We'd love to try it and test it in our environment.

Thanks.

On Tue, Mar 15, 2016 at 9:36 PM, Eric Stevens  wrote:

> We have been working on filtering compaction for a month or so (though we
> call it deleting compaction, its implementation is as a filtering
> compaction strategy).  The feature is nearing completion, and we have used
> it successfully in a limited production capacity against DSE 4.8 series.
>
> Our use case is that our records are written anywhere from a month up
> to several years before they are scheduled for deletion.  Tombstones are
> too expensive, as we have tables with hundreds of billions of rows.  In
> addition, traditional TTLs don't work for us because our customers are
> permitted to change their retention policy such that already-written
> records should not be deleted if they increase their retention after the
> record was written (or vice versa).
>
> We can clean up data more cheaply and more quickly with filtered
> compaction than with tombstones and traditional compaction.  Our
> implementation is a wrapper compaction strategy for another underlying
> strategy, so that you can have the characteristics of whichever strategy
> makes sense in terms of managing your SSTables, while interceding and
> removing records during compaction (including cleaning up secondary
> indexes) that otherwise would have survived into the new SSTable.
>
> We are hoping to contribute it back to the community, so if you'd be
> interested in helping test it out, I'd love to hear from you.
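
To make the wrapper design described above a bit more concrete, here is a
minimal, purely illustrative Java sketch. The names (Row, compactWithFilter,
retentionPolicy) are hypothetical and do not correspond to Cassandra's actual
compaction strategy API; the point is only the shape of the idea: let an
underlying strategy decide which SSTables to compact, and, while the merged
output is written, drop any row that no longer satisfies the (possibly updated)
retention policy, so no tombstones are needed.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Illustrative sketch only -- not Cassandra's real compaction strategy API.
public class FilteringCompactionSketch {

    // Stand-in for a row seen during compaction; the real unit would be a Cassandra row/cell.
    static class Row {
        final String key;
        final long writtenAtMillis;
        Row(String key, long writtenAtMillis) { this.key = key; this.writtenAtMillis = writtenAtMillis; }
    }

    // The wrapper intercepts rows produced by the underlying strategy's merge and
    // copies forward only those that still satisfy the retention policy.
    static List<Row> compactWithFilter(Iterator<Row> mergedRows, Predicate<Row> retentionPolicy) {
        List<Row> survivors = new ArrayList<>();
        while (mergedRows.hasNext()) {
            Row row = mergedRows.next();
            if (retentionPolicy.test(row)) {
                survivors.add(row);   // written into the new SSTable
            }
            // rows failing the check are simply not copied forward: no tombstone required
        }
        return survivors;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long retentionMillis = 30L * 24 * 60 * 60 * 1000;   // e.g. a 30-day policy, changeable at any time
        List<Row> merged = Arrays.asList(
                new Row("fresh", now - 1000L),
                new Row("stale", now - 2 * retentionMillis));
        List<Row> kept = compactWithFilter(merged.iterator(),
                r -> now - r.writtenAtMillis <= retentionMillis);
        kept.forEach(r -> System.out.println("kept: " + r.key));   // prints "kept: fresh"
    }
}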
>
> On Sat, Mar 12, 2016 at 5:12 AM Marcus Eriksson  wrote:
>
>> We don't have anything like that, do you have a specific use case in mind?
>>
>> Could you create a JIRA ticket and we can discuss there?
>>
>> /Marcus
>>
>> On Sat, Mar 12, 2016 at 7:05 AM, Dikang Gu  wrote:
>>
>>> Hello there,
>>>
>>> RocksDB has the feature called "Compaction Filter" to allow application
>>> to modify/delete a key-value during the background compaction.
>>> https://github.com/facebook/rocksdb/blob/v4.1/include/rocksdb/options.h#L201-L226
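
For readers unfamiliar with the RocksDB feature: a compaction filter is
essentially a per-entry callback invoked while compaction rewrites data,
letting the application keep, drop, or rewrite each key-value. The Java sketch
below is only a conceptual illustration of that hook; the interface, enum, and
TrimOldEntries example are made-up names, not an existing RocksDB or Cassandra
API, and the timestamp encoding in the example is an assumption.

// Conceptual sketch of a compaction-filter hook; all names are hypothetical.
public interface CompactionFilter {

    enum Decision { KEEP, REMOVE, MODIFY }

    // Called once per key-value pair seen during background compaction.
    // REMOVE drops the entry from the compacted output; MODIFY means the
    // filter placed a replacement value into newValue[0].
    Decision filter(byte[] key, byte[] existingValue, byte[][] newValue);
}

// Example: drop entries whose value starts with an 8-byte write timestamp
// older than a fixed cutoff (the encoding is assumed for illustration).
class TrimOldEntries implements CompactionFilter {
    private final long cutoffMillis;

    TrimOldEntries(long cutoffMillis) { this.cutoffMillis = cutoffMillis; }

    @Override
    public Decision filter(byte[] key, byte[] existingValue, byte[][] newValue) {
        long writtenAt = java.nio.ByteBuffer.wrap(existingValue).getLong();
        return writtenAt < cutoffMillis ? Decision.REMOVE : Decision.KEEP;
    }
}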
>>>
>>> I'm wondering is there a plan/value to add this into C* as well? Or is
>>> there already a similar thing in C*?
>>>
>>> Thanks
>>>
>>> --
>>> Dikang
>>>
>>>
>>


-- 
Dikang


Re: Getting Issue while setting up Cassandra in Windows 8.1

2016-03-19 Thread Paulo Motta
You should use the u...@cassandra.apache.org list for Cassandra-related
questions, not this one (dev@cassandra.apache.org), which is reserved for
internal Cassandra development. You can subscribe to the user list by
sending an e-mail to: user-subscr...@cassandra.apache.org

Answering your question: it seems your %PATH% variable is broken. For some
reason you removed the default
"C:\Windows\System32\WindowsPowerShell\v1.0\" entry, so PowerShell cannot be
found to run the properly configured version of Cassandra; you must fix
that. If you still want to run with the legacy settings (not recommended),
you must adjust your heap settings (-Xms, -Xmx) in the cassandra.bat script
to fit your system's memory.

2016-03-18 9:56 GMT-03:00 Bhupendra Baraiya :

> Hi ,
>
>
>
>    Please refer to the attached file, in which I have listed all the steps
> I took for installing and configuring Cassandra.
>
>
>
> When I try to start Cassandra I am getting the below error; please help me
> with this.
>
>
>
>
>
> Thanks and regards,
>
>
>
> *Bhupendra Baraiya*
>
> Continuum Managed Services, LLC.
>
> p: 902-933-0019
>
> e: bhupendra.bara...@continuum.net
>
> w: continuum.net
>
>
>
>


DTCS Question

2016-03-19 Thread Anubhav Kale
Hello,

I am trying to concretely understand how DTCS makes buckets. I am looking at
the DateTieredCompactionStrategyTest.testGetBuckets method and have played with
some of the parameters to the getBuckets method call (Cassandra 2.1.12).

I don't think I fully understand something there. Let me try to explain.

Consider the second test there. I changed the pairs a bit for easier
explanation, and changed base (initial window size) to 1000L and min_threshold to 2:

pairs = Lists.newArrayList(
Pair.create("a", 200L),
Pair.create("b", 2000L),
Pair.create("c", 3600L),
Pair.create("d", 3899L),
Pair.create("e", 3900L),
Pair.create("f", 3950L),
Pair.create("too new", 4125L)
);
buckets = getBuckets(pairs, 1000L, 2, 4050L, Long.MAX_VALUE);

In this case, the buckets should look like [0-4000] [4000-]. Is this correct?
The buckets that I get back are different ("a" lives in its own bucket and
everyone else in another). What am I missing here?

Another case,

pairs = Lists.newArrayList(
Pair.create("a", 200L),
Pair.create("b", 2000L),
Pair.create("c", 3600L),
Pair.create("d", 3899L),
Pair.create("e", 3900L),
Pair.create("f", 3950L),
Pair.create("too new", 4125L)
);
buckets = getBuckets(pairs, 50L, 4, 4050L, Long.MAX_VALUE);

Here, the buckets should be [0-3200] [3200-4000] [4000-4050] [4050-]. Is this
correct? Again, the buckets that come back are quite different.

Note that if I keep the base at the original value (100L), or increase it and
play with min_threshold, the results are exactly what I would expect.

The way I think about DTCS is: try to make buckets of the maximum possible size
starting from 0, and once you can't do that, make smaller buckets (similar to
what the comment suggests). Is this mental model wrong? I am afraid the math in
the Target class is somewhat hard to follow, so I am thinking about it this way.

Thanks a lot in advance.

-Anubhav


RE: DTCS Question

2016-03-19 Thread Anubhav Kale
Thanks for the long explanation. I looked at the link you pointed to, and it
does seem to concur with my mental model.

Do you see any issues with that model, and with simplifying this logic to:

1. Create windows from start (min) to end (max), going from the maximum
possible size down.
2. Scan all SSTables and put them in the appropriate buckets.

To be honest, the Target class is really difficult to reason about. The reason
I investigated this is that we wanted to reason about how our SSTables are
laid out, and unfortunately I can't.

Thanks again for the explanation !!

-Original Message-
From: Björn Hegerfors [mailto:bj...@spotify.com] 
Sent: Thursday, March 17, 2016 11:19 AM
To: dev@cassandra.apache.org
Subject: Re: DTCS Question

That is probably close to the actual way it works, but not quite equal. My 
mental model when making this went backwards in time, towards 0, not forwards.

It's something like this (using the numbers from your first example): make a 
bucket of the specified "timeUnit" size (1000), that contains the "now"
timestamp (4050), where the starting (and therefore also the ending) timestamp 
of the bucket is 0 modulo the size of the bucket. That last point is perhaps 
the trickiest point to follow. There is only one such place for the bucket, 
[4000-5000) in this case. No other bucket that is aligned with the 1000s can 
contain 4050.

Now, the next bucket (backwards) is computed based on this [4000-5000) bucket. 
Most of the time it will simply be the same-sized bucket right before it, i.e. 
[3000-4000), but if the start timestamp of our bucket (4000), divided by its 
size (so 4), is 0 modulo "base" (2 in this case), which it happens to be here, 
then we increase our bucket size "base" times, and instead make the bucket of
*that* size that ends right before our current bucket. So the result will be 
[2000-4000).

This method of getting the next bucket is repeated until we reach timestamp 0. 
Using the above logic, we don't increase the size of the bucket this time, 
because we have a start timestamp of 2000 which becomes 1 when divided by the 
size (2000). So we end up with [0, 2000), and we're done.
The buckets were [4000-5000), [2000-4000) and [0-2000).

What's more important than understanding these rules is of course getting some 
kind of intuition for this. Here's what it boils down to: we want there to be 
"base" equally sized buckets right next to each other before we
*coalesce* them. Every bucket is aligned with its own size (as an analogy, 
compilers typically align 4-byte integers on addresses divisible by 4, same 
concept). So, by extension, the bigger bucket they coalesce into must be 
aligned with *its* size. Not just any "base" adjacent buckets will do, it will 
be those that align with the next size.

The remaining question is when do they coalesce? There will always be at least 
1 and at most "base" buckets of every size. Say "base"=4, then there can be 4
buckets of some size (by necessity next to each other and aligned on 4 times
their size). The moment a new bucket of the same size appears, the 4 buckets
become one and this "fifth" bucket will be alone with its size (and the start
of a new group of 4 such buckets). (The rule for making the bucket where the
"now" timestamp lives is where new buckets come from.)

I wish this was easier to explain in simple terms. I personally find this to 
have very nice properties, in that it gives every bucket a fair amount of time 
to settle before it's time for the next compaction.

Interestingly, I proposed an alternative algorithm in this ticket, including a
patch implementing it. My gut tells me that the mental model that
you've used here is actually equivalent to that algorithm in the ticket. It's 
just expressed in a very different way. Might be something for me to try to 
prove when I'm really bored :)

Hope this helped! Any particular reason you're investigating this?

/
Bj0rn

On Thu, Mar 17, 2016 at 5:43 PM, Anubhav Kale 
wrote:

> Hello,
>
> I am trying to concretely understand how DTCS makes buckets and I am 
> looking at the DateTieredCompactionStrategyTest.testGetBuckets method 
> and played with some of the parameters to GetBuckets method call 
> (Cassandra 2.1.12).
>
> I don't think I fully understand something there. Let me try to explain.
>
> Consider the second test there. I changed the pairs a bit for easier 
> explanation and changed base (initial window size)=1000L and 
> Min_Threshold=2
>
> pairs = Lists.newArrayList(
> Pair.create("a", 200L),
> Pair.create("b", 2000L),
> Pair.create("c", 3600L),
> Pair.create("d", 3899L),
> Pair.create("e", 3900

Re: Compaction Filter in Cassandra

2016-03-19 Thread Robert Coli
On Fri, Mar 11, 2016 at 10:05 PM, Dikang Gu  wrote:

> RocksDB has the feature called "Compaction Filter" to allow application to
> modify/delete a key-value during the background compaction.
> https://github.com/facebook/rocksdb/blob/v4.1/include/rocksdb/options.h#L201-L226
>
> I'm wondering is there a plan/value to add this into C* as well? Or is
> there already a similar thing in C*?
>

I think it's far more reasonable to do this via an offline tool such as
"sstablefilter" :

https://issues.apache.org/jira/browse/CASSANDRA-1581

I used the internal Digg version of this to purge a bunch of obsolete keys
from a multi-tenancy CF (bad practice). It worked great.

=Rob


Re: Compaction Filter in Cassandra

2016-03-19 Thread Dikang Gu
FYI, this is the JIRA: https://issues.apache.org/jira/browse/CASSANDRA-11348.

We can move the discussion to the JIRA if you want.

On Thu, Mar 17, 2016 at 11:46 AM, Dikang Gu  wrote:

> Hi Eric,
>
> Thanks for sharing the information!
>
> We also mainly want to use it for trimming data, either by time or by the
> number of columns in a row. We haven't started the work yet; do you mind
> sharing some patches? We'd love to try it and test it in our environment.
>
> Thanks.
>
> On Tue, Mar 15, 2016 at 9:36 PM, Eric Stevens  wrote:
>
>> We have been working on filtering compaction for a month or so (though we
>> call it deleting compaction, its implementation is as a filtering
>> compaction strategy).  The feature is nearing completion, and we have used
>> it successfully in a limited production capacity against DSE 4.8 series.
>>
>> Our use case is that our records are written anywhere from a month up
>> to several years before they are scheduled for deletion.  Tombstones are
>> too expensive, as we have tables with hundreds of billions of rows.  In
>> addition, traditional TTLs don't work for us because our customers are
>> permitted to change their retention policy such that already-written
>> records should not be deleted if they increase their retention after the
>> record was written (or vice versa).
>>
>> We can clean up data more cheaply and more quickly with filtered
>> compaction than with tombstones and traditional compaction.  Our
>> implementation is a wrapper compaction strategy for another underlying
>> strategy, so that you can have the characteristics of whichever strategy
>> makes sense in terms of managing your SSTables, while interceding and
>> removing records during compaction (including cleaning up secondary
>> indexes) that otherwise would have survived into the new SSTable.
>>
>> We are hoping to contribute it back to the community, so if you'd be
>> interested in helping test it out, I'd love to hear from you.
>>
>> On Sat, Mar 12, 2016 at 5:12 AM Marcus Eriksson 
>> wrote:
>>
>>> We don't have anything like that, do you have a specific use case in
>>> mind?
>>>
>>> Could you create a JIRA ticket and we can discuss there?
>>>
>>> /Marcus
>>>
>>> On Sat, Mar 12, 2016 at 7:05 AM, Dikang Gu  wrote:
>>>
 Hello there,

 RocksDB has the feature called "Compaction Filter" to allow application
 to modify/delete a key-value during the background compaction.
 https://github.com/facebook/rocksdb/blob/v4.1/include/rocksdb/options.h#L201-L226

 I'm wondering is there a plan/value to add this into C* as well? Or is
 there already a similar thing in C*?

 Thanks

 --
 Dikang


>>>
>
>
> --
> Dikang
>
>


-- 
Dikang


Re: DTCS Question

2016-03-19 Thread Björn Hegerfors
That is probably close to the actual way it works, but not quite equal. My
mental model when making this went backwards in time, towards 0, not
forwards.

It's something like this (using the numbers from your first example): make
a bucket of the specified "timeUnit" size (1000), that contains the "now"
timestamp (4050), where the starting (and therefore also the ending)
timestamp of the bucket is 0 modulo the size of the bucket. That last point
is perhaps the trickiest point to follow. There is only one such place for
the bucket, [4000-5000) in this case. No other bucket that is aligned with
the 1000s can contain 4050.

Now, the next bucket (backwards) is computed based on this [4000-5000)
bucket. Most of the time it will simply be the same-sized bucket right
before it, i.e. [3000-4000), but if the start timestamp of our bucket
(4000), divided by its size (so 4), is 0 modulo "base" (2 in this case),
which it happens to be here, then we increase our bucket size "base" times,
and instead make the bucket of *that* size that ends right before our
current bucket. So the result will be [2000-4000).

This method of getting the next bucket is repeated until we reach timestamp
0. Using the above logic, we don't increase the size of the bucket this
time, because we have a start timestamp of 2000 which becomes 1 when
divided by the size (2000). So we end up with [0, 2000), and we're done.
The buckets were [4000-5000), [2000-4000) and [0-2000).
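
To make the backward walk above concrete, here is a small Java sketch of the
rules as described in this thread. The method and its names are mine, not the
actual DateTieredCompactionStrategy/Target code, so treat it as an
approximation of the description rather than of the implementation; run with
timeUnit=1000, base=2 and now=4050 it prints the three buckets listed above.

import java.util.ArrayList;
import java.util.List;

// Sketch of the backward bucketing rules described above; not the real Target class.
public class DtcsBucketSketch {

    // Returns [start, end) windows walking backwards from the bucket containing "now" to 0.
    static List<long[]> buckets(long timeUnit, long base, long now) {
        List<long[]> result = new ArrayList<>();
        long size = timeUnit;
        long start = (now / size) * size;          // the bucket aligned to its own size that contains "now"
        result.add(new long[]{start, start + size});
        while (start > 0) {
            if ((start / size) % base == 0) {
                size *= base;                      // coalesce: the next (older) bucket is "base" times bigger
            }
            start -= size;
            result.add(new long[]{start, start + size});
        }
        return result;
    }

    public static void main(String[] args) {
        // timeUnit=1000, base=2, now=4050 -> [4000, 5000), [2000, 4000), [0, 2000)
        for (long[] b : buckets(1000L, 2L, 4050L)) {
            System.out.println("[" + b[0] + ", " + b[1] + ")");
        }
    }
}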

What's more important than understanding these rules is of course getting
some kind of intuition for this. Here's what it boils down to: we want
there to be "base" equally sized buckets right next to each other before we
*coalesce* them. Every bucket is aligned with its own size (as an analogy,
compilers typically align 4-byte integers on addresses divisible by 4, same
concept). So, by extension, the bigger bucket they coalesce into must be
aligned with *its* size. Not just any "base" adjacent buckets will do, it
will be those that align with the next size.

The remaining question is when do they coalesce? There will always be at
least 1 and at most "base" buckets of every size. Say "base"=4, then there
can be 4 buckets of some size (by necessity next to each other and aligned
on 4 times their size). The moment a new bucket of the same size appears,
the 4 buckets become one and this "fifth" bucket will be alone with its
size (and the start of a new group of 4 such buckets). (The rule for making
the bucket where the "now" timestamp lives is where new buckets come
from.)

I wish this was easier to explain in simple terms. I personally find this
to have very nice properties, in that it gives every bucket a fair amount
of time to settle before it's time for the next compaction.

Interestingly, I proposed an alternative algorithm in this ticket, including a patch
implementing it. My gut tells me that the mental model that you've used
here is actually equivalent to that algorithm in the ticket. It's just
expressed in a very different way. Might be something for me to try to
prove when I'm really bored :)

Hope this helped! Any particular reason you're investigating this?

/
Bj0rn

On Thu, Mar 17, 2016 at 5:43 PM, Anubhav Kale 
wrote:

> Hello,
>
> I am trying to concretely understand how DTCS makes buckets and I am
> looking at the DateTieredCompactionStrategyTest.testGetBuckets method and
> played with some of the parameters to GetBuckets method call (Cassandra
> 2.1.12).
>
> I don't think I fully understand something there. Let me try to explain.
>
> Consider the second test there. I changed the pairs a bit for easier
> explanation and changed base (initial window size)=1000L and Min_Threshold=2
>
> pairs = Lists.newArrayList(
> Pair.create("a", 200L),
> Pair.create("b", 2000L),
> Pair.create("c", 3600L),
> Pair.create("d", 3899L),
> Pair.create("e", 3900L),
> Pair.create("f", 3950L),
> Pair.create("too new", 4125L)
> );
> buckets = getBuckets(pairs, 1000L, 2, 4050L, Long.MAX_VALUE);
>
> In this case, the buckets should look like [0-4000] [4000-]. Is this
> correct? The buckets that I get back are different ("a" lives in its own
> bucket and everyone else in another). What am I missing here?
>
> Another case,
>
> pairs = Lists.newArrayList(
> Pair.create("a", 200L),
> Pair.create("b", 2000L),
> Pair.create("c", 3600L),
> Pair.create("d", 3899L),
> Pair.create("e", 3900L),
> Pair.create("f", 3950L),
> Pair.create("too new", 4125L)
> );
> buckets = getBuckets(pairs, 50L, 4, 4050L, Long.MAX_VALUE);
>
> Here, the buckets should be [0-3200] [3200-4000] [4000-4050] [4050-]. Is
> this correct? Again, the buckets that come back are quite different.
>
> Note, that if I keep the base