Re: [DISCUSS] CEP-19: Trie memtable implementation

Dinesh Joshi Sat, 05 Feb 2022 12:59:12 -0800

This is excellent. Thanks for opening up this CEP. It would be great to get 
some stats on GC allocation rate / memory pressure, read & write latencies, 
etc. compared to the existing implementation.


Dinesh

> On Jan 18, 2022, at 2:13 AM, Branimir Lambov <[email protected]> wrote:
> 
> The memtable pluggability API (CEP-11) is per-table to enable memtable 
> selection that suits specific workflows. It also makes sense to permit 
> per-node configuration, both to adapt the configuration to heterogeneous 
> deployments and to test changes for improvements such as this one.
> Recognizing this, the patch comes with a modification to the API 
> <https://github.com/blambov/cassandra/commit/24b558ba2f71a2f040804e28993cc914b31298f5>
>  that defines memtable templates in cassandra.yaml (i.e. per node) and allows 
> the schema to select a template (in addition to being able to specify the 
> full memtable configuration). One could use this e.g. by adding:
> memtable_templates:
>     trie:
>         class: TrieMemtable
>         shards: 16
>     skiplist:
>         class: SkipListMemtable
> memtable:
>     template: skiplist
> (which defines two templates and specifies the default memtable 
> implementation to use) to cassandra.yaml and specifying WITH memtable = 
> {'template' : 'trie'} in the table schema.
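
To make the template lookup described above concrete, here is a minimal sketch of how a per-table setting might resolve against the per-node templates. It is not the code from the patch: `resolve_memtable`, `NODE_TEMPLATES` and `NODE_DEFAULT` are hypothetical names, and the behaviour for unknown templates is an assumption.

```python
from typing import Any, Dict, Optional

# Hypothetical stand-in for the parsed cassandra.yaml section quoted above.
NODE_TEMPLATES: Dict[str, Dict[str, Any]] = {
    "trie": {"class": "TrieMemtable", "shards": 16},
    "skiplist": {"class": "SkipListMemtable"},
}
NODE_DEFAULT: Dict[str, Any] = {"template": "skiplist"}


def resolve_memtable(
    table_params: Optional[Dict[str, Any]],
    templates: Dict[str, Dict[str, Any]],
    default: Dict[str, Any],
) -> Dict[str, Any]:
    """Return the effective memtable configuration for one table.

    Assumptions (not confirmed by the thread): a table with no memtable
    setting falls back to the node default, and a template reference is
    used on its own rather than merged with other keys.
    """
    params = table_params if table_params else default
    if "template" in params:
        name = params["template"]
        if name not in templates:
            # Assumed behaviour: fail loudly on an undefined template.
            raise ValueError(f"unknown memtable template: {name}")
        return dict(templates[name])
    # A full configuration given in the schema is used as-is.
    return dict(params)
```

With these inputs, a table declared WITH memtable = {'template' : 'trie'} would resolve to the node's trie template, while a table with no memtable setting would get the skiplist default.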
> 
> I intend to commit this modification with the memtable API 
> (CASSANDRA-17034/CEP-11).
> 
> Performance comparisons will be published soon.
> 
> Regards,
> Branimir
> 
> On Fri, Jan 14, 2022 at 4:15 PM Jeff Jirsa <[email protected]> wrote:
> Sounds like a great addition
> 
> Can you share some of the details around gc and latency improvements you’ve 
> observed with the list? 
> 
> Any specific reason the configuration is through schema vs yaml? Presumably 
> it’s so a user can test per table, but this changes every host in a cluster, 
> so the impact of a bug/regression is much higher. 
> 
> 
>> On Jan 10, 2022, at 1:30 AM, Branimir Lambov <[email protected]> wrote:
>> 
>> 
>> We would like to contribute our TrieMemtable to Cassandra. 
>> 
>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-19%3A+Trie+memtable+implementation
>> 
>> This is a new memtable solution intended to replace the legacy 
>> implementation, developed with the following objectives:
>> - lowering on-heap complexity and making it possible to store memtable 
>> indexing structures off-heap,
>> - leveraging byte order and a trie structure to lower the memory footprint 
>> and improve mutation and lookup performance.
>> 
>> The new memtable relies on CASSANDRA-6936 to translate to and from 
>> byte-ordered representations of types, and CASSANDRA-17034 / CEP-11 to plug 
>> into Cassandra. The memtable is built on multiple shards of custom in-memory 
>> single-writer multiple-reader tries, whose implementation uses a combination 
>> of state-of-the-art and novel features for greater efficiency.
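
As a rough illustration of the idea in the paragraph above, the sketch below stores byte-ordered keys in a plain trie and spreads keys over several such tries. It is a toy, not the CEP-19 implementation: all class names are hypothetical, the hash-based shard choice is an assumption made for brevity, and the single-writer multiple-reader concurrency and off-heap storage are not modelled.

```python
import zlib
from typing import Dict, Iterator, List, Optional, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[int, "TrieNode"] = {}
        self.value: Optional[str] = None
        self.has_value = False


class ByteTrie:
    """Toy trie keyed by bytes; iteration follows unsigned byte order."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def put(self, key: bytes, value: str) -> None:
        node = self.root
        for b in key:
            node = node.children.setdefault(b, TrieNode())
        node.value = value
        node.has_value = True

    def get(self, key: bytes) -> Optional[str]:
        node = self.root
        for b in key:
            nxt = node.children.get(b)
            if nxt is None:
                return None
            node = nxt
        return node.value if node.has_value else None

    def items(self) -> Iterator[Tuple[bytes, str]]:
        yield from self._walk(self.root, b"")

    def _walk(self, node: TrieNode, prefix: bytes) -> Iterator[Tuple[bytes, str]]:
        # A key is emitted before any longer key that extends it, and
        # children are visited in byte order, so output is sorted.
        if node.has_value:
            yield prefix, node.value  # type: ignore[misc]
        for b in sorted(node.children):
            yield from self._walk(node.children[b], prefix + bytes([b]))


class ShardedMemtable:
    """Hypothetical wrapper: several independent tries, one per shard."""

    def __init__(self, shards: int = 4) -> None:
        self.shards: List[ByteTrie] = [ByteTrie() for _ in range(shards)]

    def _shard_for(self, key: bytes) -> ByteTrie:
        # Assumption for this sketch only: pick a shard by CRC32 of the key.
        return self.shards[zlib.crc32(key) % len(self.shards)]

    def put(self, key: bytes, value: str) -> None:
        self._shard_for(key).put(key, value)

    def get(self, key: bytes) -> Optional[str]:
        return self._shard_for(key).get(key)
```

Because keys are compared byte by byte, walking a trie in child order yields entries already sorted, which is why a byte-ordered representation of each type is a prerequisite for this kind of structure.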
>> 
>> The CEP's JIRA ticket 
>> (https://issues.apache.org/jira/browse/CASSANDRA-17240) contains the 
>> initial version of the implementation. In its current form it achieves much 
>> better garbage collection latency, significantly bigger data sizes between 
>> flushes for the same memory allocation, as well as drastically increased 
>> write throughput, and we expect the memory and garbage collection 
>> improvements to go much further with upcoming improvements to the solution.
>> 
>> I am interested in hearing your thoughts on the proposal.
>> 
>> Regards,
>> Branimir
>>
