[DISCUSS] Replace Sigar with OSHI (CASSANDRA-16565)

2023-12-14 Thread Claude Warren, Jr via dev
Greetings,

I have submitted a pull request[1] that replaces the unsupported Sigar
library with the maintained OSHI library.

OSHI is an MIT licensed library that provides information about the
underlying OS much like Sigar did.

The change adds a dependency on oshi-core at the following coordinates:

<dependency>
  <groupId>com.github.oshi</groupId>
  <artifactId>oshi-core</artifactId>
  <version>6.4.6</version>
</dependency>

In addition to switching to a supported library, this change will reduce
the size of the package as the native Sigar libraries are removed.
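
For reviewers unfamiliar with OSHI, a minimal sketch of the kind of probe it
provides; the OSHI classes below are its public API, while the wrapper class
is illustrative only:

    import oshi.SystemInfo;
    import oshi.hardware.GlobalMemory;

    public class OshiProbe
    {
        public static void main(String[] args)
        {
            // Pure-Java entry point: no native libraries to bundle, unlike Sigar.
            SystemInfo si = new SystemInfo();
            GlobalMemory memory = si.getHardware().getMemory();
            System.out.println("total memory:     " + memory.getTotal());
            System.out.println("available memory: " + memory.getAvailable());
            System.out.println("os:               " + si.getOperatingSystem());
        }
    }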

Are there objections to making this switch and adding a new dependency?

[1] https://github.com/apache/cassandra/pull/2842/files
[2] https://issues.apache.org/jira/browse/CASSANDRA-16565


Re: [DISCUSS] CEP-39: Cost Based Optimizer

2023-12-14 Thread Benjamin Lerer
>
> Can you share the reasons why Apache Calcite is not suitable for this case
> and why it was rejected


My understanding is that Calcite was made for two main things: to help with
optimizing SQL-like languages and to let people query different kinds of
data sources together.

We could think about using it for our needs, but there are some big
problems:

   1. CQL is not SQL. There are significant differences between the 2
      languages.

   2. Cassandra has its own specificities that will influence the cost model
      and the way we deal with optimizations: partitions, replication factors,
      consistency levels, LSM tree storage, ...

   3. Every framework comes with its own limitations and additional cost.
From my view, there are too many big differences between what Calcite does
and what we need in Cassandra. If we used Calcite, it would also mean
relying a lot on another system that everyone would have to learn and
adjust to. The problems and extra work this would bring don't seem worth
the benefits we might get.


Le mer. 13 déc. 2023 à 18:06, Benjamin Lerer  a écrit :

> One thing that I did not mention is the fact that this CEP is only a high
> level proposal. There will be deeper discussions on the dev list around the
> different parts of this proposal when we reach those parts and have enough
> details to make those discussions more meaningful.
>
>
>> The maintenance and distribution of summary statistics in particular is
>> worthy of its own CEP, and it might be preferable to split it out.
>
>
> For maintaining node statistics the idea is to re-use the current
> Memtable/SSTable mechanism and rely on mergeable statistics. That will
> allow us to easily build node level statistics for a given table by merging
> all the statistics of its Memtables and SSTables. For the distribution of
> these node statistics we are still exploring different options. We can come
> back with a precise proposal once we have hammered out all the details.
> Is this a blocker for the CEP for you, or do you just want to make sure that
> this part is discussed in deeper detail before we implement it?
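
The "mergeable" requirement above is the key property: per-Memtable/SSTable
summaries must combine associatively so that table-level figures can be
rebuilt by folding, without rescanning data. A toy sketch (types and fields
invented for illustration; real cardinality estimates would use mergeable
sketches such as HyperLogLog):

    // Hypothetical per-Memtable/SSTable summary that merges associatively.
    final class ColumnStats
    {
        final long rowCount;
        final long minValue;
        final long maxValue;

        ColumnStats(long rowCount, long minValue, long maxValue)
        {
            this.rowCount = rowCount;
            this.minValue = minValue;
            this.maxValue = maxValue;
        }

        // Table-level stats = fold of all per-component stats; no rescan needed.
        ColumnStats merge(ColumnStats other)
        {
            return new ColumnStats(rowCount + other.rowCount,
                                   Math.min(minValue, other.minValue),
                                   Math.max(maxValue, other.maxValue));
        }
    }
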
>
>>
>> The proposal also seems to imply we are aiming for coordinators to all
>> make the same decision for a query, which I think is challenging, and it
>> would be worth fleshing out the design here a little (perhaps just in Jira).
>
>
> The goal is that the large majority of nodes preparing a query at a given
> point in time should make the same decision and that over time all nodes
> should converge toward the same decision. This part is dependent on the
> node statistics distribution, the cost model and the triggers for
> re-optimization (that will require some experimentation).
>
> There’s also not much discussion of the execution model: I think it would
>> make most sense for this to be independent of any cost and optimiser models
>> (though they might want to operate on them), so that EXPLAIN and hints can
>> work across optimisers (a suitable hint might essentially bypass the
>> optimiser, if the optimiser permits it, by providing a standard execution
>> model)
>>
>
> It is not clear to me what you mean by "a standard execution model"?
> Otherwise, we were not planning to have the execution model or the hints
> depending on the optimizer.
>
> I think it would be worth considering providing the execution plan to the
>> client as part of query preparation, as an opaque payload to supply to
>> coordinators on first contact, as this might simplify the problem of
>> ensuring queries behave the same without adopting a lot of complexity for
>> synchronising statistics (which will never provide strong guarantees). Of
>> course, re-preparing a query might lead to a new plan, though any
>> coordinators with the query in their cache should be able to retrieve it
>> cheaply. If the execution model is efficiently serialised this might have
>> the ancillary benefit of improving the occupancy of our prepared query
>> cache.
>>
>
> I am not sure that I understand your proposal. If 2 nodes build
> different execution plans, how do you solve that conflict?
>
> Le mer. 13 déc. 2023 à 09:55, Benedict  a écrit :
>
>> A CBO can only make worse decisions than the status quo for what I
>> presume are the majority of queries - i.e. those that touch only primary
>> indexes. In general, there are plenty of use cases that prefer determinism.
>> So I agree that there should at least be a CBO implementation that makes
>> the same decisions as the status quo, deterministically.
>>
>>
>> I do support the proposal, but would like to see some elements discussed
>> in more detail. The maintenance and distribution of summary statistics in
>> particular is worthy of its own CEP, and it might be preferable to split it
>> out. The proposal also seems to imply we are aiming for coordinators to all
>> make the same decision for a query, which I think is challenging, and it
>> would be worth fleshing out the design here a little (perhaps just in Jira).
>>
>

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-12-14 Thread Claude Warren
Is there still interest in this?  Can we get some points down on electrons so 
that we all understand the issues?

While it is fairly simple to redirect reads/writes to something other than
the local system for a single node, this will not solve the problem for tiered
storage.

Tiered storage will require that on each read/write the primary key be assessed
to determine whether the read/write should be redirected.  My reasoning for this
statement is that in a cluster with a replication factor greater than 1, the
node will store data for the keys that would be allocated to it in a cluster
with a replication factor of 1, as well as some keys from nodes earlier in the
ring.

Even if we can get the primary keys for all the data we want to write to "cold
storage" to map to a single node, a replication factor > 1 means that data will
also be placed in "normal storage" on subsequent nodes.

To overcome this, we have to explore ways to route data to different storage
based on the keys, and that different storage may have to be available on _all_
the nodes.
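
To make that concrete, a hypothetical sketch of the kind of key-aware routing
a ChannelProxy would need; every name below is invented for illustration and
nothing like it exists in the codebase today:

    import java.nio.file.Path;

    // Hypothetical: decide, per primary key, which storage tier a read/write
    // should go to.
    interface TieredStorageRouter
    {
        enum Tier { LOCAL, COLD }

        Tier tierFor(byte[] partitionKey); // assessed on every read/write

        // Because RF > 1 spreads any key across several nodes, this routing
        // must be available on every node, not just a designated "cold" one.
        default Path resolve(Path localPath, byte[] partitionKey)
        {
            return tierFor(partitionKey) == Tier.COLD
                   ? Path.of("/mnt/cold").resolve(localPath.getFileName())
                   : localPath;
        }
    }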

Have any of the partial solutions mentioned in this email chain (or others) 
solved this problem?

Claude


Re: [DISCUSS] CEP-39: Cost Based Optimizer

2023-12-14 Thread Benjamin Lerer
>
> I mean that an important part of this work - not specified in the CEP
> (AFAICT) - should probably be to define some standard execution model, that
> we can manipulate and serialise, for use across (and without) optimisers.


I am confused because for me an execution model defines how operations are
executed within the database in a conceptual way, which is not something
that this CEP intends to change. Do you mean the physical/execution plan?
Today this plan is somehow represented for reads by the SelectStatement and
its components (Selections, StatementRestrictions, ...); it is then
converted at execution time after parameter binding into a ReadCommand
which is sent to the replicas.
We plan to refactor SelectStatement and its components but the ReadCommands
change should be relatively small. What you are proposing is not part of
the scope of this CEP.

Le jeu. 14 déc. 2023 à 10:24, Benjamin Lerer  a écrit :

> Can you share the reasons why Apache Calcite is not suitable for this case
>> and why it was rejected
>
>
> My understanding is that Calcite was made for two main things: to help
> with optimizing SQL-like languages and to let people query different kinds
> of data sources together.
>
> We could think about using it for our needs, but there are some big
> problems:
>
>1.
>
>CQL is not SQL. There are significant differences between the 2
>languages
>2.
>
>Cassandra has its own specificities that will influence the cost model
>and the way we deal with optimizations: partitions, replication factors,
>consistency levels, LSM tree storage, ...
>3.
>
>Every framework comes with its own limitations and additional cost
>
> From my view, there are too many big differences between what Calcite does
> and what we need in Cassandra. If we used Calcite, it would also mean
> relying a lot on another system that everyone would have to learn and
> adjust to. The problems and extra work this would bring don't seem worth
> the benefits we might get.
>
>
> Le mer. 13 déc. 2023 à 18:06, Benjamin Lerer  a écrit :
>
>> One thing that I did not mention is the fact that this CEP is only a high
>> level proposal. There will be deeper discussions on the dev list around the
>> different parts of this proposal when we reach those parts and have enough
>> details to make those discussions more meaningful.
>>
>>
>>> The maintenance and distribution of summary statistics in particular is
>>> worthy of its own CEP, and it might be preferable to split it out.
>>
>>
>> For maintaining node statistics the idea is to re-use the current
>> Memtable/SSTable mechanism and rely on mergeable statistics. That will
>> allow us to easily build node level statistics for a given table by merging
>> all the statistics of its Memtables and SSTables. For the distribution of
>> these node statistics we are still exploring different options. We can come
>> back with a precise proposal once we have hammered out all the details.
>> Is this a blocker for the CEP for you, or do you just want to make sure
>> that this part is discussed in deeper detail before we implement it?
>>
>>>
>>> The proposal also seems to imply we are aiming for coordinators to all
>>> make the same decision for a query, which I think is challenging, and it
>>> would be worth fleshing out the design here a little (perhaps just in Jira).
>>
>>
>> The goal is that the large majority of nodes preparing a query at a given
>> point in time should make the same decision and that over time all nodes
>> should converge toward the same decision. This part is dependent on the
>> node statistics distribution, the cost model and the triggers for
>> re-optimization (that will require some experimentation).
>>
>> There’s also not much discussion of the execution model: I think it would
>>> make most sense for this to be independent of any cost and optimiser models
>>> (though they might want to operate on them), so that EXPLAIN and hints can
>>> work across optimisers (a suitable hint might essentially bypass the
>>> optimiser, if the optimiser permits it, by providing a standard execution
>>> model)
>>>
>>
>> It is not clear to me what you mean by "a standard execution model"?
>> Otherwise, we were not planning to have the execution model or the hints
>> depending on the optimizer.
>>
>> I think it would be worth considering providing the execution plan to the
>>> client as part of query preparation, as an opaque payload to supply to
>>> coordinators on first contact, as this might simplify the problem of
>>> ensuring queries behave the same without adopting a lot of complexity for
>>> synchronising statistics (which will never provide strong guarantees). Of
>>> course, re-preparing a query might lead to a new plan, though any
>>> coordinators with the query in their cache should be able to retrieve it
>>> cheaply. If the execution model is efficiently serialised this might have
>>> the ancillary benefit of improving the occupancy of our prepared query
>>> cache.

Re: [DISCUSS] CEP-39: Cost Based Optimizer

2023-12-14 Thread Benedict
There surely needs to be a more succinct and abstract representation in order
to perform transformations on the query plan? You don’t intend to manipulate
the object graph directly as you apply any transformations when performing
simplification or cost based analysis? This would also (I expect) be the form
used to support EXPLAIN functionality, and probably also HINTs etc. This would
ideally not be coupled to the CBO itself, and would ideally be succinctly
serialised.

I would very much expect the query plan to be represented abstractly as part
of this work, and for there to be a mechanism that translates this abstract
representation into the object graph that executes it.

If I’m incorrect, could you please elaborate more specifically how you intend
to go about this?

On 14 Dec 2023, at 10:33, Benjamin Lerer  wrote:

>> I mean that an important part of this work - not specified in the CEP
>> (AFAICT) - should probably be to define some standard execution model, that
>> we can manipulate and serialise, for use across (and without) optimisers.
>
> I am confused because for me an execution model defines how operations are
> executed within the database in a conceptual way, which is not something
> that this CEP intends to change. Do you mean the physical/execution plan?
> Today this plan is somehow represented for reads by the SelectStatement and
> its components (Selections, StatementRestrictions, ...); it is then
> converted at execution time after parameter binding into a ReadCommand
> which is sent to the replicas.
> We plan to refactor SelectStatement and its components but the ReadCommands
> change should be relatively small. What you are proposing is not part of
> the scope of this CEP.

> Le jeu. 14 déc. 2023 à 10:24, Benjamin Lerer  a écrit :
>
>>> Can you share the reasons why Apache Calcite is not suitable for this
>>> case and why it was rejected
>>
>> My understanding is that Calcite was made for two main things: to help
>> with optimizing SQL-like languages and to let people query different kinds
>> of data sources together.
>>
>> We could think about using it for our needs, but there are some big
>> problems:
>>
>>    1. CQL is not SQL. There are significant differences between the 2
>>       languages.
>>    2. Cassandra has its own specificities that will influence the cost
>>       model and the way we deal with optimizations: partitions, replication
>>       factors, consistency levels, LSM tree storage, ...
>>    3. Every framework comes with its own limitations and additional cost.
>>
>> From my view, there are too many big differences between what Calcite does
>> and what we need in Cassandra. If we used Calcite, it would also mean
>> relying a lot on another system that everyone would have to learn and
>> adjust to. The problems and extra work this would bring don't seem worth
>> the benefits we might get.
>>
>> Le mer. 13 déc. 2023 à 18:06, Benjamin Lerer  a écrit :
>>
>>> One thing that I did not mention is the fact that this CEP is only a high
>>> level proposal. There will be deeper discussions on the dev list around
>>> the different parts of this proposal when we reach those parts and have
>>> enough details to make those discussions more meaningful.
>>>
>>>> The maintenance and distribution of summary statistics in particular is
>>>> worthy of its own CEP, and it might be preferable to split it out.
>>>
>>> For maintaining node statistics the idea is to re-use the current
>>> Memtable/SSTable mechanism and rely on mergeable statistics. That will
>>> allow us to easily build node level statistics for a given table by
>>> merging all the statistics of its Memtables and SSTables. For the
>>> distribution of these node statistics we are still exploring different
>>> options. We can come back with a precise proposal once we have hammered
>>> out all the details. Is this a blocker for the CEP for you, or do you
>>> just want to make sure that this part is discussed in deeper detail
>>> before we implement it?
>>>
>>>> The proposal also seems to imply we are aiming for coordinators to all
>>>> make the same decision for a query, which I think is challenging, and it
>>>> would be worth fleshing out the design here a little (perhaps just in
>>>> Jira).
>>>
>>> The goal is that the large majority of nodes preparing a query at a given
>>> point in time should make the same decision and that over time all nodes
>>> should converge toward the same decision. This part is dependent on the
>>> node statistics distribution, the cost model and the triggers for
>>> re-optimization (that will require some experimentation).
>>>
>>>> There’s also not much discussion of the execution model: I think it
>>>> would make most sense for this to be independent of any cost and
>>>> optimiser models (though they might want to operate on them), so that
>>>> EXPLAIN and hints can work across optimisers (a suitable hint might
>>>> essentially bypass the optimiser, if the optimiser permits it, by
>>>> providing a standard execution model)
>>>
>>> It is not clear to me what you mean by "a standard execution model"?
>>> Otherwise, we were not planning to have the execution model or the hints
>>> depending on the optimizer.
>>>
>>>> I think it would be worth considering pr

Re: [DISCUSS] Replace Sigar with OSHI (CASSANDRA-16565)

2023-12-14 Thread Miklosovic, Stefan via dev
For completeness, there is this thread (1) where we already decided that Sigar
is OK to be removed completely.

I think OSHI is a way better lib to have; I am +1 on this proposal.

Currently the deal seems to be that this will go just to trunk.

(1) https://lists.apache.org/thread/6gzyh1zhxnkz50lld7hlgq172xc0pg3t


From: Claude Warren, Jr via dev 
Sent: Thursday, December 14, 2023 0:17
To: dev
Cc: Claude Warren, Jr
Subject: [DISCUSS] Replace Sigar with OSHI (CASSANDRA-16565)


Greetings,

I have submitted a pull request[1] that replaces the unsupported Sigar library 
with the maintained OSHI library.

OSHI is an MIT licensed library that provides information about the underlying 
OS much like Sigar did.

The change adds a dependency on oshi-core at the following coordinates:

<dependency>
  <groupId>com.github.oshi</groupId>
  <artifactId>oshi-core</artifactId>
  <version>6.4.6</version>
</dependency>

In addition to switching to a supported library, this change will reduce the 
size of the package as the native Sigar libraries are removed.

Are there objections to making this switch and adding a new dependency?

[1] 
https://github.com/apache/cassandra/pull/2842/files
[2] 
https://issues.apache.org/jira/browse/CASSANDRA-16565


Re: [DISCUSS] CEP-39: Cost Based Optimizer

2023-12-14 Thread Benjamin Lerer
The binding of the parser output to the schema (what is today the
Raw.prepare call) will create the logical plan, expressed as a tree of
relational operators. Simplification and normalization will happen on that
tree to produce a new equivalent logical plan. That logical plan will be
used as input to the optimizer. The output will be a physical plan
producing the output specified by the logical plan: a tree of physical
operators specifying how the operations should be performed.

That physical plan will be stored as part of the statements
(SelectStatement, ModificationStatement, ...) in the prepared statement
cache. Upon execution, variables will be bound and the
RangeCommands/Mutations will be created based on the physical plan.

The string representation of a physical plan will effectively represent the
output of an EXPLAIN statement, but outside of that the physical plan will
stay encapsulated within the statement classes.
Hints will be parameters provided to the optimizer to enforce some specific
choices, like always using an Index Scan instead of a Table Scan, ignoring
the cost comparison.

So yes, this physical plan is the structure that you have in mind but the
idea of sharing it is not part of the CEP. I did not document it because it
will simply be a tree of physical operators used internally.
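
As a rough illustration of the shape being described (operator names invented;
this is not the CEP's actual API), a physical plan as a tree of operators
whose string form doubles as the EXPLAIN output:

    // Hypothetical physical-operator tree; describe() doubles as EXPLAIN output.
    interface PhysicalOperator
    {
        String describe(int indent);
    }

    record IndexScan(String table, String index) implements PhysicalOperator
    {
        public String describe(int indent)
        {
            return " ".repeat(indent) + "IndexScan(" + table + ", " + index + ")";
        }
    }

    record Limit(int n, PhysicalOperator child) implements PhysicalOperator
    {
        public String describe(int indent)
        {
            return " ".repeat(indent) + "Limit(" + n + ")\n" + child.describe(indent + 2);
        }
    }

    // EXPLAIN for "SELECT ... LIMIT 10" over an index might then print:
    //   Limit(10)
    //     IndexScan(users, users_age_idx)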

My proposal is that the execution plan of the coordinator that prepares a
> query gets serialised to the client, which then provides the execution plan
> to all future coordinators, and coordinators provide it to replicas as
> necessary.
>
>
> This means it is not possible for any conflict to arise for a single
> client. It would guarantee consistency of execution for any single client
> (and avoid any drift over the client’s sessions), without necessarily
> guaranteeing consistency for all clients.
>
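
To fix ideas, one hypothetical reading of that proposal; these record shapes
are invented for illustration and are not the native protocol:

    import java.nio.ByteBuffer;
    import java.util.List;

    // PREPARE returns an opaque serialised plan that the client hands back on
    // EXECUTE, so every coordinator it contacts runs the same plan.
    record PreparedResult(byte[] queryId, byte[] opaquePlan) {}
    record ExecuteRequest(byte[] queryId, byte[] opaquePlan, List<ByteBuffer> values) {}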

 It seems that there is a difference between the goal of your proposal and
the one of the CEP. The goal of the CEP is first to ensure optimal
performance. It is ok to change the execution plan for one that delivers
better performance. What we want to minimize is having a node performing
queries in an inefficient way for a long period of time.

The client side proposal targets consistency for a given query on a given
driver instance. In practice, it would be possible to have 2 similar
queries with 2 different execution plans on the same driver, making things
really confusing. Identifying the source of an inefficient query will also
be pretty hard.

Interestingly, having 2 nodes with 2 different execution plans might not be
a serious problem. It simply means that based on cardinality at t1, the
optimizer on node 1 chose plan 1 while the one on node 2 chose plan 2 at
t2. In practice, if the cost estimates properly reflect the actual cost,
those 2 plans should have pretty similar efficiency. The problem is more
about the fact that you would ideally want a uniform behavior around your
cluster.
Changes of execution plans should only occur at certain points. So the main
problematic scenario is when the data distribution is around one of those
points, which is also the point where the change should have the least
impact.


Le jeu. 14 déc. 2023 à 11:38, Benedict  a écrit :

> There surely needs to be a more succinct and abstract representation in
> order to perform transformations on the query plan? You don’t intend to
> manipulate the object graph directly as you apply any transformations when
> performing simplification or cost based analysis? This would also (I
> expect) be the form used to support EXPLAIN functionality, and probably
> also HINTs etc. This would ideally *not* be coupled to the CBO itself,
> and would ideally be succinctly serialised.
>
> I would very much expect the query plan to be represented abstractly as
> part of this work, and for there to be a mechanism that translates this
> abstract representation into the object graph that executes it.
>
> If I’m incorrect, could you please elaborate more specifically how you
> intend to go about this?
>
> On 14 Dec 2023, at 10:33, Benjamin Lerer  wrote:
>
> 
>
>> I mean that an important part of this work - not specified in the CEP
>> (AFAICT) - should probably be to define some standard execution model, that
>> we can manipulate and serialise, for use across (and without) optimisers.
>
>
> I am confused because for me an execution model defines how operations are
> executed within the database in a conceptual way, which is not something
> that this CEP intends to change. Do you mean the physical/execution plan?
> Today this plan is somehow represented for reads by the SelectStatement
> and its components (Selections, StatementRestrictions, ...); it is then
> converted at execution time after parameter binding into a ReadCommand
> which is sent to the replicas.
> We plan to refactor SelectStatement and its components but the
> ReadCommands change should be relatively small. What you are proposing is
> not part of the scope of this CEP.
>
> Le jeu. 14 déc. 20

Re: [DISCUSS] CEP-39: Cost Based Optimizer

2023-12-14 Thread Benedict
> So yes, this physical plan is the structure that you have in mind but the idea of sharing it is not part of the CEP.

I think it should be. This should form a major part of the API on which any CBO is built.

> It seems that there is a difference between the goal of your proposal and the one of the CEP. The goal of the CEP is first to ensure optimal performance. It is ok to change the execution plan for one that delivers better performance. What we want to minimize is having a node performing queries in an inefficient way for a long period of time.

You have made a goal of the CEP synchronising summary statistics across the whole cluster in order to achieve some degree of uniformity of query plan. So this is explicitly a goal of the CEP, and synchronising summary statistics is a hard problem and won’t provide strong guarantees.

> The client side proposal targets consistency for a given query on a given driver instance. In practice, it would be possible to have 2 similar queries with 2 different execution plans on the same driver

This would only be possible if the driver permitted it. A driver could (and should) enforce that it only permits one query plan per query.

The opposite is true for your proposal: some queries may begin degrading
because they touch specific replicas that optimise the query differently, and
this will be hard to debug.

On 14 Dec 2023, at 15:30, Benjamin Lerer  wrote:

The binding of the parser output to the schema (what is today the Raw.prepare
call) will create the logical plan, expressed as a tree of relational
operators. Simplification and normalization will happen on that tree to
produce a new equivalent logical plan. That logical plan will be used as input
to the optimizer. The output will be a physical plan producing the output
specified by the logical plan: a tree of physical operators specifying how the
operations should be performed.

That physical plan will be stored as part of the statements (SelectStatement,
ModificationStatement, ...) in the prepared statement cache. Upon execution,
variables will be bound and the RangeCommands/Mutations will be created based
on the physical plan.

The string representation of a physical plan will effectively represent the
output of an EXPLAIN statement, but outside of that the physical plan will
stay encapsulated within the statement classes.
Hints will be parameters provided to the optimizer to enforce some specific
choices, like always using an Index Scan instead of a Table Scan, ignoring the
cost comparison.

So yes, this physical plan is the structure that you have in mind but the idea
of sharing it is not part of the CEP. I did not document it because it will
simply be a tree of physical operators used internally.

> My proposal is that the execution plan of the coordinator that prepares a
> query gets serialised to the client, which then provides the execution plan
> to all future coordinators, and coordinators provide it to replicas as
> necessary.
>
> This means it is not possible for any conflict to arise for a single client.
> It would guarantee consistency of execution for any single client (and avoid
> any drift over the client’s sessions), without necessarily guaranteeing
> consistency for all clients.

It seems that there is a difference between the goal of your proposal and the
one of the CEP. The goal of the CEP is first to ensure optimal performance. It
is ok to change the execution plan for one that delivers better performance.
What we want to minimize is having a node performing queries in an inefficient
way for a long period of time.

The client side proposal targets consistency for a given query on a given
driver instance. In practice, it would be possible to have 2 similar queries
with 2 different execution plans on the same driver, making things really
confusing. Identifying the source of an inefficient query will also be pretty
hard.

Interestingly, having 2 nodes with 2 different execution plans might not be a
serious problem. It simply means that based on cardinality at t1, the
optimizer on node 1 chose plan 1 while the one on node 2 chose plan 2 at t2.
In practice, if the cost estimates properly reflect the actual cost, those 2
plans should have pretty similar efficiency. The problem is more about the
fact that you would ideally want a uniform behavior around your cluster.
Changes of execution plans should only occur at certain points. So the main
problematic scenario is when the data distribution is around one of those
points, which is also the point where the change should have the least impact.

Le jeu. 14 déc. 2023 à 11:38, Benedict  a écrit :

There surely needs to be a more succinct and abstract representation in order
to perform transformations on the query plan? You don’t intend to manipulate
the object graph directly as you apply any transformations when performing
simplification or cost based analysis? This would also (I expect) be the form
used to support EXPLAIN functionality, and probably also HINTs

Re: [DISCUSS] CEP-39: Cost Based Optimizer

2023-12-14 Thread Benedict
> I think it should be. This should form a major part of the API on which any
> CBO is built.

To expand on this a bit: one of the stated goals of the CEP is to support
multiple CBOs, and this is a required component of any CBO. If this doesn’t
form part of the shared machinery, we aren’t really enabling new CBOs, we’re
just refactoring the codebase to implement the CBO intended by this CEP.

This would also mean that all of the additional machinery, like EXPLAIN, HINT
etc would need to be implemented for each CBO independently. This is a high
hurdle for any new CBO, a lot of wasted work and would lead to a less
consistent experience for the user, as each CBO would do this differently.
These facilities make sense to build as a shared feature on top of a single
execution model - and I think this is true regardless of your stance on my
suggestion for managing query execution.

If you disagree, it would help to understand how you expect these facilities
to be built in the ecosystem of CBOs envisaged by the CEP, and how we would
maintain a consistency of user experience.

On 14 Dec 2023, at 15:37, Benedict  wrote:

> So yes, this physical plan is the structure that you have in mind but the
> idea of sharing it is not part of the CEP.

I think it should be. This should form a major part of the API on which any CBO is built.

> It seems that there is a difference between the goal of your proposal and the one of the CEP. The goal of the CEP is first to ensure optimal performance. It is ok to change the execution plan for one that delivers better performance. What we want to minimize is having a node performing queries in an inefficient way for a long period of time.

You have made a goal of the CEP synchronising summary statistics across the whole cluster in order to achieve some degree of uniformity of query plan. So this is explicitly a goal of the CEP, and synchronising summary statistics is a hard problem and won’t provide strong guarantees.

> The client side proposal targets consistency for a given query on a given driver instance. In practice, it would be possible to have 2 similar queries with 2 different execution plans on the same driver

This would only be possible if the driver permitted it. A driver could (and should) enforce that it only permits one query plan per query.

The opposite is true for your proposal: some queries may begin degrading
because they touch specific replicas that optimise the query differently, and
this will be hard to debug.

On 14 Dec 2023, at 15:30, Benjamin Lerer  wrote:

The binding of the parser output to the schema (what is today the Raw.prepare
call) will create the logical plan, expressed as a tree of relational
operators. Simplification and normalization will happen on that tree to
produce a new equivalent logical plan. That logical plan will be used as input
to the optimizer. The output will be a physical plan producing the output
specified by the logical plan: a tree of physical operators specifying how the
operations should be performed.

That physical plan will be stored as part of the statements (SelectStatement,
ModificationStatement, ...) in the prepared statement cache. Upon execution,
variables will be bound and the RangeCommands/Mutations will be created based
on the physical plan.

The string representation of a physical plan will effectively represent the
output of an EXPLAIN statement, but outside of that the physical plan will
stay encapsulated within the statement classes.
Hints will be parameters provided to the optimizer to enforce some specific
choices, like always using an Index Scan instead of a Table Scan, ignoring the
cost comparison.

So yes, this physical plan is the structure that you have in mind but the idea
of sharing it is not part of the CEP. I did not document it because it will
simply be a tree of physical operators used internally.

> My proposal is that the execution plan of the coordinator that prepares a
> query gets serialised to the client, which then provides the execution plan
> to all future coordinators, and coordinators provide it to replicas as
> necessary.
>
> This means it is not possible for any conflict to arise for a single client.
> It would guarantee consistency of execution for any single client (and avoid
> any drift over the client’s sessions), without necessarily guaranteeing
> consistency for all clients.

It seems that there is a difference between the goal of your proposal and the
one of the CEP. The goal of the CEP is first to ensure optimal performance. It
is ok to change the execution plan for one that delivers better performance.
What we want to minimize is having a node performing queries in an inefficient
way for a long period of time.

The client side proposal targets consistency for a given query on a given
driver instance. In practice, it would be possible to have 2 similar queries
with 2 different execution plans on the same driver, making things really
confusing. Identifying the source of an inefficient query will also be pretty
hard.

Interestin

Re: [DISCUSS] CEP-39: Cost Based Optimizer

2023-12-14 Thread Chris Lohfink
I don't wanna be a blocker for this CEP or anything but did want to put my
2 cents in. This CEP is horrifying to me.

I have seen thousands of clusters across multiple companies and helped them
get working successfully. A vast majority of that involved blocking the use
of MVs, GROUP BY, secondary indexes, and even just simple _range queries_.
The "unncessary restrictions of cql" are not only necessary IMHO, more
restrictions are necessary to be successful at scale. The idea of just
opening up CQL to general purpose relational queries and lines like "supporting
queries with joins in an efficient way" ... I would really like us to make
secondary indexes be a viable option before we start opening up floodgates
on stuff like this.

Chris

On Thu, Dec 14, 2023 at 9:37 AM Benedict  wrote:

> > So yes, this physical plan is the structure that you have in mind but
> the idea of sharing it is not part of the CEP.
>
>
> I think it should be. This should form a major part of the API on which
> any CBO is built.
>
>
> > It seems that there is a difference between the goal of your proposal
> and the one of the CEP. The goal of the CEP is first to ensure optimal
> performance. It is ok to change the execution plan for one that delivers
> better performance. What we want to minimize is having a node performing
> queries in an inefficient way for a long period of time.
>
>
> You have made a goal of the CEP synchronising summary statistics across
> the whole cluster in order to achieve some degree of uniformity of query
> plan. So this is explicitly a goal of the CEP, and synchronising summary
> statistics is a hard problem and won’t provide strong guarantees.
>
>
> > The client side proposal targets consistency for a given query on a
> given driver instance. In practice, it would be possible to have 2 similar
> queries with 2 different execution plans on the same driver
>
>
> This would only be possible if the driver permitted it. A driver could
> (and should) enforce that it only permits one query plan per query.
>
>
> The opposite is true for your proposal: some queries may begin degrading
> because they touch specific replicas that optimise the query differently,
> and this will be hard to debug.
>
>
> On 14 Dec 2023, at 15:30, Benjamin Lerer  wrote:
>
> 
> The binding of the parser output to the schema (what is today the
> Raw.prepare call) will create the logical plan, expressed as a tree of
> relational operators. Simplification and normalization will happen on that
> tree to produce a new equivalent logical plan. That logical plan will be
> used as input to the optimizer. The output will be a physical plan
> producing the output specified by the logical plan. A tree of physical
> operators specifying how the operations should be performed.
>
> That physical plan will be stored as part of the statements
> (SelectStatement, ModificationStatement, ...) in the prepared statement
> cache. Upon execution, variables will be bound and the
> RangeCommands/Mutations will be created based on the physical plan.
>
> The string representation of a physical plan will effectively represent
> the output of an EXPLAIN statement but outside of that the physical plan
> will stay encapsulated within the statement classes.
> Hints will be parameters provided to the optimizer to enforce some
> specific choices. Like always using an Index Scan instead of a Table Scan,
> ignoring the cost comparison.
>
> So yes, this physical plan is the structure that you have in mind but the
> idea of sharing it is not part of the CEP. I did not document it because it
> will simply be a tree of physical operators used internally.
>
> My proposal is that the execution plan of the coordinator that prepares a
>> query gets serialised to the client, which then provides the execution plan
>> to all future coordinators, and coordinators provide it to replicas as
>> necessary.
>>
>>
>> This means it is not possible for any conflict to arise for a single
>> client. It would guarantee consistency of execution for any single client
>> (and avoid any drift over the client’s sessions), without necessarily
>> guaranteeing consistency for all clients.
>>
>
>  It seems that there is a difference between the goal of your proposal and
> the one of the CEP. The goal of the CEP is first to ensure optimal
> performance. It is ok to change the execution plan for one that delivers
> better performance. What we want to minimize is having a node performing
> queries in an inefficient way for a long period of time.
>
> The client side proposal targets consistency for a given query on a given
> driver instance. In practice, it would be possible to have 2 similar
> queries with 2 different execution plans on the same driver making things
> really confusing. Identifying the source of an inefficient query will also
> be pretty hard.
>
> Interestingly, having 2 nodes with 2 different execution plans might not
> be a serious problem. It simply means that based on cardinality at t1, t

Re: [DISCUSS] CEP-39: Cost Based Optimizer

2023-12-14 Thread Benedict
Fwiw Chris, I agree with your concerns, but I think the introduction of a CBO
- done right - is in principle a good thing in its own right. It’s independent
of the issues you mention, even if it might enable features that exacerbate
them.

It should also help enable secondary indexes to work better, which is I think
their main justification. Otherwise there isn’t really a good reason for it
today. Though I personally anticipate ongoing issues around 2i that SAI is
perhaps over exuberantly sold as solving, and that a CBO will not fix. But
we’ll see how that evolves.

On 14 Dec 2023, at 15:49, Chris Lohfink  wrote:

I don't wanna be a blocker for this CEP or anything but did want to put my 2
cents in. This CEP is horrifying to me.

I have seen thousands of clusters across multiple companies and helped them
get working successfully. A vast majority of that involved blocking the use of
MVs, GROUP BY, secondary indexes, and even just simple _range queries_. The
"unnecessary restrictions of cql" are not only necessary IMHO; more
restrictions are necessary to be successful at scale. The idea of just opening
up CQL to general purpose relational queries and lines like "supporting
queries with joins in an efficient way" ... I would really like us to make
secondary indexes be a viable option before we start opening up floodgates on
stuff like this.

Chris

On Thu, Dec 14, 2023 at 9:37 AM Benedict  wrote:

> So yes, this physical plan is the structure that you have in mind but the
> idea of sharing it is not part of the CEP.

I think it should be. This should form a major part of the API on which any CBO is built.

> It seems that there is a difference between the goal of your proposal and the one of the CEP. The goal of the CEP is first to ensure optimal performance. It is ok to change the execution plan for one that delivers better performance. What we want to minimize is having a node performing queries in an inefficient way for a long period of time.

You have made a goal of the CEP synchronising summary statistics across the whole cluster in order to achieve some degree of uniformity of query plan. So this is explicitly a goal of the CEP, and synchronising summary statistics is a hard problem and won’t provide strong guarantees.

> The client side proposal targets consistency for a given query on a given driver instance. In practice, it would be possible to have 2 similar queries with 2 different execution plans on the same driver

This would only be possible if the driver permitted it. A driver could (and should) enforce that it only permits one query plan per query.

The opposite is true for your proposal: some queries may begin degrading
because they touch specific replicas that optimise the query differently, and
this will be hard to debug.

On 14 Dec 2023, at 15:30, Benjamin Lerer  wrote:

The binding of the parser output to the schema (what is today the Raw.prepare
call) will create the logical plan, expressed as a tree of relational
operators. Simplification and normalization will happen on that tree to
produce a new equivalent logical plan. That logical plan will be used as input
to the optimizer. The output will be a physical plan producing the output
specified by the logical plan: a tree of physical operators specifying how the
operations should be performed.

That physical plan will be stored as part of the statements (SelectStatement,
ModificationStatement, ...) in the prepared statement cache. Upon execution,
variables will be bound and the RangeCommands/Mutations will be created based
on the physical plan.

The string representation of a physical plan will effectively represent the
output of an EXPLAIN statement, but outside of that the physical plan will
stay encapsulated within the statement classes.
Hints will be parameters provided to the optimizer to enforce some specific
choices, like always using an Index Scan instead of a Table Scan, ignoring the
cost comparison.

So yes, this physical plan is the structure that you have in mind but the idea
of sharing it is not part of the CEP. I did not document it because it will
simply be a tree of physical operators used internally.

> My proposal is that the execution plan of the coordinator that prepares a
> query gets serialised to the client, which then provides the execution plan
> to all future coordinators, and coordinators provide it to replicas as
> necessary.
>
> This means it is not possible for any conflict to arise for a single client.
> It would guarantee consistency of execution for any single client (and avoid
> any drift over the client’s sessions), without necessarily guaranteeing
> consistency for all clients.

It seems that there is a difference between the goal of your proposal and the
one of the CEP. The goal of the CEP is first to ensure optimal performance. It
is ok to change the execution plan for one that delivers better performance.
What we want to minimize is having a node performing queries in an inefficient
way for a l

Re: [DISCUSS] CEP-39: Cost Based Optimizer

2023-12-14 Thread Benjamin Lerer
>
>   So yes, this physical plan is the structure that you have in mind but
> the idea of sharing it is not part of the CEP.


Sorry, Benedict, what I meant by sharing was sharing across the nodes. It
is an integral part of the optimizer API that the CEP talks about as it
represents its output. I did not realize that this part was not clear.

Le jeu. 14 déc. 2023 à 17:15, Benedict  a écrit :

> Fwiw Chris, I agree with your concerns, but I think the introduction of a
> CBO - done right - is in principle a good thing in its own right. It’s
> independent of the issues you mention, even if it might enable features
> that exacerbate them.
>
> It should also help enable secondary indexes work better, which is I think
> their main justification. Otherwise there isn’t really a good reason for it
> today. Though I personally anticipate ongoing issues around 2i that SAI is
> perhaps over exuberantly sold as solving, and that a CBO will not fix. But
> we’ll see how that evolves.
>
>
>
> On 14 Dec 2023, at 15:49, Chris Lohfink  wrote:
>
> 
> I don't wanna be a blocker for this CEP or anything but did want to put my
> 2 cents in. This CEP is horrifying to me.
>
> I have seen thousands of clusters across multiple companies and helped
> them get working successfully. A vast majority of that involved blocking
> the use of MVs, GROUP BY, secondary indexes, and even just simple _range
> queries_. The "unnecessary restrictions of cql" are not only necessary IMHO;
> more restrictions are necessary to be successful at scale. The idea of just
> opening up CQL to general purpose relational queries and lines like 
> "supporting
> queries with joins in an efficient way" ... I would really like us to
> make secondary indexes be a viable option before we start opening up
> floodgates on stuff like this.
>
> Chris
>
> On Thu, Dec 14, 2023 at 9:37 AM Benedict  wrote:
>
>> > So yes, this physical plan is the structure that you have in mind but
>> the idea of sharing it is not part of the CEP.
>>
>>
>> I think it should be. This should form a major part of the API on which
>> any CBO is built.
>>
>>
>> > It seems that there is a difference between the goal of your proposal
>> and the one of the CEP. The goal of the CEP is first to ensure optimal
>> performance. It is ok to change the execution plan for one that delivers
>> better performance. What we want to minimize is having a node performing
>> queries in an inefficient way for a long period of time.
>>
>>
>> You have made a goal of the CEP synchronising summary statistics across
>> the whole cluster in order to achieve some degree of uniformity of query
>> plan. So this is explicitly a goal of the CEP, and synchronising summary
>> statistics is a hard problem and won’t provide strong guarantees.
>>
>>
>> > The client side proposal targets consistency for a given query on a
>> given driver instance. In practice, it would be possible to have 2 similar
>> queries with 2 different execution plans on the same driver
>>
>>
>> This would only be possible if the driver permitted it. A driver could
>> (and should) enforce that it only permits one query plan per query.
>>
>>
>> The opposite is true for your proposal: some queries may begin degrading
>> because they touch specific replicas that optimise the query differently,
>> and this will be hard to debug.
>>
>>
>> On 14 Dec 2023, at 15:30, Benjamin Lerer  wrote:
>>
>> 
>> The binding of the parser output to the schema (what is today the
>> Raw.prepare call) will create the logical plan, expressed as a tree of
>> relational operators. Simplification and normalization will happen on that
>> tree to produce a new equivalent logical plan. That logical plan will be
>> used as input to the optimizer. The output will be a physical plan
>> producing the output specified by the logical plan. A tree of physical
>> operators specifying how the operations should be performed.
>>
>> That physical plan will be stored as part of the statements
>> (SelectStatement, ModificationStatement, ...) in the prepared statement
>> cache. Upon execution, variables will be bound and the
>> RangeCommands/Mutations will be created based on the physical plan.
>>
>> The string representation of a physical plan will effectively represent
>> the output of an EXPLAIN statement but outside of that the physical plan
>> will stay encapsulated within the statement classes.
>> Hints will be parameters provided to the optimizer to enforce some
>> specific choices. Like always using an Index Scan instead of a Table Scan,
>> ignoring the cost comparison.
>>
>> So yes, this physical plan is the structure that you have in mind but the
>> idea of sharing it is not part of the CEP. I did not document it because it
>> will simply be a tree of physical operators used internally.
>>
>> My proposal is that the execution plan of the coordinator that prepares a
>>> query gets serialised to the client, which then provides the execution plan
>>> to all future coordinators, and coordinators 

Re: [DISCUSS] CEP-39: Cost Based Optimizer

2023-12-14 Thread Jeff Jirsa
I'm also torn on the CEP as presented. I think some of it is my negative
emotional response to the examples - e.g. I've literally never seen a real
use case where unfolding constants matters, and I'm trying to convince
myself to read past that.
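
For context, "unfolding constants" presumably refers to constant folding:
evaluating constant sub-expressions once at planning time rather than per row.
A generic sketch, unrelated to any actual Cassandra class:

    // Hypothetical expression tree: fold "WHERE x = 1 + 2" into "WHERE x = 3"
    // once, at plan time, instead of re-evaluating "1 + 2" for every row.
    sealed interface Expr permits Lit, Add {}
    record Lit(long value) implements Expr {}
    record Add(Expr left, Expr right) implements Expr {}

    final class ConstantFolder
    {
        static Expr fold(Expr e)
        {
            if (e instanceof Add add)
            {
                Expr l = fold(add.left()), r = fold(add.right());
                if (l instanceof Lit a && r instanceof Lit b)
                    return new Lit(a.value() + b.value()); // both sides constant
                return new Add(l, r);
            }
            return e; // literals are already folded
        }
    }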

I also can't tell what exactly you mean when you say "In order to ensure
that the execution plans on each node are the same, the cardinality
estimator should provide the same global statistics on every node as well
as some notification mechanism that can be used to trigger
re-optimization." In my experience, you'll see variable cost on each host,
where a machine that went offline temporarily got a spike in sstables from
repair and has a compaction backlog, causing a higher cost per read on that
host due to extra sstables/duplicate rows/merges. Is the cost based
optimizer in your model going to understand the different cost per replica
and also use that in choosing the appropriate replicas to query?

Finally: ALLOW FILTERING should not be deprecated. It doesn't matter if the
CBO may be able to help improve queries that have filtering. That guard
exists because most people who are new to cassandra don't understand the
difference and it prevents far more self-inflicted failures than anyone can
count. Please do not remove this. You will instantly create a world where
most new users to the database tip over as soon as their adoption picks up.



On Thu, Dec 14, 2023 at 7:49 AM Chris Lohfink  wrote:

> I don't wanna be a blocker for this CEP or anything but did want to put my
> 2 cents in. This CEP is horrifying to me.
>
> I have seen thousands of clusters across multiple companies and helped
> them get working successfully. A vast majority of that involved blocking
> the use of MVs, GROUP BY, secondary indexes, and even just simple _range
> queries_. The "unnecessary restrictions of cql" are not only necessary IMHO;
> more restrictions are necessary to be successful at scale. The idea of just
> opening up CQL to general purpose relational queries and lines like 
> "supporting
> queries with joins in an efficient way" ... I would really like us to
> make secondary indexes be a viable option before we start opening up
> floodgates on stuff like this.
>
> Chris
>
> On Thu, Dec 14, 2023 at 9:37 AM Benedict  wrote:
>
>> > So yes, this physical plan is the structure that you have in mind but
>> the idea of sharing it is not part of the CEP.
>>
>>
>> I think it should be. This should form a major part of the API on which
>> any CBO is built.
>>
>>
>> > It seems that there is a difference between the goal of your proposal
>> and the one of the CEP. The goal of the CEP is first to ensure optimal
>> performance. It is ok to change the execution plan for one that delivers
>> better performance. What we want to minimize is having a node performing
>> queries in an inefficient way for a long period of time.
>>
>>
>> You have made a goal of the CEP synchronising summary statistics across
>> the whole cluster in order to achieve some degree of uniformity of query
>> plan. So this is explicitly a goal of the CEP, and synchronising summary
>> statistics is a hard problem and won’t provide strong guarantees.
>>
>>
>> > The client side proposal targets consistency for a given query on a
>> given driver instance. In practice, it would be possible to have 2 similar
>> queries with 2 different execution plans on the same driver
>>
>>
>> This would only be possible if the driver permitted it. A driver could
>> (and should) enforce that it only permits one query plan per query.
>>
>>
>> The opposite is true for your proposal: some queries may begin degrading
>> because they touch specific replicas that optimise the query differently,
>> and this will be hard to debug.
>>
>>
>> On 14 Dec 2023, at 15:30, Benjamin Lerer  wrote:
>>
>> 
>> The binding of the parser output to the schema (what is today the
>> Raw.prepare call) will create the logical plan, expressed as a tree of
>> relational operators. Simplification and normalization will happen on that
>> tree to produce a new equivalent logical plan. That logical plan will be
>> used as input to the optimizer. The output will be a physical plan
>> producing the output specified by the logical plan. A tree of physical
>> operators specifying how the operations should be performed.
>>
>> That physical plan will be stored as part of the statements
>> (SelectStatement, ModificationStatement, ...) in the prepared statement
>> cache. Upon execution, variables will be bound and the
>> RangeCommands/Mutations will be created based on the physical plan.
>>
>> The string representation of a physical plan will effectively represent
>> the output of an EXPLAIN statement but outside of that the physical plan
>> will stay encapsulated within the statement classes.
>> Hints will be parameters provided to the optimizer to enforce some
>> specific choices. Like always using an Index Scan instead of a Table Scan,
>> ignoring the cost comparison.
>>
>> So yes, this 

Future direction for the row cache and OHC implementation

2023-12-14 Thread Ariel Weisberg
Hi,

Now seems like a good time to discuss the future direction of the row cache and 
its only implementation OHC (https://github.com/snazy/ohc).

OHC is currently unmaintained and we don’t have the ability to release maven 
artifacts for it or commit to the original repo. I have reached out to the 
original maintainer about it and it seems like if we want to keep using it we 
will need to start releasing it under a new package from a different repo.

I see four directions we could pursue.

1. Fork OHC and start publishing under a new package name and continue to use it
2. Replace OHC with a different cache implementation like Caffeine which would 
move it on heap
3. Deprecate the row cache entirely in either 5.0 or 5.1 and remove it in a 
later release
4. Do work to make a row cache not necessary and deprecate it later (or maybe 
now)

I would like to find out what people know about row cache usage in the wild so 
we can use that to inform the future direction as well as the general thinking 
about what we should do with it going forward.

Thanks,
Ariel


Re: Future direction for the row cache and OHC implementation

2023-12-14 Thread Dinesh Joshi
> On Dec 14, 2023, at 10:32 AM, Ariel Weisberg  wrote:
> 
> 1. Fork OHC and start publishing under a new package name and continue to use 
> it

Who would fork it? Where would you fork it? My first instinct is that this
would not be a viable path forward.

> 2. Replace OHC with a different cache implementation like Caffeine which 
> would move it on heap

Doesn’t seem optimal, but given the advent of newer garbage collectors, we
might be able to run Cassandra with larger heap sizes, and moving this to heap
may be a non-issue. Someone needs to try it out and measure the performance
impact with ZGC or Shenandoah.

> 3. Deprecate the row cache entirely in either 5.0 or 5.1 and remove it in a 
> later release

In my experience, the row cache has historically helped in narrow workloads
where you have really hot rows, but in other workloads it can hurt
performance. So keeping it around may be fine as long as people can disable
it.

Moving it on-heap using Caffeine may be the easiest option here.
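
For scale, a minimal sketch of what an on-heap replacement could look like.
Caffeine's builder API below is real; RowKey and CachedRow are placeholders
for the actual types:

    import com.github.benmanes.caffeine.cache.Cache;
    import com.github.benmanes.caffeine.cache.Caffeine;

    record RowKey(String table, String key) {}
    record CachedRow(byte[] payload) { int sizeInBytes() { return payload.length; } }

    class OnHeapRowCache
    {
        final Cache<RowKey, CachedRow> cache = Caffeine.newBuilder()
                .maximumWeight(256L * 1024 * 1024)                   // cap by bytes, not entries
                .weigher((RowKey k, CachedRow v) -> v.sizeInBytes())
                .recordStats()                                       // enables hit-rate monitoring
                .build();

        CachedRow get(RowKey key)           { return cache.getIfPresent(key); } // null on miss
        void put(RowKey key, CachedRow row) { cache.put(key, row); }
        void evict(RowKey key)              { cache.invalidate(key); }
    }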


Dinesh

Re: Future direction for the row cache and OHC implementation

2023-12-14 Thread Jeff Jirsa



> On Dec 14, 2023, at 1:51 PM, Dinesh Joshi  wrote:
> 
> 
>> 
>> On Dec 14, 2023, at 10:32 AM, Ariel Weisberg  wrote:
>> 
>> 1. Fork OHC and start publishing under a new package name and continue to 
>> use it
> 
> Who would fork it? Where would you fork it? My first instinct is that this
> would not be a viable path forward.
> 
>> 2. Replace OHC with a different cache implementation like Caffeine which 
>> would move it on heap
> 
> Doesn’t seem optimal, but given the advent of newer garbage collectors, we
> might be able to run Cassandra with larger heap sizes, and moving this to
> heap may be a non-issue. Someone needs to try it out and measure the
> performance impact with ZGC or Shenandoah.
> 
>> 3. Deprecate the row cache entirely in either 5.0 or 5.1 and remove it in a 
>> later release
> 
> In my experience, the row cache has historically helped in narrow workloads 
> where you have really hot rows, but in other workloads it can hurt 
> performance. So keeping it around may be fine as long as people can disable it.

It works especially well with tiny partitions. Once you start slicing/paging, 
the benefit usually disappears.


> 
> Moving it on-heap using Caffeine may be the easiest option here.

That’s what I’d do.


> 
> 
> Dinesh


Re: Future direction for the row cache and OHC implementation

2023-12-14 Thread Mick Semb Wever
>
> 3. Deprecate the row cache entirely in either 5.0 or 5.1 and remove it in
> a later release
>



I'm for deprecating and removing it.
It constantly trips users up and just causes pain.

Yes it works in some very narrow situations, but those situations often
change over time and then just bite the user again.  Without the row-cache I
believe users would quickly find other, more suitable and lasting solutions.


Re: Future direction for the row cache and OHC implementation

2023-12-14 Thread Dinesh Joshi
I would avoid taking away a feature even if it works in a narrow set of 
use-cases. I would instead suggest -

1. Leave it disabled by default.
2. Detect when Row Cache has a low hit rate and warn the operator to turn it 
off. Cassandra should ideally detect this and do it automatically.
3. Move to Caffeine instead of OHC.

I would suggest having this as the middle ground.




Re: Future direction for the row cache and OHC implementation

2023-12-14 Thread Paulo Motta
I like Dinesh's middle ground proposal, since this feature has valid uses.

I'm not familiar with the row caching module, but would it make sense to
take this opportunity to expose this feature as an optional Row Caching
Module, disabled by default, with an optional on-heap Caffeine
implementation?

The API would look something like:

RowCachingAPI {
- onRowUpdated(RowKey, Mutation) -> cache row fragment
- onRowDeleted(RowKey) -> evict cached row fragment
- onPartitionDeleted(PartitionKey) -> evict cached partition fragment
- Optional<Row> getRow(RowKey) -> return cached row fragment
- Optional<List<Row>> getPartition(PartitionKey, resultSize) ->
return cached partition fragment
}

This could be a potential hook for out-of-process caching.

Would something like this be valuable/feasible?
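
A rough Java rendering of that sketch, purely illustrative (the type
parameters stand in for whatever row, key, and mutation types the real
module would use):

import java.util.List;
import java.util.Optional;

// Hypothetical shape of an optional Row Caching Module; names and types
// are placeholders, not an existing Cassandra API.
interface RowCachingModule<PartitionKey, RowKey, Row, Mutation>
{
    void onRowUpdated(RowKey key, Mutation mutation);   // cache row fragment
    void onRowDeleted(RowKey key);                      // evict cached row fragment
    void onPartitionDeleted(PartitionKey key);          // evict cached partition fragment
    Optional<Row> getRow(RowKey key);                   // cached row fragment, if any
    Optional<List<Row>> getPartition(PartitionKey key, int resultSize); // cached partition fragment
}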



Re: Future direction for the row cache and OHC implementation

2023-12-14 Thread Dinesh Joshi
> On Dec 14, 2023, at 5:35 PM, Paulo Motta  wrote:
> 
> This could be a potential hook for out-of-process caching.
> 
> Would something like this be valuable/feasible?

It is certainly feasible. I am not sure about its value.

Dinesh

Re: Future direction for the row cache and OHC implementation

2023-12-14 Thread Mick Semb Wever
> I would avoid taking away a feature even if it works in a narrow set of
> use-cases. I would instead suggest -
>
> 1. Leave it disabled by default.
> 2. Detect when Row Cache has a low hit rate and warn the operator to turn
> it off. Cassandra should ideally detect this and do it automatically.
> 3. Move to Caffeine instead of OHC.
>
> I would suggest having this as the middle ground.
>



Yes, I'm ok with this. (2) could also be a guardrail: a soft value for when to
warn and a hard value for when to disable.
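
A sketch of how such a guardrail could look, assuming a Caffeine-backed cache 
built with recordStats(); the thresholds and method names are illustrative only:

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.stats.CacheStats;

// Illustrative guardrail: warn below a soft hit-rate threshold, disable
// below a hard one. The threshold values are placeholders.
final class RowCacheHitRateGuardrail
{
    private static final double SOFT_HIT_RATE = 0.20;    // below this: warn
    private static final double HARD_HIT_RATE = 0.05;    // below this: disable
    private static final long   MIN_REQUESTS  = 100_000; // ignore cold-start noise

    static void check(Cache<?, ?> rowCache)
    {
        CacheStats stats = rowCache.stats(); // requires Caffeine.newBuilder().recordStats()
        if (stats.requestCount() < MIN_REQUESTS)
            return;
        if (stats.hitRate() < HARD_HIT_RATE)
            disable();             // hard value: turn the cache off
        else if (stats.hitRate() < SOFT_HIT_RATE)
            warn(stats.hitRate()); // soft value: surface a warning to the operator
    }

    private static void warn(double hitRate) { /* log an operator-visible warning */ }
    private static void disable()            { /* evict everything and stop caching */ }
}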


Re: [DISCUSS] Replace Sigar with OSHI (CASSANDRA-16565)

2023-12-14 Thread Mick Semb Wever
>
> Are there objections to making this switch and adding a new dependency?
>
> [1] https://github.com/apache/cassandra/pull/2842/files
> [2] https://issues.apache.org/jira/browse/CASSANDRA-16565
>



+1 to removing sigar and to adding oshi-core


Re: Future direction for the row cache and OHC implementation

2023-12-14 Thread Jon Haddad
I think we should probably figure out how much value it actually provides
by getting some benchmarks around a few use cases along with some
profiling.  tlp-stress has a --rowcache flag that I added a while back to
be able to do this exact test.  I was looking for a use case to profile and
write up so this is actually kind of perfect for me.  I can take a look in
January when I'm back from the holidays.

Jon



Re: [DISCUSS] Replace Sigar with OSHI (CASSANDRA-16565)

2023-12-14 Thread guo Maxwell
+1 too

>


Re: Future direction for the row cache and OHC implementation

2023-12-14 Thread Ariel Weisberg
Hi,

To add some additional context.

The row cache is disabled by default and it is already pluggable, but there 
isn’t a Caffeine implementation present. I think one used to exist and could be 
resurrected.
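
For reference, a Caffeine-backed implementation could be roughly as small as 
the sketch below. The factory interface shown is an assumption standing in for 
the real plug-in point, whose contract has more surface area:

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

// Assumed provider shape for illustration; Cassandra's actual pluggable
// cache interfaces differ in detail.
interface CacheFactory<K, V>
{
    Cache<K, V> create(long capacity);
}

final class CaffeineRowCacheFactory<K, V> implements CacheFactory<K, V>
{
    @Override
    public Cache<K, V> create(long capacity)
    {
        return Caffeine.newBuilder()
                       .maximumSize(capacity) // entry count; a byte bound would use maximumWeight + a weigher
                       .recordStats()
                       .build();
    }
}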

I personally also think that people should be able to scratch their own itch, 
row-cache-wise, so removing it entirely just because it isn’t commonly used 
isn’t the right move unless the feature is very far out of scope for Cassandra.

Auto enabling/disabling the cache is a can of worms that could result in 
performance and reliability inconsistency as the DB enables/disables the cache 
based on heuristics when you don’t want it to. It being off by default seems 
good enough to me.

RE forking, we could create a GitHub org for OHC and then add people to it. 
There are some examples of dependencies that haven’t been contributed to the 
project and live outside it, like CCM and JAMM.

Ariel
