Re: Merging CEP-15 to trunk

2023-01-24 Thread Jeremiah D Jordan
> "hold the same bar for merges into a feature branch as trunk"

I think this is the key point here.  If a feature branch is being treated as if 
it were a release branch with respect to the commits that go into it, then there 
should be no need to “do extra review pre merge to trunk”.  The feature branch 
should follow what we do for everything else post review and pre merge to 
trunk.

1. Rebase the code
a. If the rebase meant changing a bunch of stuff, ask a reviewer to 
look that over, then continue.
b. If the rebase didn’t change anything substantial, continue.
2. Run CI on rebased code.
3. Push the code to trunk.

-Jeremiah


> On Jan 24, 2023, at 4:10 PM, Josh McKenzie  wrote:
> 
> Cordial debate! <3
> 
>> - it's nevertheless the case that those contributors who didn't actively 
>> work on Accord, have assumed that they will be invited to review now, when 
>> the code is about to land in trunk. Not allowing that to happen would make 
>> them feel like they weren't given the opportunity and that the process in 
>> Cassandra Project Governance was bypassed. We can agree to work differently 
>> in the future, but this is the reality now.
> If this was a miscommunication on this instance rectifying it will of course 
> require compromise from all parties. Good learning for future engagement and 
> hopefully the outcome of this discussion is clearer norms as a project so we 
> don't end up with this miscommunication in the future.
> 
>> the code is of the highest quality and ready to be merged to trunk, I don't 
>> think that can be expected of every feature branch all the time
> I think this is something we can either choose to make a formal requirement 
> for feature branches in ASF git (all code that goes in has 2 committers hands 
> on) or not. If folks want to work on other feature branches in other forks 
> w/out this bar and then have a "mega review" at the end, I suppose that's 
> their prerogative. Many of us that have been on the project for years have 
> _significant emotional and psychological scars_ from that approach however, 
> and multiple large efforts have failed at the "mega-review and merge" step. 
> So I wouldn't advocate for that approach (and it's the only logical 
> alternative I can think of to incremental bar of quality reinforcement 
> throughout a work cycle on a large feature over time).
> 
>> if it had been ready to merge to trunk already a year ago, why wasn't it? 
>> It's kind of the point of using a feature branch that the code in it is NOT 
>> ready to be merged yet
> Right now we culturally tend to avoid merging code that doesn't do anything, 
> for better or worse. We don't have a strong culture of either incremental 
> merge in during development or of using the experimental flag for new 
> features. Much of the tightly coupled nature of our codebase makes this a 
> necessity for keeping velocity while working unfortunately. So in this case I 
> would qualify that "it's not ready to be merged yet given our assumption that 
> all code in the codebase should serve an active immediate purpose, not due to 
> a lack of merge-level quality".
> 
> The approach of "hold the same bar for merges into a feature branch as trunk" 
> seems to be a compromise between Big Bang single commit drops and peppering 
> trunk with a lot of "as yet dormant" incremental code as a large feature is 
> built out. Not saying it's better or worse, just describing the contour of 
> the tradeoffs as I see them.
> 
>> - Uncertainty: It's completely ok that some feature branches may be 
>> abandoned without ever merging to trunk. Requiring the community (anyone 
>> potentially interested, anyways) to review such code would obviously be a 
>> waste of precious talent.
> This is an excellent point. The only mitigation I'd see for this would be an 
> additional review period or burden collectively before merge of a feature 
> branch into trunk once something has crossed a threshold of success as to be 
> included, or stepping away from a project where you don't have the cycles to 
> stay up to date and review and trust that the other committers working on the 
> project are making choices that are palatable and acceptable to you.
> 
> If all API decisions hit the dev ML and the architecture conforms generally 
> to the specification of the CEP, it seems to me that stepping back and 
> trusting your fellow committers to Do The Right Thing is the optimal (and 
> scalable) approach here?
> 
>> Let's say someone in October 2021 was invested in the quality of Cassandra 
>> 4.1 release. Should this person now invest in reviewing Accord or not? It's 
>> impossible to know. Again, in hindsight we know that the answer is no, but 
>> your suggestion again would require the person to review all active feature 
>> branches just in case.
> I'd argue that there's 3 times to really invest in the quality of any 
> Cassandra release:
> 1. When we set agreed upon bars for quality we'll all hold ourselves 
> accou

Re: [DISCUSS] API modifications and when to raise a thread on the dev ML

2023-02-02 Thread Jeremiah D Jordan
I think we need a DISCUSS thread at minimum for API changes.  And for anything 
changing CQL syntax, I think a CEP is warranted.  Even if it is only a small 
change to the syntax.

> On Feb 2, 2023, at 9:32 AM, Patrick McFadin  wrote:
> 
> API changes are near and dear to my world. The scope of changes could be 
> minor or major, so I think B is the right way forward. 
> 
> Not to throw off the momentum, but could this even warrant a separate CEP in 
> some cases? For example, CEP-15 is a huge change, but the CQL syntax will 
> continuously evolve with more use. Being judicious in those changes is good 
> for end users. It's also a good reference to point back to after the fact. 
> 
> Patrick
> 
>> On Thu, Feb 2, 2023 at 6:01 AM Ekaterina Dimitrova wrote:
>> “ Only that it locks out of the conversation anyone without a Jira login”
>> Very valid point I forgot about - since recently people need invitation in 
>> order to create account…
>> Then I would say C until we clarify the scope. Thanks
>> 
>> On Thu, 2 Feb 2023 at 8:54, Benedict wrote:
>>> I think lazy consensus is fine for all of these things. If a DISCUSS thread 
>>> is crickets, or just positive responses, then definitely it can proceed 
>>> without further ceremony.
>>> 
>>> I think “with heads-up to the mailing list” is very close to B? Only that 
>>> it locks out of the conversation anyone without a Jira login.
>>> 
 On 2 Feb 2023, at 13:46, Ekaterina Dimitrova wrote:
 
 
>>> 
 While I do agree with you, I am thinking that if we include many things 
 that we would expect lazy consensus on I would probably have different 
 preference. 
 
 I definitely don’t mean to stall this though so in that case:
 I’d say combination of A+C (jira with heads up on the ML if someone is 
 interested into the jira) and regular log on API changes separate from 
 CHANGES.txt or we can just add labels to entries in CHANGES.txt as some 
 other projects. (I guess this is a detail we can agree on later on, how to 
 implement it, if we decide to move into that direction)
 
 On Thu, 2 Feb 2023 at 8:12, Benedict wrote:
> I think it’s fine to separate the systems from the policy? We are 
> agreeing a policy for systems we want to make guarantees about to our 
> users (regarding maintenance and compatibility)
> 
> For me, this is (at minimum) CQL and virtual tables. But I don’t think 
> the policy differs based on the contents of the list, and given how long 
> this topic stalled for. Given the primary point of contention seems to be 
> the *policy* and not the list, I think it’s time to express our opinions 
> numerically so we can move the conversation forwards.
> 
> This isn’t binding, it just reifies the community sentiment.
> 
>> On 2 Feb 2023, at 13:02, Ekaterina Dimitrova wrote:
>> 
>> 
> 
>> “ So we can close out this discussion, let’s assume we’re only 
>> discussing any interfaces we want to make promises for. We can have a 
>> separate discussion about which those are if there is any disagreement.”
>> May I suggest we first clear this topic and then move to voting? I would 
>> say I see confusion, not that much of a disagreement. Should we raise a 
>> discussion for every feature flag for example? In another thread virtual 
>> tables were brought in. I saw also other examples where people expressed 
>> uncertainty. I personally feel I’ll be able to take a more informed 
>> decision and vote if I first see this clarified. 
>> 
>> I will be happy to put down a document and bring it for discussion if 
>> people agree with that
>> 
>> 
>> 
>> On Thu, 2 Feb 2023 at 7:33, Aleksey Yeshchenko wrote:
>>> Bringing light to new proposed APIs no less important - if not more, 
>>> for reasons already mentioned in this thread. For it’s not easy to 
>>> change them later.
>>> 
>>> Voting B.
>>> 
>>> 
 On 2 Feb 2023, at 10:15, Andrés de la Peña wrote:
 
 If it's a breaking change, like removing a method or property, I think 
 we would need a DISCUSS API thread prior to making changes. However, 
 if the change is an addition, like adding a new yaml property or a JMX 
 method, I think JIRA suffices.
>>> 



Re: [VOTE] CEP-21 Transactional Cluster Metadata

2023-02-07 Thread Jeremiah D Jordan
+1 nb

> On Feb 6, 2023, at 10:15 AM, Sam Tunnicliffe  wrote:
> 
> Hi everyone,
> 
> I would like to start a vote on this CEP.
> 
> Proposal:
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-21%3A+Transactional+Cluster+Metadata
> 
> Discussion:
> https://lists.apache.org/thread/h25skwkbdztz9hj2pxtgh39rnjfzckk7
> 
> The vote will be open for 72 hours.
> A vote passes if there are at least three binding +1s and no binding vetoes.
> 
> Thanks,
> Sam



Re: Downgradability

2023-02-21 Thread Jeremiah D Jordan
If we can get opt-in major format upgrades, as well as an offline 
sstabledowngrade tool, I think we have a good first step that would make 
downgrades possible.

Given Jacek’s work on the sstable format API, and the work from Yuki and Claude 
on old formats, I think we are pretty close to having both of those be viable?

I think with the opt-in major format upgrades, the main thing will be to ensure 
that all new features built around the new format either fail gracefully or, for 
a change in behavior, fall back to the old behavior until the new format is 
available.  If a new feature is gated by a feature flag, this could be a simple 
check that throws a configuration exception when the feature is enabled but the 
new sstable format is not available.
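A rough sketch of what such a guard could look like — the class, method, and flag 
names below are hypothetical and only ConfigurationException is an existing 
Cassandra class:

// Hypothetical startup check: refuse to enable a feature that depends on the
// new sstable major format while the node is still writing the old format.
public final class NewFeatureGuard
{
    private NewFeatureGuard() {}

    public static void validate(boolean newFeatureEnabled, boolean newSSTableFormatEnabled)
    {
        if (newFeatureEnabled && !newSSTableFormatEnabled)
        {
            // Fail fast with a clear message rather than writing data that the
            // currently configured on-disk format cannot represent.
            throw new org.apache.cassandra.exceptions.ConfigurationException(
                "This feature requires the new sstable major format; " +
                "opt in to the format upgrade before enabling it.");
        }
    }
}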

No features have yet merged that bump the sstable major version, but a few are 
finishing up that will.  Do we want to block merging those changes until 
discussions here finish?  I don’t think that we need to?  The ticket which 
brings in the ability to opt-in to the sstable format change can also fix up 
the existing code to check the flag?

-Jeremiah


> On Feb 21, 2023, at 10:29 AM, Benedict  wrote:
> 
> As always, Scott puts it much more eloquently than I can. 
> 
> The only thing I’d quibble with is that I think it is better to make changes 
> backwards compatible, rather than make earlier releases forwards compatible - 
> and where this is prohibitively costly to simply make a feature that depends 
> on it unavailable until the switch to the new major format.
> 
> This provides the greatest flexibility for users, as they can upgrade from 
> and downgrade to the same versions. There’s no scrambling for a different 
> downgrade target you haven’t qualified when finding out there’s an 
> unacceptable bug. 
> There’s also less delta between pre-upgrade and post-downgrade behaviour.
> 
> We have plenty of practice doing this kind of thing. It’s not that hard.
> 
> But, if we want to go the forward compatibility route that’s still far better 
> than nothing.
> 
> 
>> On 21 Feb 2023, at 16:17, C. Scott Andreas  wrote:
>> 
>> 
>> I realize my feedback on this has been spread across tickets and older 
>> mailing list / wiki discussions, so I'll offer a proposal here.
>> 
>> Starting with goals -
>> 
>> 1. Cassandra users must be able to abort and revert an upgrade to a new 
>> version of the database that introduces a new major SSTable format.
>> 
>> This reduces risk of upgrading to a build that also introduces a 
>> non-data-format-related bug that is intolerable. This goal does not specify 
>> a mechanism or downgrade target - just the "downgradability" goal.
>> 
>> 2. Where possible, Cassandra users should be able to opt into writing of a 
>> new major SSTable format.
>> 
>> This reduces that risk further by allowing users to decouple data format 
>> changes from the upgrade itself. There may be cases where new features or 
>> bug fixes prevent this from being possible, but I'll offer it as a goal.
>> 
>> 3. It should be possible for users to perform the downgrade in-place by 
>> launching the database using a previous version's binary.
>> 
>> This avoids the need for complex orchestration of offline commands like a 
>> hypothetical `downgradesstables`.
>> 
>> 
>> The following approach would allow us to accomplish these goals:
>> 
>> 1. Major SSTable changes should begin with forward-compatibility in a prior 
>> release.
>> 
>> In a release prior to one that revs major SSTable versions, we should 
>> implement the ability to read the SSTables that we intend to write in the 
>> next major version. This would allow someone to (eg.,) revert from 5.0 to 
>> 4.2 if they encountered a regression that caused an outage without data 
>> loss. This downgrade path should be well-specified and called out in 
>> NEWS.txt.
>> 
>> 2. Where possible, major SSTable format changes should be opt-in (if the 
>> features / bugfixes introduced allow).
>> 
>> This would be via a flag to enable writing the new format once an operator 
>> has determined that post-upgrade their clusters are sufficiently stable. 
>> This is an approach that HDFS has adopted. Following a rolling upgrade of 
>> HDFS, downgrade remains possible until an operator executes a "finalize" 
>> operation to migrate NameNode metadata to the new version's. An approach 
>> like this would allow users to perform a staged upgrade in which they first 
>> rev the version of the database, followed by opting into its new format to 
>> derisk (eg.,) adoption of BTI-indexed SSTables.
>> 
>> These approaches aren't meant to discourage SSTable format evolution - but 
>> to make it safer, and ideally faster. They don't specify duplicative 
>> serialization or a game of Twister to hide fields in locations where old 
>> versions don't think to look. Forward compatibility in a prior release could 
>> be landed at the same time as the major format revision itself, so long as 
>> we cut releases from both branches.
>> 
>> Ability

Re: Downgradability

2023-02-22 Thread Jeremiah D Jordan
We have multiple tickets about to merge that introduce new on-disk format 
changes.  I see no reason to block those indefinitely while we figure out how 
to do the on-disk format downgrade work.

-Jeremiah

> On Feb 22, 2023, at 3:12 PM, Benedict  wrote:
> 
> Ok I will be honest, I was fairly sure we hadn’t yet broken downgrade - but I 
> was wrong. CASSANDRA-18061 introduced a new column to a system table, which 
> is a breaking change. 
> 
> But that’s it, as far as I can tell. I have run a downgrade test successfully 
> after reverting that ticket, using the one line patch below. This makes every 
> in-jvm upgrade test also a downgrade test. I’m sure somebody more familiar 
> with dtests can readily do the same there.
> 
> While we look to fix 18061 and enable downgrade tests (and get a clean run of 
> the full suite), can we all agree not to introduce new breaking changes?
> 
> 
> index e41444fe52..085b25f8af 100644
> --- a/test/distributed/org/apache/cassandra/distributed/upgrade/UpgradeTestBase.java
> +++ b/test/distributed/org/apache/cassandra/distributed/upgrade/UpgradeTestBase.java
> @@ -104,6 +104,7 @@ public class UpgradeTestBase extends DistributedTestBase
>                           .addEdge(v40, v41)
>                           .addEdge(v40, v42)
>                           .addEdge(v41, v42)
> +                         .addEdge(v42, v41)
>                           .build();
> 
> 
>> On 22 Feb 2023, at 15:08, Jeff Jirsa  wrote:
>> 
>> When people are serious about this requirement, they’ll build the downgrade 
>> equivalents of the upgrade tests and run them automatically, often, so 
>> people understand what the real gap is and when something new makes it break 
>> 
>> Until those tests exist, I think collectively we should all stop pretending 
>> like this is dogma. Best effort is best effort. 
>> 
>> 
>> 
>>> On Feb 22, 2023, at 6:57 AM, Branimir Lambov  
>>> wrote:
>>> 
>>> 
>>> > 1. Major SSTable changes should begin with forward-compatibility in a 
>>> > prior release.
>>> 
>>> This requires "feature" changes, i.e. new non-trivial code for previous 
>>> patch releases. It also entails porting over any further format 
>>> modification.
>>> 
>>> Instead of this, in combination with your second point, why not implement 
>>> backwards write compatibility? The opt-in is then clearer to define (i.e. 
>>> upgrades start with e.g. a "4.1-compatible" settings set that includes file 
>>> format compatibility and disabling of new features, new nodes start with 
>>> "current" settings set). When the upgrade completes and the user is happy 
>>> with the result, the settings set can be replaced.
>>> 
>>> Doesn't this achieve what you want (and we all agree is a worthy goal) with 
>>> much less effort for everyone? Supporting backwards-compatible writing is 
>>> trivial, and we even have a proof-of-concept in the stats metadata 
>>> serializer. It also simplifies by a serious margin the amount of work and 
>>> thinking one has to do when a format improvement is implemented -- e.g. the 
>>> TTL patch can just address this in exactly the way the problem was 
>>> addressed in earlier versions of the format, by capping to 2038, without 
>>> any need to specify, obey or test any configuration flags.
>>> 
>>> >> It’s a commitment, and it requires every contributor to consider it as 
>>> >> part of work they produce.
>>> 
>>> > But it shouldn't be a burden. Ability to downgrade is a testable problem, 
>>> > so I see this work as a function of the suite of tests the project is 
>>> > willing to agree on supporting.
>>> 
>>> I fully agree with this sentiment, and I feel that the current "try to not 
>>> introduce breaking changes" approach is adding the burden, but not the 
>>> benefits -- because the latter cannot be proven, and are most likely 
>>> already broken.
>>> 
>>> Regards,
>>> Branimir
>>> 
>>> On Wed, Feb 22, 2023 at 1:01 AM Abe Ratnofsky wrote:
 Some interesting existing work on this subject is "Understanding and 
 Detecting Software Upgrade Failures in Distributed Systems" - 
 https://dl.acm.org/doi/10.1145/3477132.3483577, 
 also summarized by Andrey Satarin here: 
 https://asatarin.github.io/talks/2022-09-upgrade-failures-in-distributed-systems/
 They specifically tested Cassandra upgra

Re: Degradation of availability when using NTS and RF > number of racks

2023-03-07 Thread Jeremiah D Jordan
I agree with Paulo, it would be nice if we could figure out some way to make 
the new NTS behavior correct, with a parameter to fall back to the “bad” behavior, 
so that people restoring backups to a new cluster can get placement that matches 
their backups.
The problem with only fixing this in a new strategy is that we have a ton of 
tutorials and docs out there which tell people to use NTS, so it would be great 
if we could keep “use NTS” as the recommendation.  Throwing a warning when 
someone uses NTS is kind of user hostile.  If someone just read some tutorial 
or doc which told them “create your keyspace this way” and then, when they do 
that, the database yells at them that they did it wrong, it is not a 
great experience.

-Jeremiah

> On Mar 7, 2023, at 10:16 AM, Benedict  wrote:
> 
> My view is that this is a pretty serious bug. I wonder if transactional 
> metadata will make it possible to safely fix this for users without 
> rebuilding (only via opt-in, of course).
> 
>> On 7 Mar 2023, at 15:54, Miklosovic, Stefan  
>> wrote:
>> 
>> Thanks everybody for the feedback.
>> 
>> I think that emitting a warning upon keyspace creation (and alteration) 
>> should be enough for starters. If somebody can not live without 100% bullet 
>> proof solution over time we might choose some approach from the offered 
>> ones. As the saying goes there is no silver bullet. If we decide to 
>> implement that new strategy, we would probably emit warnings anyway on NTS 
>> but it would be already done so just new strategy would be provided.
>> 
>> 
>> From: Paulo Motta 
>> Sent: Monday, March 6, 2023 17:48
>> To: dev@cassandra.apache.org
>> Subject: Re: Degradation of availability when using NTS and RF > number of 
>> racks
>> 
>> NetApp Security WARNING: This is an external email. Do not click links or 
>> open attachments unless you recognize the sender and know the content is 
>> safe.
>> 
>> 
>> 
>> It's a bit unfortunate that NTS does not maintain the ability to lose a rack 
>> without loss of quorum for RF > #racks > 2, since this can be easily 
>> achieved by evenly placing replicas across all racks.
>> 
>> Since RackAwareTopologyStrategy is a superset of NetworkTopologyStrategy, 
>> can't we just use the new correct placement logic for newly created 
>> keyspaces instead of having a new strategy?
>> 
>> The placement logic would be backwards-compatible for RF <= #racks. On 
>> upgrade, we could mark existing keyspaces with RF > #racks with 
>> use_legacy_replica_placement=true to maintain backwards compatibility and 
>> log a warning that the rack loss guarantee is not maintained for keyspaces 
>> created before the fix. Old keyspaces with RF <=#racks would still work with 
>> the new replica placement. The downside is that we would need to keep the 
>> old NTS logic around, or we could eventually deprecate it and require users 
>> to migrate keyspaces using the legacy placement strategy.
>> 
>> Alternatively we could have RackAwareTopologyStrategy and fail NTS keyspace 
>> creation for RF > #racks and indicate users to use RackAwareTopologyStrategy 
>> to maintain the quorum guarantee on rack loss or set an override flag 
>> "support_quorum_on_rack_loss=false". This feels a bit iffy though since it 
>> could potentially confuse users about when to use each strategy.
>> 
>> On Mon, Mar 6, 2023 at 5:51 AM Miklosovic, Stefan wrote:
>> Hi all,
>> 
>> some time ago we identified an issue with NetworkTopologyStrategy. The 
>> problem is that when RF > number of racks, it may happen that NTS places 
>> replicas in such a way that when whole rack is lost, we lose QUORUM and data 
>> are not available anymore if QUORUM CL is used.
>> 
>> To illustrate this problem, lets have this setup:
>> 
>> 9 nodes in 1 DC, 3 racks, 3 nodes per rack. RF = 5. Then, NTS could place 
>> replicas like this: 3 replicas in rack1, 1 replica in rack2, 1 replica in 
>> rack3. Hence, when rack1 is lost, we do not have QUORUM.
>> 
>> It seems to us that there is already some logic around this scenario (1) but 
>> the implementation is not entirely correct. This solution is not computing 
>> the replica placement correctly so the above problem would be addressed.
>> 
>> We created a draft here (2, 3) which fixes it.
>> 
>> There is also a test which simulates this scenario. When I assign 256 tokens 
>> to each node randomly (by same mean as generatetokens command uses) and I 
>> try to compute natural replicas for 1 billion random tokens and I compute 
>> how many cases there will be when 3 replicas out of 5 are inserted in the 
>> same rack (so by losing it we would lose quorum), for above setup I get 
>> around 6%.
>> 
>> For 12 nodes, 3 racks, 4 nodes per rack, rf = 5, this happens in 10% cases.
>> 
>> To interpret this number, it basically means that with such topology, RF and 
>> CL, when a random rack fails completely, when doing a random read, there is 

Re: Degradation of availability when using NTS and RF > number of racks

2023-03-07 Thread Jeremiah D Jordan
Right, that is why I said we should make NTS do the right thing rather than throwing a 
warning.  Doing the right thing, and not getting a warning, is the best 
behavior.

> On Mar 7, 2023, at 11:12 AM, Derek Chen-Becker  wrote:
> 
> I think that the warning would only be thrown in the case where a potentially 
> QUORUM-busting configuration is used. I think it would be a worse experience 
> to not warn and let the user discover later when they can't write at QUORUM.
> 
> Cheers,
> 
> Derek
> 
> On Tue, Mar 7, 2023 at 9:32 AM Jeremiah D Jordan wrote:
>> I agree with Paulo, it would be nice if we could figure out some way to make 
>> new NTS work correctly, with a parameter to fall back to the “bad” behavior, 
>> so that people restoring backups to a new cluster can get the right behavior 
>> to match their backups.
>> The problem with only fixing this in a new strategy is we have a ton of 
>> tutorials and docs out there which tell people to use NTS, so it would be 
>> great if we could keep “use NTS” as the recommendation.  Throwing a warning 
>> when someone uses NTS is kind of user hostile.  If someone just read some 
>> tutorial or doc which told them “make your key space this way” and then when 
>> they do that the database yells at them telling them they did it wrong, it 
>> is not a great experience.
>> 
>> -Jeremiah
>> 
>> > On Mar 7, 2023, at 10:16 AM, Benedict wrote:
>> > 
>> > My view is that this is a pretty serious bug. I wonder if transactional 
>> > metadata will make it possible to safely fix this for users without 
>> > rebuilding (only via opt-in, of course).
>> > 
>> >> On 7 Mar 2023, at 15:54, Miklosovic, Stefan wrote:
>> >> 
>> >> Thanks everybody for the feedback.
>> >> 
>> >> I think that emitting a warning upon keyspace creation (and alteration) 
>> >> should be enough for starters. If somebody can not live without 100% 
>> >> bullet proof solution over time we might choose some approach from the 
>> >> offered ones. As the saying goes there is no silver bullet. If we decide 
>> >> to implement that new strategy, we would probably emit warnings anyway on 
>> >> NTS but it would be already done so just new strategy would be provided.
>> >> 
>> >> 
>> >> From: Paulo Motta
>> >> Sent: Monday, March 6, 2023 17:48
>> >> To: dev@cassandra.apache.org
>> >> Subject: Re: Degradation of availability when using NTS and RF > number 
>> >> of racks
>> >> 
>> >> NetApp Security WARNING: This is an external email. Do not click links or 
>> >> open attachments unless you recognize the sender and know the content is 
>> >> safe.
>> >> 
>> >> 
>> >> 
>> >> It's a bit unfortunate that NTS does not maintain the ability to lose a 
>> >> rack without loss of quorum for RF > #racks > 2, since this can be easily 
>> >> achieved by evenly placing replicas across all racks.
>> >> 
>> >> Since RackAwareTopologyStrategy is a superset of NetworkTopologyStrategy, 
>> >> can't we just use the new correct placement logic for newly created 
>> >> keyspaces instead of having a new strategy?
>> >> 
>> >> The placement logic would be backwards-compatible for RF <= #racks. On 
>> >> upgrade, we could mark existing keyspaces with RF > #racks with 
>> >> use_legacy_replica_placement=true to maintain backwards compatibility and 
>> >> log a warning that the rack loss guarantee is not maintained for 
>> >> keyspaces created before the fix. Old keyspaces with RF <=#racks would 
>> >> still work with the new replica placement. The downside is that we would 
>> >> need to keep the old NTS logic around, or we could eventually deprecate 
>> >> it and require users to migrate keyspaces using the legacy placement 
>> >> strategy.
>> >> 
>> >> Alternatively we could have RackAwareTopologyStrategy and fail NTS 
>> >> keyspace creation for RF > #racks and indicate users to use 
>> >> RackAwareTopologyStrategy to maintain the quorum guarantee on rack loss 
>> >> or set an override flag "support_

Re: [DISCUSS] Enhanced Disk Error Handling

2023-03-09 Thread Jeremiah D Jordan
It is actually more complicated than just removing the sstable and running 
repair.

In the face of expired tombstones that might be covering data in other sstables, 
the only safe way to deal with a bad sstable is to wipe the token range in the bad 
sstable and rebuild/bootstrap that range (or wipe/rebuild the whole node, which 
is usually the easier way).  If there are expired tombstones in play, it means 
they could have already been compacted away on the other replicas, but may not 
have been compacted away on the current replica, meaning the data they cover could 
still be present in other sstables on this node.  Removing the sstable would 
mean resurrecting that data.  And pulling the range from other nodes does not 
help, because they may have already compacted away the tombstone, so you won’t 
get it back.

TL;DR you can’t just remove the one sstable; you have to remove all data in the 
token range covered by the sstable (aka all data that sstable may have had a 
tombstone covering).  Then you can stream from the other nodes to get the data 
back.

-Jeremiah

> On Mar 8, 2023, at 7:24 AM, Bowen Song via dev  
> wrote:
> 
> At the moment, when a read error, such as unrecoverable bit error or data 
> corruption, occurs in the SSTable data files, regardless of the 
> disk_failure_policy configuration, manual (or to be precise, external) 
> intervention is required to recover from the error.
> 
> Commonly, there are two approaches to recover from such an error:
> 
> - The safer, but slower, recovery strategy: replace the entire node.
> - The less safe, but faster, recovery strategy: shut down the node, delete the 
> affected SSTable file(s), and then bring the node back online and run repair.
> Based on my understanding of Cassandra, it should be possible to recover from 
> such error by marking the affected token range in the existing SSTable as 
> "corrupted" and stop reading from them (e.g. creating a "bad block" file or 
> in memory), and then streaming the affected token range from the healthy 
> replicas. The corrupted SSTable file can then be removed upon the next 
> successful compaction involving it, or alternatively an anti-compaction is 
> performed on it to remove the corrupted data.
> 
> The advantages of this strategy are:
> 
> - Reduced node down time - node restart or replacement is not needed
> - Less data streaming is required - only the affected token range
> - Faster recovery time - less streaming and delayed compaction or 
> anti-compaction
> - No less safe than replacing the entire node
> - This process can be automated internally, removing the need for operator 
> inputs
> The disadvantage is added complexity on the SSTable read path and it may mask 
> disk failures from the operator who is not paying attention to it.
> 
> What do you think about this?
> 



Re: [DISCUSS] New dependencies with Chronicle-Queue update

2023-03-13 Thread Jeremiah D Jordan
Given we need to upgrade to support JDK17, it seems fine to me.  The only 
concern I have is that some of those libraries are already pretty old; for 
example, the most recent jna-platform is 5.13.0 and 5.5.0 is almost 4 years old. 
 I think we should use the most recent versions of all libraries where 
possible?

> On Mar 13, 2023, at 7:42 AM, Mick Semb Wever  wrote:
> 
> JDK17 requires us to update our chronicle-queue dependency: CASSANDRA-18049
> 
> We use chronicle-queue for both audit logging and fql.
> 
> This update pulls in a number of new transitive dependencies.
> 
> affinity-3.23ea1.jar
> asm-analysis-9.2.jar
> asm-commons-9.2.jar
> asm-tree-9.2.jar
> asm-util-9.2.jar
> jffi-1.3.9.jar
> jna-platform-5.5.0.jar
> jnr-a64asm-1.0.0.jar
> jnr-constants-0.10.3.jar
> jnr-ffi-2.2.11.jar
> jnr-x86asm-1.0.2.jar
> posix-2.24ea4.jar
> 
> 
> More info here:
> https://issues.apache.org/jira/browse/CASSANDRA-18049?focusedCommentId=17699393&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17699393
> 
> 
> Objections?



Re: [DISCUSS] Change the useage of nodetool tablehistograms

2023-03-16 Thread Jeremiah D Jordan
-1 on any change which breaks the previously documented usage.
+1 any additions to what the tool can do without breaking previously documented 
behavior.

> On Mar 16, 2023, at 7:42 AM, Josh McKenzie  wrote:
> 
> We could also consider augmenting the tool with new named arguments with the 
> functionality you described and leave the positional usage intact.
> 
> On Thu, Mar 16, 2023, at 6:43 AM, Bowen Song via dev wrote:
>> The documented command options are:
>> 
>> nodetool tablehistograms [<keyspace> <table> | <keyspace.table>]
>> 
>> 
>> 
>> That means one parameter will be treated as dot separated keyspace and 
>> table. Alternatively, two parameters will be treated as the keyspace and 
>> table respectively.
>> 
>> To remain compatible with the documented behaviour, my suggestion is to 
>> change the command options to:
>> 
>> nodetool tablehistograms [<keyspace> <table> [<table> [...]] | 
>> <keyspace.table> [<keyspace.table> [...]]]
>> 
>> Feel free to add the "all except ..." feature to the above.
>> 
>> This doesn't break backward compatibility in documented ways. It only 
>> changes the undocumented behaviour. If someone is using the undocumented 
>> behaviour, they must know things may break when the software is upgraded. We 
>> can just add a line to the NEWS.txt and let them update their scripts.
>> 
>> 
>> On 16/03/2023 08:53, guo Maxwell wrote:
>>> Hello everyone :
>>> The nodetool tablehistograms command takes one argument, which you can fill with 
>>> only one table name in the format "keyspace_name.table_name" or "keyspace_name 
>>> table_name", so that you can get the table histograms of the specified table.
>>> 
>>> And if no arguments are set, all the tables' histograms will be printed 
>>> out. And if more than 2 arguments are set (no matter whether the format is 
>>> right or wrong), all the tables' histograms will also be printed out (which is 
>>> a bug in my mind).
>>> 
>>> So the usage of nodetool tablehistograms has some restrictions: it either 
>>> outputs one table, or all of them.
>>> 
>>> As CASSANDRA-18296 
>>> describes, I will change the usage of nodetool tablehistograms to support 
>>> the features below:
>>> 1. nodetool tablehistograms ks.tb1 ks.tb2  // print out histograms for a list 
>>> of tables given in keyspace.table format
>>> 2. nodetool tablehistograms ks1 ks2 ks3 ...  // print out histograms for a 
>>> list of keyspaces
>>> 3. nodetool tablehistograms -i ks1 ks2  // print out histograms for all tables 
>>> except those in the keyspaces listed after the option -i
>>> 4. nodetool tablehistograms -i ks ks.tb  // print out histograms for all tables 
>>> except tables in keyspace ks and the table ks.tb
>>> 5. With no options specified, all tables' histograms will be printed out.
>>> 
>>> This usage will break compatibility with how it was done previously, and 
>>> this is a user-facing tool.
>>> 
>>> So, What do you think? 
>>> 
>>> Thanks~~~



Re: [DISCUSS] Drop support for sstable formats m* (in trunk)

2023-03-17 Thread Jeremiah D Jordan
> As for precedent - we (including me) have done a lot of stupid shit over the 
> years on this project. Half the time “this is how we’ve historically done X” 
> to me is a strong argument to start doing things differently. This is one 
> such case.

+1.  I definitely agree that this is one area of precedent that we should not 
be following.  The project has historically been fairly hostile towards longer 
upgrade timelines, and I am glad to see all the recent conversations where this 
seems to be improving.

-Jeremiah

Re: [EXTERNAL] [DISCUSS] Next release date

2023-03-24 Thread Jeremiah D Jordan
Given the fundamental change to how cluster operations work coming from CEP-21, 
I’m not sure what freezing early for “extra QA time” really buys us?  I 
wouldn’t trust any multi-node QA done pre-commit.
What “stabilizing” do we expect to be doing during this time?  How much of it 
do we just have to do again after those things merge?  I for one do not like to 
have release branches cut months before their expected release.  It just adds 
extra merge-forward and “where should this go” questions/overhead.  It could 
make sense to me to cut the branch when CEP-21 merges and only let in CEP-15 
after that.  CEP-15 is mostly “net new stuff” and not “changes to existing 
stuff” from my understanding?  So no QA effort is wasted if it is done before it 
merges.

-Jeremiah

> On Mar 24, 2023, at 9:38 AM, Josh McKenzie  wrote:
> 
>> I would like to propose a partial freeze of 5.0 in June
> My .02:
> +1 to:
> * partial freeze on an agreed upon date w/agreed upon other things that can 
> optionally go in after
> * setting a hard limit on when we ship from that frozen branch regardless of 
> whether the features land or not
> 
> -1 to:
> * ever feature freezing trunk again. :)
> 
> I worry about the labor involved with having very large work like this target 
> a frozen branch and then also needing to pull it up to trunk. That doesn't 
> sound fun.
> 
> If we resurrected the discussion about cutting alpha snapshots from trunk, 
> would that change people's perspectives on the weight of this current 
> decision? We'd probably also have to re-open pandora's box talking about the 
> solidity of our API's on trunk as well if we positioned those alphas as being 
> stable enough to start prototyping and/or building future applications 
> against.
> 
> On Fri, Mar 24, 2023, at 9:59 AM, Brandon Williams wrote:
>> I am +1 on a 5.0 branch freeze.
>> 
>> Kind Regards,
>> Brandon
>> 
>> On Fri, Mar 24, 2023 at 8:54 AM Benjamin Lerer wrote:
>> >>
>> >> Would that be a trunk freeze, or freeze of a cassandra-5.0 branch?
>> >
>> >
>> > I was thinking of a cassandra-5.0 branch freeze. So branching 5.0 and 
>> > allowing only CEP-15 and 21 + bug fixes there.
>> > On Fri, 24 Mar 2023 at 13:55, Paulo Motta wrote:
>> >>
>> >> >  I would like to propose a partial freeze of 5.0 in June.
>> >>
>> >> Would that be a trunk freeze, or freeze of a cassandra-5.0 branch? I 
>> >> agree with a branch freeze, but not with trunk freeze.
>> >>
>> >> I might work on small features after June and would be happy to delay 
>> >> releasing these on 5.0+, but delaying merge to trunk until 5.0 is 
>> >> released could be disruptive to contributors workflows and I would prefer 
>> >> to avoid that if possible.
>> >>
>> >> On Fri, Mar 24, 2023 at 6:37 AM Mick Semb Wever wrote:
>> >>>
>> >>>
>>  I would like to propose a partial freeze of 5.0 in June.
>> 
>>  …
>> 
>>  This partial freeze will be valid for every new feature except CEP-21 
>>  and CEP-15.
>> >>>
>> >>>
>> >>>
>> >>> +1
>> >>>
>> >>> Thanks for summarising the thread this way Benjamin. This addresses my 
>> >>> two main concerns: letting the branch/release date slip too much into 
>> >>> the unknown, squeezing GA QA efforts, while putting in place exceptional 
>> >>> waivers for CEP-21 and CEP-15.
>> >>>
>> >>> I hope that in the future we will be more willing to commit to the 
>> >>> release train model: less concerned about "what the next release 
>> >>> contains"; more comfortable letting big features land where they land. 
>> >>> But this is opinion and discussion for another day… possibly looping 
>> >>> back to the discussion on preview releases…
>> >>>
>> >>>
>> >>> Do we have yet from anyone a (rough) eta on CEP-15 (post CEP-21) landing 
>> >>> in trunk?
>> >>>
>> >>>



Re: [DISCUSS] CEP-28: Reading and Writing Cassandra Data with Spark Bulk Analytics

2023-03-24 Thread Jeremiah D Jordan
I have concerns with the majority of this being in the sidecar and not in the 
database itself.  I think it would make sense for the server side of this to be 
a new service exposed by the database, not in the sidecar.  That way it can 
properly integrate with the authentication and authorization APIs, and be a 
first-class citizen in terms of having unit/integration tests in the main DB 
ensuring no one breaks it.

-Jeremiah

> On Mar 24, 2023, at 10:29 AM, Dinesh Joshi  wrote:
> 
> Hi Benjamin,
> 
> I agree with your concern about long term maintenance of the code. Doug
> has contributed several patches to Cassandra over the years. Besides him
> there will be several other maintainers that will take on maintenance of
> this code including Yifan and myself. Given how closely it is coupled
> with the Cassandra Sidecar project, I would prefer that we keep this
> within the Cassandra project umbrella as a separate repository and a
> sub-project.
> 
> Thanks,
> 
> Dinesh
> 
> 
> On 3/24/23 02:35, Benjamin Lerer wrote:
>> Hi Doug,
>> 
>> Outside of the changes to the Cassandra Sidecar that are mentioned, what
>> the CEP proposes is the donation of a library for Spark integration. It
>> seems to me that this library could be offered as an open source project
>> outside of the Cassandra project itself. If we accept Spark Bulk
>> Analytic as part of the Cassandra project it means that the community
>> will commit to maintain it and ensure that for each Cassandra release it
>> will be fully compatible. Considering our history with Hadoop
>> integration which has basically been unmaintained for years, I am not
>> convinced that it is what we should do.
>> We only started to expand the scope of the project recently and I would
>> personally prefer that we do that slowly starting with the drivers that
>> are critical for C*. Now, it is only my personal opinion and other
>> people might have a different view on those things.
>> 
>> On Thu, 23 Mar 2023 at 23:29, Miklosovic, Stefan wrote:
>> 
>>Hi,
>> 
>>I think this might be a great contribution in the light of removed
>>Hadoop integration recently (CASSANDRA-18323) as it will not be in
>>5.0 anymore. If this CEP is adopted and delivered, I can see how it
>>might be a logical replacement of that.
>> 
>>Regards
>> 
>>
>>From: Doug Rohrer
>>Sent: Thursday, March 23, 2023 18:33
>>To: dev@cassandra.apache.org
>>Cc: James Berragan
>>Subject: [DISCUSS] CEP-28: Reading and Writing Cassandra Data with
>>Spark Bulk Analytics
>> 
>>NetApp Security WARNING: This is an external email. Do not click
>>links or open attachments unless you recognize the sender and know
>>the content is safe.
>> 
>> 
>> 
>> 
>>Hi everyone,
>> 
>>Wiki:
>>
>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-28%3A+Reading+and+Writing+Cassandra+Data+with+Spark+Bulk+Analytics
>> 
>>We’d like to propose this CEP for adoption by the community.
>> 
>>It is common for teams using Cassandra to find themselves looking
>>for a way to interact with large amounts of data for analytics
>>workloads. However, Cassandra’s standard APIs aren’t designed for
>>large scale data egress/ingest as the native read/write paths
>>weren’t designed for bulk analytics.
>> 
>>We’re proposing this CEP for this exact purpose. It enables the
>>implementation of custom Spark (or similar) applications that can
>>either read or write large amounts of Cassandra data at line rates,
>>by accessing the persistent storage of nodes in the cluster via the
>>Cassandra Sidecar.
>> 
>>This CEP proposes new APIs in the Cassandra Sidecar and a companion
>>library that allows deep integration into Apache Spark that allows
>>its users to bulk import or export data from a running Cassandra
>>cluster with minimal to no impact to the read/write traffic.
>> 
>>We will shortly publish a branch with code that will accompany this
>>CEP to help readers understand it better.
>> 
>>As a reminder, please keep the discussion here on the dev list vs.
>>in the wiki, as we’ve found it easier to manage via email.
>> 
>>Sincerely,
>> 
>>Doug Rohrer & James Berragan



Re: [DISCUSS] CEP-28: Reading and Writing Cassandra Data with Spark Bulk Analytics

2023-03-28 Thread Jeremiah D Jordan
> - Resources isolation. Having the said service running within the same JVM 
> may negatively impact Cassandra storage's performance. It could be more 
> beneficial to have them in Sidecar, which offers strong resource isolation 
> guarantees.

How does having this in a sidecar change the impact on “storage performance”?  
The sidecar reading sstables will have the same impact on storage IO as the 
main process reading sstables.  Given the sidecar is running on the same node 
as the main C* process, the only real resource isolation you have is in 
heap/GC?  CPU/memory/IO are all still shared between the main C* process and 
the sidecar, and coordinating those across processes is harder than 
coordinating them within a single process.  For example, if we wanted to have 
the compaction throughput, streaming throughput, and analytics read throughput 
all tied back to a single disk IO cap, that is harder with an external process.
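To make the single-cap point concrete, here is a toy sketch of the in-process 
version: one shared limiter that compaction, streaming, and analytics reads all 
draw from.  Guava's RateLimiter stands in for Cassandra's real throttling; the 
class name and numbers are invented for illustration:

import com.google.common.util.concurrent.RateLimiter;

// One process-wide disk bandwidth budget shared by every internal consumer.
// Splitting analytics reads into a separate process means this single budget
// can no longer be enforced in one place.
public class SharedDiskThroughput
{
    // Illustrative cap: 200 MiB/s total, expressed as bytes per second.
    private static final RateLimiter DISK_BUDGET = RateLimiter.create(200.0 * 1024 * 1024);

    public static void accountFor(int bytes)
    {
        DISK_BUDGET.acquire(bytes); // blocks until the shared budget allows this IO
    }

    public static void main(String[] args)
    {
        // Compaction, streaming, and an analytics scan all charge the same budget.
        accountFor(64 * 1024);    // compaction chunk
        accountFor(128 * 1024);   // streaming chunk
        accountFor(1024 * 1024);  // analytics read chunk
        System.out.println("all IO charged against one shared in-process cap");
    }
}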

> - Complexity. Considering the existence of the Sidecar project, it would be 
> less complex to avoid adding another (http?) service in Cassandra.

Not sure that is really very complex; running an http service is pretty easy? 
 We already have netty in use to instantiate one from.
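For what it's worth, a minimal sketch of standing up an HTTP endpoint on the 
netty we already ship — the port and the trivial fixed-response handler are 
invented for illustration, not a proposed API:

import java.nio.charset.StandardCharsets;
import io.netty.bootstrap.ServerBootstrap;
import io.netty.buffer.Unpooled;
import io.netty.channel.*;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;
import io.netty.handler.codec.http.*;

// Minimal Netty HTTP service: one event loop group and a handler that answers
// every request with a fixed body.  Real endpoints would route on request.uri()
// and stream sstable or metadata responses instead.
public class MiniHttpService
{
    public static void main(String[] args) throws Exception
    {
        EventLoopGroup group = new NioEventLoopGroup(1);
        try
        {
            ServerBootstrap bootstrap = new ServerBootstrap()
                .group(group)
                .channel(NioServerSocketChannel.class)
                .childHandler(new ChannelInitializer<SocketChannel>()
                {
                    @Override
                    protected void initChannel(SocketChannel ch)
                    {
                        ch.pipeline().addLast(new HttpServerCodec(),
                                              new HttpObjectAggregator(1 << 20),
                                              new SimpleChannelInboundHandler<FullHttpRequest>()
                        {
                            @Override
                            protected void channelRead0(ChannelHandlerContext ctx, FullHttpRequest req)
                            {
                                FullHttpResponse resp = new DefaultFullHttpResponse(
                                    HttpVersion.HTTP_1_1, HttpResponseStatus.OK,
                                    Unpooled.wrappedBuffer("ok\n".getBytes(StandardCharsets.UTF_8)));
                                resp.headers().set(HttpHeaderNames.CONTENT_LENGTH,
                                                   resp.content().readableBytes());
                                ctx.writeAndFlush(resp).addListener(ChannelFutureListener.CLOSE);
                            }
                        });
                    }
                });
            Channel server = bootstrap.bind(8043).sync().channel(); // 8043 is an arbitrary example port
            server.closeFuture().sync();
        }
        finally
        {
            group.shutdownGracefully();
        }
    }
}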
I worry more about the complexity of having the matching schema for a set of 
sstables being read.  The complexity of new sstable versions/formats being 
introduced.  The complexity of having up to date data from memtables being 
considered by this API without having to flush before every query of it.  The 
complexity of dealing with the new memtable API introduced in CEP-11.  The 
complexity of coordinating compaction/streaming adding and removing files with 
these APIs reading them.  There are a lot of edge cases to consider for this 
external access to sstables that the main process considers itself the “owner” 
of.

All of this is not to say that I think separating things out into other 
processes/services is bad.  But I think we need to be very careful with how we 
do it, or end users will end up running into all the sharp edges and the 
feature will fail.

-Jeremiah

> On Mar 24, 2023, at 8:15 PM, Yifan Cai  wrote:
> 
> Hi Jeremiah, 
> 
> There are good reasons to not have these inside Cassandra. Consider the 
> following.
> - Resources isolation. Having the said service running within the same JVM 
> may negatively impact Cassandra storage's performance. It could be more 
> beneficial to have them in Sidecar, which offers strong resource isolation 
> guarantees.
> - Availability. If the Cassandra cluster is being bounced, using sidecar 
> would not affect the SBR/SBW functionality, e.g. SBR can still read SSTables 
> via sidecar endpoints. 
> - Compatibility. Sidecar provides stable REST-based APIs, such as uploading 
> SSTables endpoint, which would remain compatible with different versions of 
> Cassandra. The current implementation supports versions 3.0 and 4.0.
> - Complexity. Considering the existence of the Sidecar project, it would be 
> less complex to avoid adding another (http?) service in Cassandra.
> - Release velocity. Sidecar, as an independent project, can have a quicker 
> release cycle from Cassandra. 
> - The features in sidecar are mostly implemented based on various existing 
> tools/APIs exposed from Cassandra, e.g. ring, commit sstable, snapshot, etc.
> 
> Regarding authentication and authorization
> - We will add it as a follow-on CEP in Sidecar, but we don't want to hold up 
> this CEP. It would be a feature that benefits all Sidecar endpoints.
> 
> - Yifan
> 
>> On Fri, Mar 24, 2023 at 2:43 PM Doug Rohrer wrote:
>> I agree that the analytics library will need to support vnodes. To be clear, 
>> there’s nothing preventing the solution from working with vnodes right now, 
>> and no assumptions about a 1:1 topology between a token and a node. However, 
>> we don’t, today, have the ability to test vnode support end-to-end. We are 
>> working towards that, however, and should be able to remove the caveat from 
>> the released analytics library once we can properly test vnode support.
>> If it helps, I can update the CEP to say something more like “Caveat: 
>> Currently untested with vnodes - work is ongoing to remove this limitation” 
>> if that helps?
>> 
>> Doug
>> 
>> > On Mar 24, 2023, at 11:43 AM, Brandon Williams wrote:
>> > 
>> > On Fri, Mar 24, 2023 at 10:39 AM Jeremiah D Jordan wrote:
>> >> 
>> >> I have concerns with the majority of this being in the sidecar and not in 
>> >> the database itself.  I think it would make sense for the server side of 
>> >&

Re: [DISCUSS] CEP-28: Reading and Writing Cassandra Data with Spark Bulk Analytics

2023-03-28 Thread Jeremiah D Jordan


>> Given the sidecar is running on the same node as the main C* process, the 
>> only real resource isolation you have is in heap/GC?  CPU/Memory/IO are all 
>> still shared between the main C* process and the side car, and coordinating 
>> those across processes is harder than coordinating them within a single 
>> process. For example if we wanted to have the compaction throughput, 
>> streaming throughput, and analytics read throughput all tied back to a 
>> single disk IO cap, that is harder with an external process.
> 
> Relatively trivial, for CPU and memory, to run them in different 
> containers/cgroups/etc, so you can put an exact cpu/memory limit on the 
> sidecar. That's different from a jmx rate limiter/throttle, but (arguably) 
> more precise, because it actually limits the underlying physical resource 
> instead of a proxy for it in a config setting. 
> 

If we want to bring cgroups/containers/etc into the default deployment 
mechanisms of C*, great.  I am all for dividing it up into microservices, given 
we solve all the problems I listed in the complexity section.

I am actually all for dividing C* up into multiple microservices, but the 
project needs to buy in to containers as the default mechanism for running it 
for that to be viable in my mind.

>  
>> 
>>> - Complexity. Considering the existence of the Sidecar project, it would be 
>>> less complex to avoid adding another (http?) service in Cassandra.
>> 
>> Not sure that is really very complex, running an http service is a pretty 
>> easy?  We already have netty in use to instantiate one from.
>> I worry more about the complexity of having the matching schema for a set of 
>> sstables being read.  The complexity of new sstable versions/formats being 
>> introduced.  The complexity of having up to date data from memtables being 
>> considered by this API without having to flush before every query of it.  
>> The complexity of dealing with the new memtable API introduced in CEP-11.  
>> The complexity of coordinating compaction/streaming adding and removing 
>> files with these APIs reading them.  There are a lot of edge cases to 
>> consider for this external access to sstables that the main process 
>> considers itself the “owner” of.
>> 
>> All of this is not to say that I think separating things out into other 
>> processes/services is bad.  But I think we need to be very careful with how 
>> we do it, or end users will end up running into all the sharp edges and the 
>> feature will fail.
>> 
>> -Jeremiah
>> 
>>> On Mar 24, 2023, at 8:15 PM, Yifan Cai wrote:
>>> 
>>> Hi Jeremiah, 
>>> 
>>> There are good reasons to not have these inside Cassandra. Consider the 
>>> following.
>>> - Resources isolation. Having the said service running within the same JVM 
>>> may negatively impact Cassandra storage's performance. It could be more 
>>> beneficial to have them in Sidecar, which offers strong resource isolation 
>>> guarantees.
>>> - Availability. If the Cassandra cluster is being bounced, using sidecar 
>>> would not affect the SBR/SBW functionality, e.g. SBR can still read 
>>> SSTables via sidecar endpoints. 
>>> - Compatibility. Sidecar provides stable REST-based APIs, such as uploading 
>>> SSTables endpoint, which would remain compatible with different versions of 
>>> Cassandra. The current implementation supports versions 3.0 and 4.0.
>>> - Complexity. Considering the existence of the Sidecar project, it would be 
>>> less complex to avoid adding another (http?) service in Cassandra.
>>> - Release velocity. Sidecar, as an independent project, can have a quicker 
>>> release cycle from Cassandra. 
>>> - The features in sidecar are mostly implemented based on various existing 
>>> tools/APIs exposed from Cassandra, e.g. ring, commit sstable, snapshot, etc.
>>> 
>>> Regarding authentication and authorization
>>> - We will add it as a follow-on CEP in Sidecar, but we don't want to hold 
>>> up this CEP. It would be a feature that benefits all Sidecar endpoints.
>>> 
>>> - Yifan
>>> 
>>> On Fri, Mar 24, 2023 at 2:43 PM Doug Rohrer wrote:
>>>> I agree that the analytics library will need to support vnodes. To be 
>>>> clear, there’s nothing preventing the solution from working with vnodes 
>>>> right now, and no assumptions about a 1:1 topology between a token and a 
>>>> node. However, we don’t, today, hav

Re: [DISCUSS] CEP-28: Reading and Writing Cassandra Data with Spark Bulk Analytics

2023-03-28 Thread Jeremiah D Jordan
> One of the explicit goals of making an official sidecar project was to
> try to make it something the project does not break compatibility with
> as one of the main issues the third-party sidecars (that handle
> distributed control, backup, repair, etc ...) have is they break
> constantly because C* breaks the control interfaces (JMX and config
> files in particular) constantly. If it helps with the mental model,
> maybe think of the Cassandra sidecar as part of the Cassandra
> distribution and we try not to break the distribution? Just like we
> can't break CQL and break the CQL client ecosystem, we hopefully don't
> break control interfaces of the sidecar either.

Do we have tests which enforce this?  I agree we said we won’t break stuff, but 
agreeing to something and actually doing it are different things.  We have for 
years said “we won’t break interface X in a patch release”, but we always end 
up doing it if there is no test enforcing the contract with a comment saying 
not to break it.  Without such guards a contributor who has no clue about 
“what we said” changes it, and the reviewer misses it (and possibly also 
doesn’t know/remember “what we said” because we said it 3 years back)…

This is not impossible, we just need to make sure that we are pro-active about 
marking such things.  Maybe the answer is “running the sidecar integration 
tests” as part of C* patch CI?
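As a rough sketch of the kind of guard I mean — a test that pins an interface the 
sidecar depends on.  The interface below is a made-up stand-in, not a real 
Cassandra or sidecar API; a real guard would point at the actual JMX, virtual 
table, or config surface:

import static org.junit.Assert.assertEquals;

import java.lang.reflect.Method;
import org.junit.Test;

// Hypothetical contract test: fails the build if someone removes or renames a
// method the sidecar relies on.
public class SidecarContractTest
{
    interface ExampleSidecarFacingMBean // stand-in for a real management interface
    {
        String getReleaseVersion();
        void takeSnapshot(String tag, String... keyspaces);
    }

    // DO NOT change this contract without coordinating with the sidecar project.
    @Test
    public void sidecarFacingMethodsStillExist() throws Exception
    {
        // getMethod throws NoSuchMethodException (failing the test) if the
        // method is removed or its signature changes.
        Method version = ExampleSidecarFacingMBean.class.getMethod("getReleaseVersion");
        assertEquals(String.class, version.getReturnType());

        Method snapshot = ExampleSidecarFacingMBean.class.getMethod("takeSnapshot", String.class, String[].class);
        assertEquals(void.class, snapshot.getReturnType());
    }
}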

> In addition to that, having
> this in a separate process gives us access to easy-to-use OS level
> protections over CPU time, memory, network, and disk via cgroups; as
> well as taking advantage of the existing isolation techniques kernels
> already offer to protect processes from each other e.g. CPU schedulers
> like CFS [1], network qdiscs like tc-fq/tc-prio[2, 3], and io
> schedulers like kyber/bfq [4].

How do we get this tuning to be part of the default install for all users of C* 
+ sidecar?



Re: [POLL] Vector type for ML

2023-05-03 Thread Jeremiah D Jordan
> To be clear, I support the general agreement David and Jonathan seem to have 
> reached.

+1 as well.


> On May 3, 2023, at 3:07 PM, Caleb Rackliffe  wrote:
> 
> To be clear, I support the general agreement David and Jonathan seem to have 
> reached.
> 
>> On Wed, May 3, 2023 at 3:05 PM Caleb Rackliffe wrote:
>> Did we agree on a CQL syntax?
>> 
>>> On Wed, May 3, 2023 at 2:06 PM Rahul Xavier Singh wrote:
>>> I like this approach. Thank you for those working on this vector search 
>>> initiative. 
>>> 
>>> Here's the feedback from my "user" hat for someone who is looking at 
>>> databases / indexes for my next LLM app. 
>>> 
>>> Can I take some python code and go from using an in memory vector store 
>>> like numpy or FAISS to something else? How easy is it for me to take my 
>>> python code and get it to work with this new external service which is no 
>>> longer just a library?
>>> There's also tons of services that I can run on docker e.g. milvus, 
>>> redissearch, typesense, elasticsearch, opensearch and I may hit a hurdle 
>>> when trying to do a lot more data, so I look at Cassandra Vector Search. 
>>> Because I am familiar with SQL , Cassandra looks appealing since I can 
>>> potentially use "cql_agent" lib ( to be created for langchain and we're 
>>> looking into that now) or an existing CassandraVectorStore class?
>>> 
>>> In most of these scenarios, if people are using langchain, llamaindex, the 
>>> underlying implementation is not as important since we shield the user from 
>>> CQL data types except at schema creation and most of this libs can be 
>>> opinionated and just suggest a generic schema. 
>>> 
>>> The ideal world is if I can just dump text into a field and do a natural 
>>> language query on it and have my DB do the embeddings for the document, and 
>>> then for the query for me. For now the libs can manage all that and they do 
>>> that well. We just need the interface to stay consistent and be relatively 
>>> easy to query in CQL. The most popular index in LLM retrieval augmented 
>>> patterns is pinecone. You make an index, you upsert, and then you query. 
>>> It's not assumed that you are also giving it content, though you can send 
>>> it metadata to have the document there. 
>>> 
>>> If we can have a similar workflow e.g. create a table with a vector type OR 
>>> create a table with an existing type and then add an index to it, no one is 
>>> going to sleep over it as long as it works. Having the ability to take a 
>>> table that has data, and then add a vector index doesn't make it any 
>>> different than adding a new field since I've got to calculate the 
>>> embeddings anyways. 
>>> 
>>> Would love to see how the CQL ends up looking like. 
>>> Rahul Singh
>>> Chief Executive Officer | Business Platform Architect
>>> m: 202.905.2818  e: rahul.si...@anant.us  li: http://linkedin.com/in/xingh  ca: http://calendly.com/xingh
>>> 
>>> We create, support, and manage real-time global data & analytics platforms 
>>> for the modern enterprise.
>>> 
>>> Anant | https://anant.us
>>> 
>>> 3 Washington Circle, Suite 301
>>> Washington, D.C. 20037
>>> 
>>> http://Cassandra.Link : The best resources for Apache Cassandra
>>> 
>>> 
>>> On Tue, May 2, 2023 at 6:39 PM Patrick McFadin wrote:
 \o/
 
 Bring it in team. Group hug. 
 
 Now if you'll excuse me, I'm going to go build my preso on how Cassandra 
 is the only distributed database you can do vector search in an ACID 
 transaction. 
 
 Patrick
 
 On Tue, May 2, 2023 at 3:27 PM Jonathan Ellis wrote:
> I had a call with David.  We agreed that we want a "vector" data type 
> with these properties
> 
> - Fixed length
> - No nulls
> - Random access not supported
> 
> Where we disagreed was on my proposal to restrict vectors to only numeric 
> data.  David's points were that
> 
> (1) He has a use case today for a data type with the other vector 
> properties,
> (2) It doesn't seem reasonable to create two data types with the same 
> properties, one of which is restricted to numerics, and
> (3) The restrictions that I want

Re: [DISCUSS] The future of CREATE INDEX

2023-05-09 Thread Jeremiah D Jordan
If the consensus is that SAI is the right default index, then we should just 
change CREATE INDEX to create an SAI index, and make legacy 2i available via 
CREATE CUSTOM INDEX.
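Concretely, something like the sketch below (the 'legacy' name is a hypothetical 
alias for the existing 2i implementation class, not settled syntax):

    -- A plain CREATE INDEX would build an SAI index:
    CREATE INDEX my_index ON ks.tbl (my_text_col);

    -- Anyone who still wants the legacy 2i would have to ask for it explicitly:
    CREATE CUSTOM INDEX my_legacy_index ON ks.tbl (my_text_col) USING 'legacy';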


> On May 9, 2023, at 4:44 PM, Caleb Rackliffe  wrote:
> 
> Earlier today, Mick started a thread on the future of our index creation DDL 
> on Slack:
> 
> https://the-asf.slack.com/archives/C018YGVCHMZ/p1683527794220019
> 
> At the moment, there are two ways to create a secondary index.
> 
> 1.) CREATE INDEX [IF NOT EXISTS] [name] ON  ()
> 
> This creates an optionally named legacy 2i on the provided table and column.
> 
> ex. CREATE INDEX my_index ON kd.tbl(my_text_col)
> 
> 2.) CREATE CUSTOM INDEX [IF NOT EXISTS] [name] ON  () USING 
>  [WITH OPTIONS = ]
> 
> This creates a secondary index on the provided table and column using the 
> specified 2i implementation class and (optional) parameters.
> 
> ex. CREATE CUSTOM INDEX my_index ON ks.tbl(my_text_col) USING 
> 'StorageAttachedIndex'
> 
> (Note that the work on SAI added aliasing, so `StorageAttachedIndex` is 
> shorthand for the fully-qualified class name, which is also valid.)
> 
> So what is there to discuss?
> 
> The concern Mick raised is...
> 
> "...just folk continuing to use CREATE INDEX  because they think CREATE 
> CUSTOM INDEX is advanced (or just don't know of it), and we leave users doing 
> 2i (when they think they are, and/or we definitely want them to be, using 
> SAI)"
> 
> To paraphrase, we want people to use SAI once it's available where possible, 
> and the default behavior of CREATE INDEX could be at odds w/ that.
> 
> The proposal we seem to have landed on is something like the following:
> 
> For 5.0:
> 
> 1.) Disable by default the creation of new legacy 2i via CREATE INDEX.
> 2.) Leave CREATE CUSTOM INDEX...USING... available by default.
> 
> (Note: How this would interact w/ the existing secondary_indexes_enabled YAML 
> options isn't clear yet.)
> 
> Post-5.0:
> 
> 1.) Deprecate and eventually remove SASI when SAI hits full feature parity w/ 
> it.
> 2.) Replace both CREATE INDEX and CREATE CUSTOM INDEX w/ something of a 
> hybrid between the two. For example, CREATE INDEX...USING...WITH. This would 
> both be flexible enough to accommodate index implementation selection and 
> prescriptive enough to force the user to make a decision (and wouldn't change 
> the legacy behavior of the existing CREATE INDEX). In this world, creating a 
> legacy 2i might look something like CREATE INDEX...USING `legacy`.
> 3.) Eventually deprecate CREATE CUSTOM INDEX...USING.
> 
> Eventually we would have a single enabled DDL statement for index creation 
> that would be minimal but also explicit/able to handle some evolution.
> 
> What does everyone think?



Re: [DISCUSS] The future of CREATE INDEX

2023-05-09 Thread Jeremiah D Jordan
> If we assume SAI is what we should use by default for the cluster, would it 
> make sense to allow
> 
> CREATE INDEX [IF NOT EXISTS] [name] ON  ()
> 
> But use a new yaml config that switches from legacy to SAI?
> 
> default_2i_impl: sai
> 
> For 5.0 we can default to “legacy” (new features disabled by default), but 
> allow operators to change this to SAI if they desire?

We have server-side DESCRIBE now, so if DESCRIBE always shows every index as a 
CUSTOM INDEX (or some new syntax that specifies the index type), then we could 
definitely go with this “pick the default in the yaml” approach.  DESCRIBE would 
always be explicit about which index implementation was in use for which column, 
so backup/restore would work no matter what the default was.

I like this idea.
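A rough sketch of that behavior (the default_2i_impl key comes from David’s 
example below and is hypothetical; DESCRIBE output abridged):

    -- With the yaml default pointed at SAI, a user writes:
    CREATE INDEX my_index ON ks.tbl (my_text_col);

    -- ...but DESCRIBE always spells out the implementation that was actually chosen:
    CREATE CUSTOM INDEX my_index ON ks.tbl (my_text_col) USING 'StorageAttachedIndex';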

> On May 9, 2023, at 5:11 PM, David Capwell  wrote:
> 
> If we assume SAI is what we should use by default for the cluster, would it 
> make sense to allow
> 
> CREATE INDEX [IF NOT EXISTS] [name] ON  ()
> 
> But use a new yaml config that switches from legacy to SAI?
> 
> default_2i_impl: sai
> 
> For 5.0 we can default to “legacy” (new features disabled by default), but 
> allow operators to change this to SAI if they desire?
> 
>> 2.) Leave CREATE CUSTOM INDEX...USING... available by default.
> 
> For 5.0, I would argue all indexes should be disabled by default and require 
> operators to allow… I am totally cool with a new allow list to allow some 
> impl..
> 
> secondary_indexes_enabled: false
> secondary_indexes_impl_allowed: [] # default, but could allow users to do 
> [’sai’] if they wish to allow sai… this does have weird semantics as it 
> causes _enabled to be ignored… this could also replace _enabled, but what is 
> allowed in the true case isn’t 100% clear?  Maybe you need _enabled=true and 
> this allow list limits what is actually allowed (prob is way more clear)?
> 
> 
>> 2.) Replace both CREATE INDEX and CREATE CUSTOM INDEX w/ something of a 
>> hybrid between the two. For example, CREATE INDEX...USING...WITH. This would 
>> both be flexible enough to accommodate index implementation selection and 
>> prescriptive enough to force the user to make a decision (and wouldn't 
>> change the legacy behavior of the existing CREATE INDEX). In this world, 
>> creating a legacy 2i might look something like CREATE INDEX...USING `legacy`.
> 
> I do not mind a new syntax that tries to be more clear, but the “replace” is 
> what I would push back against… we should keep the 2 existing syntax and not 
> force users to migrate… we can logically merge the 3 syntaxes, but we should 
> not remove the 2 others.
> 
> CREATE INDEX - gets rewritten to CREATE INDEX… USING config.default_2i_imp
> CREATE CUSTOM INDEX` - gets rewritten to new using syntax
> 
>> 3.) Eventually deprecate CREATE CUSTOM INDEX…USING.
> 
> I don’t mind producing a warning telling users its best to use the new 
> syntax, but if its low effort for us to maintain, we should… and since this 
> can be rewritten to the new format in the parser, this should be low effort 
> to support, so we should?
> 
>> On May 9, 2023, at 2:44 PM, Caleb Rackliffe  wrote:
>> 
>> Earlier today, Mick started a thread on the future of our index creation DDL 
>> on Slack:
>> 
>> https://the-asf.slack.com/archives/C018YGVCHMZ/p1683527794220019
>> 
>> At the moment, there are two ways to create a secondary index.
>> 
>> 1.) CREATE INDEX [IF NOT EXISTS] [name] ON  ()
>> 
>> This creates an optionally named legacy 2i on the provided table and column.
>> 
>>ex. CREATE INDEX my_index ON kd.tbl(my_text_col)
>> 
>> 2.) CREATE CUSTOM INDEX [IF NOT EXISTS] [name] ON  () USING 
>>  [WITH OPTIONS = ]
>> 
>> This creates a secondary index on the provided table and column using the 
>> specified 2i implementation class and (optional) parameters.
>> 
>>ex. CREATE CUSTOM INDEX my_index ON ks.tbl(my_text_col) USING 
>> 'StorageAttachedIndex'
>> 
>> (Note that the work on SAI added aliasing, so `StorageAttachedIndex` is 
>> shorthand for the fully-qualified class name, which is also valid.)
>> 
>> So what is there to discuss?
>> 
>> The concern Mick raised is...
>> 
>> "...just folk continuing to use CREATE INDEX  because they think CREATE 
>> CUSTOM INDEX is advanced (or just don't know of it), and we leave users 
>> doing 2i (when they think they are, and/or we definitely want them to be, 
>> using SAI)"
>> 
>> To paraphrase, we want people to use SAI once it's available where possible, 
>> and the default behavior of CREATE INDEX could be at odds w/ that.
>> 
>> The proposal we seem to have landed on is something like the following:
>> 
>> For 5.0:
>> 
>> 1.) Disable by default the creation of new legacy 2i via CREATE INDEX.
>> 2.) Leave CREATE CUSTOM INDEX...USING... available by default.
>> 
>> (Note: How this would inte

Re: [VOTE] CEP-29 CQL NOT Operator

2023-05-10 Thread Jeremiah D Jordan
+1 nb

> On May 8, 2023, at 3:52 AM, Piotr Kołaczkowski  wrote:
> 
> Let's vote.
> 
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-29%3A+CQL+NOT+operator
> 
> Piotr Kołaczkowski
> e. pkola...@datastax.com
> w. www.datastax.com



Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread Jeremiah D Jordan
> [POLL] Centralize existing syntax or create new syntax?

1.) CREATE INDEX ... USING  WITH OPTIONS...


> [POLL] Should there be a default? (YES/NO)

YES


> [POLL] What do do with the default?

3.) YAML config to override default index (legacy 2i remains the default)

DESCRIBE should always show the full CREATE INDEX … statement with the index 
implementation specified, such that replaying the output of DESCRIBE will not 
depend on the default settings.  This is what we do right now for CREATE TABLE 
OPTIONS.  Things you don’t specify get a default, and that default may change 
between releases; DESCRIBE shows the full CREATE TABLE with all OPTIONS listed, 
so replaying DESCRIBE output does not pick up any defaults.
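As a sketch of the CREATE TABLE analogy (option names and default values are 
abridged here and vary by release):

    -- The user writes:
    CREATE TABLE ks.tbl (id int PRIMARY KEY, val text);

    -- ...but DESCRIBE TABLE ks.tbl replays every option explicitly, along the lines of:
    CREATE TABLE ks.tbl (
        id int PRIMARY KEY,
        val text
    ) WITH compaction = {'class': 'SizeTieredCompactionStrategy'}
      AND compression = {'class': 'LZ4Compressor'}
      AND gc_grace_seconds = 864000;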

I don’t agree with the sentiment that a yaml option overriding CQL is bad.  We 
have tons of local node yaml options that change how a given CQL query behaves.  
All of the guardrails, all of the auth settings, and tons of other things that 
should truly be in global configuration live in the C* yaml file, because we 
don’t have global configuration yet.  “Make sure you set these options the same 
on every node” is the only mechanism we have right now.  We shouldn’t limit what 
we allow to be configured just because we don’t have global config yet.

-Jeremiah

> On May 12, 2023, at 1:36 PM, Caleb Rackliffe  wrote:
> 
> [POLL] Centralize existing syntax or create new syntax?
> 
> 1.) CREATE INDEX ... USING  WITH OPTIONS...
> 2.) CREATE LOCAL INDEX ... USING ... WITH OPTIONS...  (same as 1, but adds 
> LOCAL keyword for clarity and separation from future GLOBAL indexes)
> 
> (In both cases, we deprecate w/ client warnings CREATE CUSTOM INDEX)
> 
> 
> [POLL] Should there be a default? (YES/NO)
> 
> 
> [POLL] What do do with the default?
> 
> 1.) Allow a default, and switch it to SAI (no configurables)
> 2.) Allow a default, and stay w/ the legacy 2i (no configurables)
> 3.) YAML config to override default index (legacy 2i remains the default)
> 4.) YAML config/guardrail to require index type selection (not required by 
> default)
> 
> On Fri, May 12, 2023 at 12:39 PM Mick Semb Wever wrote:
>>> 
>>> Given it seems most DBs have a default index (see Postgres, etc.), I tend 
>>> to lean toward having one, but that's me...
>> 
>>  
>> I'm for it too.  Would be nice to enforce the setting is globally uniform to 
>> avoid the per-node problem. Or add a keyspace option. 
>> 
>> For users replaying <5 DDLs this would just require they set the default 
>> index to 2i.
>> This is not a headache, it's a one-off action that can be clearly expressed 
>> in NEWS.
>> It acts as a deprecation warning too.
>> This prevents new uneducated users from creating the unintended index, it 
>> supports existing users, and it does not present SAI as the battle-tested 
>> default.
>> 
>> Agree with the poll, there's a number of different PoVs here already.  I'm 
>> not fond of the LOCAL addition,  I appreciate what it informs, but it's just 
>> not important enough IMHO (folk should be reading up on the index type).



Re: [DISCUSS] Feature branch version hygiene

2023-05-18 Thread Jeremiah D Jordan
So what do we do with tickets merged to a feature branch in this model?  Do they 
stay on 5.0-target after they close, and then move to 5.0.0 when the epic is 
merged and closed?

> On May 18, 2023, at 9:33 AM, Josh McKenzie  wrote:
> 
>> My mental model, though, is that anything that’s not a concrete release 
>> number is a target version. Which is where 5.0 goes wrong - it’s not a 
>> release so it should be a target, but for some reason we use it as a 
>> placeholder to park work arriving in 5.0.0.
> Ahhh.
> 
>> So tickets go to 5.0-target if they target 5.0, and to 5.0.0 once they are 
>> resolved (with additional labels as necessary)
> Adding -target would definitely make things more clear. If we moved to "5.0 
> == unreleased, always move to something on commit" then you still have to 
> find some external source to figure out what's going on w/our versioning.
> 
> I like 5.0-target. Easy to query for "FixVersion = 5.0-target AND type != 
> 'Bug'" to find stragglers after GA is cut to move to 5.1-target.
> 
> Still have the "update children FixVersion for feature branch when branch is 
> merged" bit but that's not so onerous.
> 
> On Thu, May 18, 2023, at 10:28 AM, Benedict wrote:
>> 
>> The .x approach only breaks down for unreleased majors, for which all of our 
>> intuitions breakdown and we rehash it every year.
>> 
>> My mental model, though, is that anything that’s not a concrete release 
>> number is a target version. Which is where 5.0 goes wrong - it’s not a 
>> release so it should be a target, but for some reason we use it as a 
>> placeholder to park work arriving in 5.0.0.
>> 
>> If we instead use 5.0.0 for this purpose, we just need to get 5.0-alpha1 
>> labels added when those releases are cut.
>> 
>> Then I propose we break the confusion in both directions by scrapping 5.0 
>> entirely and introducing 5.0-target.
>> 
>> So tickets go to 5.0-target if they target 5.0, and to 5.0.0 once they are 
>> resolved (with additional labels as necessary)
>> 
>> Simples?
>> 
>>> On 18 May 2023, at 15:21, Josh McKenzie  wrote:
>>> 
 My personal view is that 5.0 should not be used for any resolved tickets - 
 they should go to 5.0-alpha1, since this is the correct release for them. 
 5.0 can then be the target version, which makes more sense given it isn’t 
 a concrete release.
>>> Well now you're just opening Pandora's box about our strange idioms with 
>>> FixVersion usage. ;)
>>> 
 every ticket targeting 5.0 could use fixVersion 5.0.x, since it is pretty 
 clear what this means.
>>> I think this diverges from our current paradigm where "5.x" == next feature 
>>> release, "5.0.x" == next patch release (i.e. bugfix only). Not to say it's 
>>> bad, just an adjustment... which if we're open to adjustment...
>>> 
>>> I'm receptive to transitioning the discussion to that either on this thread 
>>> or another; IMO we remain in a strange and convoluted place with our 
>>> FixVersioning. My understanding of our current practice:
>>> .x is used to denote target version. For example: 5.x, 5.0.x, 5.1.x, 4.1.x
>>> When a ticket is committed, the FixVersion is transitioned to resolve the X 
>>> to the next unreleased version in which it'll release
>>> Weird Things are done to make this work for the release process and release 
>>> manager on feature releases (alpha, beta, etc)
>>> There's no clear fit for feature branch tickets in the above schema
>>> 
>>> And if I take what I think you're proposing here and extrapolate it out:
>>> .0 is used to denote target version. For example: 5.0. 5.0.0. 5.1.0. 4.1.0
>>> This appears to break down for patch releases: we _do_ release .0 versions 
>>> of them rather than alpha/beta/etc, so a ticket targeting 4.1.0 would 
>>> initially mean 2 different things based on resolved vs. unresolved status 
>>> (resolved == in release, unresolved == targeting next unreleased) and that 
>>> distinction would disappear on resolution (i.e. resolved + 4.1.0 would no 
>>> longer definitively mean "contained in .0 release")
>>> When a release is cut, we bulk update FixVersion ending in .0 to the 
>>> release version in which they're contained (not clear how to disambiguate 
>>> the things from the above bullet point)
>>> For feature releases, .0 will transition to -alpha1
>>> One possible solution would be to just no longer release a .0 version of 
>>> things and reserve .0 to indicate "parked". I don't particularly like that 
>>> but it's not the worst.
>>> 
>>> Another possible solution would be to just scrap this approach entirely and 
>>> go with:
>>> FixVersion on unreleased _and still advocated for tickets_ always targets 
>>> the next unreleased version. For other tickets where nobody is advocating 
>>> for their work / inclusion, we either FixVersion "Backlog" or close as 
>>> "Later"
>>> When a release is cut, roll all unresolved tickets w/that FixVersion to the 
>>> next unreleased FixVersion
>>> When we're gearing up to a release, we can do a broad pass on everyth

Re: Vector search demo, and query syntax

2023-05-23 Thread Jeremiah D Jordan
At first I wasn’t sure about using ORDER BY, but the more I think about what is 
actually going on, the more I think it does make sense.

This also matches up with some ideas that have been floating around about being 
able to ORDER BY a sorted SAI index.
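As a purely hypothetical sketch of where that could go (ORDER BY over non-vector 
SAI columns was only an idea at this point, not an implemented feature):

    SELECT id, title, price
    FROM ks.products
    WHERE category = 'books'   -- filtered via an SAI index on category
    ORDER BY price ASC         -- would be served by a sorted SAI index on price
    LIMIT 10;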

-Jeremiah

> On May 22, 2023, at 2:28 PM, Jonathan Ellis  wrote:
> 
> Hi all,
> 
> I have a branch of vector search based on cep-7-sai at 
> https://github.com/datastax/cassandra/tree/cep-vsearch. Compared to the 
> original POC branch, this one is based on the SAI code that will be mainline 
> soon, and handles distributed scatter/gather.  Updates and deletes to vector 
> values are still not supported.
> 
> I also put together a demo that uses this branch to provide context to 
> OpenAI’s GPT, available here: https://github.com/jbellis/cassgpt.  
> 
> Here is the query that gets executed:
> 
> SELECT id, start, end, text 
> FROM {self.keyspace}.{self.table} 
> WHERE embedding ANN OF %s 
> LIMIT %s
> 
> The more I used the proposed `ANN OF` syntax, the less I liked it.  This is 
> because we don’t want an actual boolean predicate; we just want to order 
> results.  Put another way, `ANN OF` will include all rows of the table given 
> a high enough `LIMIT`, and that makes it a bad fit for expression processing 
> that expects to be able to filter out rows before it starts LIMIT-ing.  And 
> in fact the code to support executing the query looks suspiciously like what 
> you’d want for `ORDER BY`.
> 
> I propose that we adopt `ORDER BY` syntax, supporting it for vector indexes 
> first and eventually for all SAI indexes.  So this query would become
> 
> SELECT id, start, end, text 
> FROM {self.keyspace}.{self.table} 
> ORDER BY embedding ANN OF %s 
> LIMIT %s
> 
> And it would compose with other SAI indexes with syntax like
> 
> SELECT id, start, end, text 
> FROM {self.keyspace}.{self.table} 
> WHERE publish_date > %s
> ORDER BY embedding ANN OF %s 
> LIMIT %s
> 
> Related work:
> 
> This is similar to the approach used by pgvector, except they invented the 
> symbolic operator `<->` that has the same semantics as `ANN OF`.  I am okay 
> with adopting their operator, but I think ANN OF is more readable.
> 
> -- 
> Jonathan Ellis
> co-founder, http://www.datastax.com 
> @spyced



Re: Evolving the client protocol

2018-04-20 Thread Jeremiah D Jordan
The protocol already supports optional/custom payloads for doing such things.  
IIRC the Zipkin tracing implementation 
(https://github.com/thelastpickle/cassandra-zipkin-tracing), for example, uses 
this to pass the Zipkin id to the server.

> On Apr 20, 2018, at 1:02 PM, Max C.  wrote:
> 
> For things like #3, would it be a better idea to propose a generic 
> enhancement for “optional vendor extensions” to the protocol?  These 
> extensions would be negotiated during connection formation and then the 
> driver could (optionally) implement these additional features.  These 
> extensions would be documented separately by the vendor, and the driver’s 
> default behavior would be to ignore any extensions it doesn’t understand.
> 
> With that sort of feature, the Scylla folks (CosmoDB too??) could add 
> extensions to the protocol without forking the protocol spec, (potentially) 
> without forking the drivers, and without laying down a C* roadmap that the C* 
> project hasn’t agreed to.  Someday down the line, if C* implements a given 
> capability, then the corresponding “vendor extension” could be incorporated 
> into the main protocol spec… or not.
> 
> Lots and lots of protocols implement this type of technique — SMTP, IMAP, 
> PNG, Sieve, DHCP, etc.   Maybe this a better way to go?
> 
> - Max
> 





Re: Planning to port cqlsh to Python 3 (CASSANDRA-10190)

2018-06-01 Thread Jeremiah D Jordan
The community of people doing Python development and the community of people 
running Cassandra servers are not the same.  I am not fine riding the coattails 
of libraries used in Python development.  As others have stated, we need to 
follow the lead of the OS vendors that people will be deploying Cassandra on top 
of, and those will not be dropping Python 2 at the end of the year.

-Jeremiah

> On Jun 1, 2018, at 12:37 PM, Jonathan Haddad  wrote:
> 
> Both can work.  I did a lot of the work on the port of the Python
> driver's object mapper (formerly cqlengine) to Python 3.  It's
> reasonably straightforward if you use the six library.
> 
> Both pandas and numpy are dropping support for Python 2 at the end of
> this year.  I'm fine with riding on their coattails.
> On Fri, Jun 1, 2018 at 9:21 AM Russell Bateman  wrote:
>> 
>> Support for, but not the very script, right? Because, as gently pointed
>> out by several realists here, Python 2 is far from dead and arguably
>> still the majority usage. That's only just now beginning to change. I
>> think it will be more than 2 years before people begin asking what
>> Python 2 was.
>> 
>> 
>> On 06/01/2018 10:10 AM, Jonathan Haddad wrote:
>>> Supporting both as a next step is logical, removing support for 2 in the
>>> next year or two seems reasonable enough. Gotta rip the band aid off at
>>> some point.
>>> 
>>> On Fri, Jun 1, 2018 at 2:34 AM Michael Burman  wrote:
>>> 
 Hi,
 
 Deprecating in this context does not mean removing it or it being
 replaced by 3 (RHEL 7.x will remain with Python 2.x as default). It
 refers to future versions (>7), but there are none at this point. It
 appears Ubuntu has deviated from Debian in this sense, but Debian has
 not changed yet (likely Debian 10 will, but that's not out yet and has
 no announced release date).
 
 Thus, 2.x still remains the most used version for servers. And servers
 deployed at this point of time will use these versions for years.
 
- Micke
 
 
 On 06/01/2018 10:52 AM, Murukesh Mohanan wrote:
> On 2018/06/01 07:40:04, Michael Burman  wrote:
>> IIRC, there's no major distribution yet that defaults to Python 3 (I
>> think Ubuntu & Debian are still defaulting to Python 2 also). This will
>> happen eventually (maybe), but not yet. Discarding Python 2 support
>> would mean more base-OS work for most people wanting to run Cassandra
>> and that's not a positive thing.
>> 
> Ubuntu since 16.04 defaults to Python 3:
> 
>> Python2 is not installed anymore by default on the server, cloud and
 the touch images, long live Python3! Python3 itself has been upgraded to
 the 3.5 series. -
 https://wiki.ubuntu.com/XenialXerus/ReleaseNotes#Python_3
> RHEL 7.5 deprecates Python 2 (
 https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/7.5_release_notes/chap-red_hat_enterprise_linux-7.5_release_notes-deprecated_functionality
 ).
> 
> 
 --
>>> Jon Haddad
>>> http://www.rustyrazorblade.com
>>> twitter: rustyrazorblade
>>> 
>> 
> 
> 
> -- 
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade
> 





Re: GitHub PR ticket spam

2018-08-06 Thread Jeremiah D Jordan
Oh nice.  I like the idea of keeping it but moving it to the worklog tab.  +1 
on that from me.

> On Aug 6, 2018, at 5:34 AM, Stefan Podkowinski  wrote:
> 
> +1 for worklog option
> 
> Here's an example ticket from Arrow, where they seem to be using the
> same approach:
> https://issues.apache.org/jira/browse/ARROW-2583
>  
> 
> 
> 
> On 05.08.2018 09:56, Mick Semb Wever wrote:
>>> I find this a bit annoying while subscribed to commits@,
>>> especially since we created pr@ for these kind of messages. Also I don't
>>> really see any value in mirroring all github comments to the ticket.
>> 
>> 
>> I agree with you Stefan. It makes the jira tickets quite painful to read. 
>> And I tend to make comments on the commits rather than the PRs so to avoid 
>> spamming back to the jira ticket.
>> 
>> But the linking to the PR is invaluable. And I can see Ariel's point about a 
>> chronological historical archive.
>> 
>> 
>>> Ponies would be for this to be mirrored to a tab 
>>> separate from comments in JIRA.
>> 
>> 
>> Ariel, that would be the the "worklog" option.
>> https://reference.apache.org/pmc/github
>>  
>> 
>> 
>> If this works for you, and others, I can open a INFRA to switch to worklog.
>> wdyt?
>> 
>> 
>> Mick.
>> 


Re: Proposing an Apache Cassandra Management process

2018-08-17 Thread Jeremiah D Jordan
Not sure why the two things being in the same repo means they need the same 
release process.  You can always do interim releases of the management artifact 
between server releases, or even have completely decoupled releases.

-Jeremiah

> On Aug 17, 2018, at 10:52 AM, Blake Eggleston  wrote:
> 
> I'd be more in favor of making it a separate project, basically for all the 
> reasons listed below. I'm assuming we'd want a management process to work 
> across different versions, which will be more awkward if it's in tree. Even 
> if that's not the case, keeping it in a different repo at this point will 
> make iteration easier than if it were in tree. I'd imagine (or at least hope) 
> that validating the management process for release would be less difficult 
> than the main project, so tying them to the Cassandra release cycle seems 
> unnecessarily restrictive.
> 
> 
> On August 17, 2018 at 12:07:18 AM, Dinesh Joshi 
> (dinesh.jo...@yahoo.com.invalid) wrote:
> 
>> On Aug 16, 2018, at 9:27 PM, Sankalp Kohli  wrote: 
>> 
>> I am bumping this thread because patch has landed for this with repair 
>> functionality. 
>> 
>> I have a following proposal for this which I can put in the JIRA or doc 
>> 
>> 1. We should see if we can keep this in a separate repo like Dtest. 
> 
> This would imply a looser coupling between the two. Keeping things in-tree is 
> my preferred approach. It makes testing, dependency management and code 
> sharing easier. 
> 
>> 2. It should have its own release process. 
> 
> This means now there would be two releases that need to be managed and 
> coordinated. 
> 
>> 3. It should have integration tests for different versions of Cassandra it 
>> will support. 
> 
> Given the lack of test infrastructure - this will be hard especially if you 
> have to qualify a matrix of builds. 
> 
> Dinesh 





Re: Side Car New Repo vs not

2018-08-21 Thread Jeremiah D Jordan
I think the following is a very big plus of it being in tree:
>> * Faster iteration speed in general. For example when we need to add a
>> new
>> JMX endpoint that the sidecar needs, or change something from JMX to a
>> virtual table (e.g. for repair, or monitoring) we can do all changes
>> including tests as one commit within the main repository and don't have
>> to
>> commit to main repo, sidecar repo, 

I also don’t see a reason why the sidecar being in tree means it would not work 
in a mixed version cluster.  The nodes themselves must work in a mixed version 
cluster during a rolling upgrade, and I would expect any management sidecar to 
operate in the same manner, in tree or not.

This tool will be pretty tightly coupled with the server, and as someone with 
experience developing such tightly coupled tools, it is *much* easier to make 
sure you don’t accidentally break them if they are in tree.  How many times has 
someone updated some JMX interface, updated nodetool, and then moved on, 
breaking all the external tools not in tree without realizing it?  The above 
point about being able to modify interfaces and the sidecar in the same commit 
is huge in terms of making sure someone doesn’t inadvertently break the sidecar 
while fixing something else.

-Jeremiah


> On Aug 21, 2018, at 10:28 AM, Jonathan Haddad  wrote:
> 
> Strongly agree with Blake.  In my mind supporting multiple versions is
> mandatory.  As I've stated before, we already do it with Reaper, I'd
> consider it a major misstep if we couldn't support multiple with the
> project - provided admin tool.  It's the same reason dtests are separate -
> they work with multiple versions.
> 
> The number of repos does not affect distribution - if we want to ship
> Cassandra with the admin / repair tool (we should, imo), that can be part
> of the build process.
> 
> 
> 
> 
> On Mon, Aug 20, 2018 at 9:21 PM Blake Eggleston 
> wrote:
> 
>> If the sidecar is going to be on a different release cadence, or support
>> interacting with mixed mode clusters, then it should definitely be in a
>> separate repo. I don’t even know how branching and merging would work in a
>> repo that supports 2 separate release targets and/or mixed mode
>> compatibility, but I’m pretty sure it would be a mess.
>> 
>> As a cluster management tool, mixed mode is probably going to be a goal at
>> some point. As a new project, it will benefit from not being tied to the C*
>> release cycle (which would probably delay any sidecar release until
>> whenever 4.1 is cut).
>> 
>> 
>> On August 20, 2018 at 3:22:54 PM, Joseph Lynch (joe.e.ly...@gmail.com)
>> wrote:
>> 
>> I think that the pros of incubating the sidecar in tree as a tool first
>> outweigh the alternatives at this point of time. Rough tradeoffs that I
>> see:
>> 
>> Unique pros of in tree sidecar:
>> * Faster iteration speed in general. For example when we need to add a
>> new
>> JMX endpoint that the sidecar needs, or change something from JMX to a
>> virtual table (e.g. for repair, or monitoring) we can do all changes
>> including tests as one commit within the main repository and don't have
>> to
>> commit to main repo, sidecar repo, and dtest repo (juggling version
>> compatibility along the way).
>> * We can in the future more easily move serious background functionality
>> like compaction or repair itself (not repair scheduling, actual
>> repairing)
>> into the sidecar with a single atomic commit, we don't have to do two
>> phase
>> commits where we add some IPC mechanism to allow us to support it in
>> both,
>> then turn it on in the sidecar, then turn it off in the server, etc...
>> * I think that the verification is much easier (sounds like Jonathan
>> disagreed on the other thread, I could certainly be wrong), and we don't
>> have to worry about testing matrices to assure that the sidecar works
>> with
>> various versions as the version of the sidecar that is released with that
>> version of Cassandra is the only one we have to certify works. If people
>> want to pull in new versions or maintain backports they can do that at
>> their discretion/testing.
>> * We can iterate and prove value before committing to a choice. Since it
>> will be a separate artifact from the start we can always move the
>> artifact
>> to a separate repo later (but moving the other way is harder).
>> * Users will get the sidecar "for free" when they install the daemon,
>> they
>> don't need to take affirmative action to e.g. be able to restart their
>> cluster, run repair, or back their data up; it just comes out of the box
>> for free.
>> 
>> Unique pros of a separate repository sidecar:
>> * We can use a more modern build system like gradle instead of ant
>> * Merging changes is less "scary" I guess (I feel like if you're not
>> touching the daemon this is already true but I could see this being less
>> worrisome for some).
>> * Releasing a separate artifact is somewhat easier from a separate repo
>> (especially if we have gradle which m

Re: UDF

2018-09-11 Thread Jeremiah D Jordan
Be careful when pulling in source files from the DataStax Java Driver (or 
anywhere else) to make sure you respect its Apache License, Version 2.0, and 
keep all copyright notices, etc. with said files.

-Jeremiah

> On Sep 11, 2018, at 12:29 PM, Jeff Jirsa  wrote:
> 
> +1 as well.
> 
> On Tue, Sep 11, 2018 at 10:27 AM Aleksey Yeschenko 
> wrote:
> 
>> If this is about inclusion in 4.0, then I support it.
>> 
>> Technically this is *mostly* just a move+donation of some code from
>> java-driver to Cassandra. Given how important this seemingly is to the
>> board and PMC for us to not have the dependency on the driver, the sooner
>> it’s gone, the better.
>> 
>> I’d be +1 for committing to trunk.
>> 
>> —
>> AY
>> 
>> On 11 September 2018 at 14:43:29, Robert Stupp (sn...@snazy.de) wrote:
>> 
>> The patch is technically complete - i.e. it works and does its thing.
>> 
>> It's not strictly a bug fix but targets trunk. That's why I started the
>> discussion.
>> 
>> 
>> On 09/11/2018 02:53 PM, Jason Brown wrote:
>>> Hi Robert,
>>> 
>>> Thanks for taking on this work. Is this message a heads up that a patch
>> is
>>> coming/complete, or to spawn a discussion about including this in 4.0?
>>> 
>>> Thanks,
>>> 
>>> -Jason
>>> 
>>> On Tue, Sep 11, 2018 at 2:32 AM, Robert Stupp  wrote:
>>> 
 In an effort to clean up our hygiene and limit the dependencies used
>> by
 UDFs/UDAs, I think we should refactor the UDF code parts and remove
>> the
 dependency to the Java Driver in that area without breaking existing
 UDFs/UDAs.
 
 A working prototype is in this branch: 
 https://github.com/snazy/cassandra/tree/feature/remove-udf-driver-dep-trunk
 The changes are rather trivial and provide
>> 100%
 backwards compatibility for existing UDFs.
 
 The prototype copies the necessary parts from the Java Driver into the
>> C*
 source tree to org.apache.cassandra.cql3.functions.types and adopts
>> its
 usages - i.e. UDF/UDA code plus CQLSSTableWriter +
>> StressCQLSSTableWriter.
 The latter two classes have a reference to UDF’s UDHelper and had to
>> be
 changed as well.
 
 Some functionality, like type parsing & handling, is duplicated in the
 code base with this prototype - once in the “current” source tree and
>> once
 for UDFs. However, unifying the code paths is not trivial, since the
>> UDF
 sandbox prohibits the use of internal classes (direct and likely
>> indirect
 dependencies).
 
 Robert
 
 —
 Robert Stupp
 @snazy
 
 
>> 
>> --
>> Robert Stupp
>> @snazy
>> 
>> 
>> 





Re: [VOTE] Accept GoCQL driver donation and begin incubation process

2018-09-12 Thread Jeremiah D Jordan
+1

But I also think getting this through incubation might take a while/be 
impossible given how large the contributor list looks…

> On Sep 12, 2018, at 10:22 AM, Jeff Jirsa  wrote:
> 
> +1
> 
> (Incubation looks like it may be challenging to get acceptance from all 
> existing contributors, though)
> 
> -- 
> Jeff Jirsa
> 
> 
>> On Sep 12, 2018, at 8:12 AM, Nate McCall  wrote:
>> 
>> This will be the same process used for dtest. We will need to walk
>> this through the incubator per the process outlined here:
>> 
>> https://incubator.apache.org/guides/ip_clearance.html
>> 
>> Pending the outcome of this vote, we will create the JIRA issues for
>> tracking and after we go through the process, and discuss adding
>> committers in a separate thread (we need to do this atomically anyway
>> per general ASF committer adding processes).
>> 
>> Thanks,
>> -Nate
>> 





Re: [VOTE] Accept GoCQL driver donation and begin incubation process

2018-09-12 Thread Jeremiah D Jordan
No, doesn’t change it.  Any code donation has to go through the incubation 
process, which is where all the legal stuff about it being donated is handled.  
This would be like the dtest repo which was donated a little while back, and 
followed this same process.

-Jeremiah

> On Sep 12, 2018, at 3:05 PM, kurt greaves  wrote:
> 
> In the previous thread we seemed to come to the conclusion it would be
> under the same project with same committers/pmc. I don't know if sending it
> through incubation changes that?
> 
> On Wed., 12 Sep. 2018, 13:03 Jeremy Hanna, 
> wrote:
> 
>> I don’t know if others have this same question, but what does accepting
>> the gocql driver donation mean?  It becomes a full Apache project separate
>> from Cassandra and there exists a separate set of PMC members and such?  Or
>> does it become part of the Cassandra project itself?  From Sylvain and
>> Jon’s responses, it seems like it’s the latter.  I have some memories of
>> the Apache Extras and some things that lived in there for a time and those
>> were eventually phased out so I didn’t know if that applied to this
>> discussion as well.
>> 
>>> On Sep 12, 2018, at 10:12 AM, Nate McCall  wrote:
>>> 
>>> This will be the same process used for dtest. We will need to walk
>>> this through the incubator per the process outlined here:
>>> 
>>> https://incubator.apache.org/guides/ip_clearance.html
>>> 
>>> Pending the outcome of this vote, we will create the JIRA issues for
>>> tracking and after we go through the process, and discuss adding
>>> committers in a separate thread (we need to do this atomically anyway
>>> per general ASF committer adding processes).
>>> 
>>> Thanks,
>>> -Nate
>>> 
>> 





Re: [VOTE] Accept GoCQL driver donation and begin incubation process

2018-09-12 Thread Jeremiah D Jordan
My +1 was under the assumption that the current maintainer wanted to continue 
with the project and would be brought in to do that along with the code.

But my +1 also carries no formal weight in the ASF meritocracy, as I am neither 
a committer nor a PMC member ;).  It just expresses my opinion, as a 
long-standing member of the community, that it is fine to bring the current 
GoCQL driver and its main contributors into the Apache Cassandra project.  I was 
not asking others to do the work; I was assuming the current people doing the 
work on said driver would continue to work on it.

-Jeremiah

> On Sep 12, 2018, at 12:55 PM, Jonathan Haddad  wrote:
> 
> I'm +0, and I share the same concerns as Sylvain.
> 
> For those of you that have +1'ed, are you planning on contributing to the
> driver?  Docs, code, QA?  It's easy to throw a +1 down to make the driver
> the responsibility of the project if you're asking others to do the work.
> I vote this way because I already know I won't contribute to it and it's
> irresponsible to vote something in if I have no intent on helping maintain
> it.
> 
> On Wed, Sep 12, 2018 at 10:45 AM Sumanth Pasupuleti
>  wrote:
> 
>> +1
>> 
>> On Wed, Sep 12, 2018 at 10:37 AM Dinesh Joshi
>>  wrote:
>> 
>>> +1
>>> 
>>> Dinesh
>>> 
>>>> On Sep 12, 2018, at 10:23 AM, Jaydeep Chovatia <
>>> chovatia.jayd...@gmail.com> wrote:
>>>> 
>>>> +1
>>>> 
>>>> On Wed, Sep 12, 2018 at 10:00 AM Roopa Tangirala
>>>>  wrote:
>>>> 
>>>>> +1
>>>>> 
>>>>> 
>>>>> *Regards,*
>>>>> 
>>>>> *Roopa Tangirala*
>>>>> 
>>>>> Engineering Manager CDE
>>>>> 
>>>>> *(408) 438-3156 - mobile*
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On Wed, Sep 12, 2018 at 8:51 AM Sylvain Lebresne 
>>>>> wrote:
>>>>> 
>>>>>> -0
>>>>>> 
>>>>>> The project seems to have a hard time getting on top of reviewing his
>>>>>> backlog
>>>>>> of 'patch available' issues, so that I'm skeptical adopting more code
>>> to
>>>>>> maintain is the thing the project needs the most right now. Besides,
>>> I'm
>>>>>> also
>>>>>> generally skeptical that augmenting the scope of a project makes it
>>>>> better:
>>>>>> I feel
>>>>>> keeping this project focused on the core server is better. I see
>> risks
>>>>>> here, but
>>>>>> the upsides haven't been made very clear for me, even for end users:
>>> yes,
>>>>>> it
>>>>>> may provide a tiny bit more clarity around which Golang driver to
>>> choose
>>>>> by
>>>>>> default, but I'm not sure users are that lost, and I think there is
>>> other
>>>>>> ways to
>>>>>> solve that if we really want.
>>>>>> 
>>>>>> Anyway, I reckon I may be overly pessimistic here and it's not that
>>>>> strong
>>>>>> of
>>>>>> an objection if a large majority is on-board, so giving my opinion
>> but
>>>>> not
>>>>>> opposing.
>>>>>> 
>>>>>> --
>>>>>> Sylvain
>>>>>> 
>>>>>> 
>>>>>> On Wed, Sep 12, 2018 at 5:36 PM Jeremiah D Jordan <
>>>>>> jeremiah.jor...@gmail.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> +1
>>>>>>> 
>>>>>>> But I also think getting this through incubation might take a
>> while/be
>>>>>>> impossible given how large the contributor list looks…
>>>>>>> 
>>>>>>>> On Sep 12, 2018, at 10:22 AM, Jeff Jirsa  wrote:
>>>>>>>> 
>>>>>>>> +1
>>>>>>>> 
>>>>>>>> (Incubation looks like it may be challenging to get acceptance from
>>>>> all
>>>>>>> existing contributors, though)
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Jeff Jirsa
>>>>>>>> 
>>>>>>>

Re: Moving tickets out of 4.0 post freeze

2018-09-24 Thread Jeremiah D Jordan
So as to not confuse people, even if we never put out a 4.1, I think we should 
keep it 4.0.x, in line with 2.2.x, 3.0.x, and 3.11.x.  And yes, our 
major.minor.patch versioning of the past never followed semver.

-Jeremiah

> On Sep 24, 2018, at 11:45 AM, Benedict Elliott Smith  
> wrote:
> 
> I’d like to propose we don’t do Semver.  Back when we did this before, there 
> wasn’t any clear distinction between a major and a minor release.  They were 
> both infrequent, both big, and were treated as majors for EOL'ing support for 
> older releases.  This must surely have been confusing for users, and I’m not 
> sure what we got from it?
> 
> Why don’t we keep it simple, and just have major.patch?  So we would release 
> simply ‘4’ now, and the next feature release would be ‘5'.
> 
> 
> 
> 
>> On 24 Sep 2018, at 17:34, Michael Shuler  wrote:
>> 
>> On 9/24/18 7:09 AM, Joshua McKenzie wrote:
>>> I propose we move all new features and improvements to 4.0.x to keep the
>>> surface area of change for the major stable.
>> 
>> It occurs to me that we should probably update the version in trunk to
>> 4.0.0, if we're following semantic versions. I suppose this also means
>> all the tickets for 4.x should be updated to 4.0.x, 4.0 to 4.0.0, etc.
>> 
>> -- 
>> Michael
>> 
>> 
> 





Re: Deprecating/removing PropertyFileSnitch?

2018-10-16 Thread Jeremiah D Jordan
As long as we are correctly storing such things in the system tables and 
reading them out of the system tables when we do not have the information from 
gossip yet, it should not be a problem. (As far as I know GPFS does this, but I 
have not done extensive code diving or testing to make sure all edge cases are 
covered there)
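For what it’s worth, the locally persisted topology can be sanity-checked 
directly from the system tables; a quick check might look like the following 
(column names as of recent 3.x/4.x releases, worth verifying against the version 
in question):

    -- DC/rack for this node and for its peers, as last persisted locally:
    SELECT data_center, rack FROM system.local;
    SELECT peer, data_center, rack FROM system.peers;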

-Jeremiah

> On Oct 16, 2018, at 11:56 AM, sankalp kohli  wrote:
> 
> Will GossipingPropertyFileSnitch not be vulnerable to Gossip bugs where we
> lose hostId or some other fields when we restart C* for large
> clusters(~1000 instances)?
> 
> On Tue, Oct 16, 2018 at 7:59 AM Jeff Jirsa  wrote:
> 
>> We should, but the 4.0 features that log/reject verbs to invalid replicas
>> solves a lot of the concerns here
>> 
>> --
>> Jeff Jirsa
>> 
>> 
>>> On Oct 16, 2018, at 4:10 PM, Jeremy Hanna 
>> wrote:
>>> 
>>> We have had PropertyFileSnitch for a long time even though
>> GossipingPropertyFileSnitch is effectively a superset of what it offers and
>> is much less error prone.  There are some unexpected behaviors when things
>> aren’t configured correctly with PFS.  For example, if you replace nodes in
>> one DC and add those nodes to that DCs property files and not the other DCs
>> property files - the resulting problems aren’t very straightforward to
>> troubleshoot.
>>> 
>>> We could try to improve the resilience and fail fast error checking and
>> error reporting of PFS, but honestly, why wouldn’t we deprecate and remove
>> PropertyFileSnitch?  Are there reasons why GPFS wouldn’t be sufficient to
>> replace it?
>> 
>> 





Re: Deprecating/removing PropertyFileSnitch?

2018-10-22 Thread Jeremiah D Jordan
S is the ability to easily package it
>>>>> across multiple nodes, as pointed out by Sean Durity on CASSANDRA-10745
>>>>> (which is also it's Achilles' heel). To keep this ability, we could
>> make
>>>>> GPFS compatible with the cassandra-topology.properties file, but
>> reading
>>>>> only the dc/rack info about the local node.
>>>>> 
>>>>> Em seg, 22 de out de 2018 às 16:58, sankalp kohli <
>> kohlisank...@gmail.com>
>>>>> escreveu:
>>>>> 
>>>>>> Yes it will happen. I am worried that same way DC or rack info can go
>>>>>> missing.
>>>>>> 
>>>>>> On Mon, Oct 22, 2018 at 12:52 PM Paulo Motta <
>> pauloricard...@gmail.com>
>>>>>> wrote:
>>>>>> 
>>>>>>>> the new host won’t learn about the host whose status is missing and
>>>>> the
>>>>>>> view of this host will be wrong.
>>>>>>> 
>>>>>>> Won't this happen even with PropertyFileSnitch as the token(s) for
>> this
>>>>>>> host will be missing from gossip/system.peers?
>>>>>>> 
>>>>>>> Em sáb, 20 de out de 2018 às 00:34, Sankalp Kohli <
>>>>>> kohlisank...@gmail.com>
>>>>>>> escreveu:
>>>>>>> 
>>>>>>>> Say you restarted all instances in the cluster and status for some
>>>>> host
>>>>>>>> goes missing. Now when you start a host replacement, the new host
>>>>> won’t
>>>>>>>> learn about the host whose status is missing and the view of this
>>>>> host
>>>>>>> will
>>>>>>>> be wrong.
>>>>>>>> 
>>>>>>>> PS: I will be happy to be proved wrong as I can also start using
>>>>> Gossip
>>>>>>>> snitch :)
>>>>>>>> 
>>>>>>>>> On Oct 19, 2018, at 2:41 PM, Jeremy Hanna <
>>>>>> jeremy.hanna1...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>> Do you mean to say that during host replacement there may be a time
>>>>>>> when
>>>>>>>> the old->new host isn’t fully propagated and therefore wouldn’t yet
>>>>> be
>>>>>> in
>>>>>>>> all system tables?
>>>>>>>>> 
>>>>>>>>>> On Oct 17, 2018, at 4:20 PM, sankalp kohli <
>>>>> kohlisank...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>> This is not the case during host replacement correct?
>>>>>>>>>> 
>>>>>>>>>> On Tue, Oct 16, 2018 at 10:04 AM Jeremiah D Jordan <
>>>>>>>>>> jeremiah.jor...@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>>> As long as we are correctly storing such things in the system
>>>>>> tables
>>>>>>>> and
>>>>>>>>>>> reading them out of the system tables when we do not have the
>>>>>>>> information
>>>>>>>>>>> from gossip yet, it should not be a problem. (As far as I know
>>>>> GPFS
>>>>>>>> does
>>>>>>>>>>> this, but I have not done extensive code diving or testing to
>>>>> make
>>>>>>>> sure all
>>>>>>>>>>> edge cases are covered there)
>>>>>>>>>>> 
>>>>>>>>>>> -Jeremiah
>>>>>>>>>>> 
>>>>>>>>>>>> On Oct 16, 2018, at 11:56 AM, sankalp kohli <
>>>>>> kohlisank...@gmail.com
>>>>>>>> 
>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> Will GossipingPropertyFileSnitch not be vulnerable to Gossip
>>>>> bugs
>>>>>>>> where
>>>>>>>>>>> we
>>>>>>>>>>>> lose hostId or some other fields when we restart C* for large
>>>>>>>>>>>>

Re: Deprecating/removing PropertyFileSnitch?

2018-10-22 Thread Jeremiah D Jordan
If you guys are still seeing the problem, it would be good to have a JIRA 
written up, as all the ones linked were fixed in 2017 and 2015.  CASSANDRA-13700 
was found during our testing, and we haven’t seen any other issues since fixing 
it.

-Jeremiah

> On Oct 22, 2018, at 10:12 PM, Sankalp Kohli  wrote:
> 
> No worries...I mentioned the issue not the JIRA number 
> 
>> On Oct 22, 2018, at 8:01 PM, Jeremiah D Jordan  wrote:
>> 
>> Sorry, maybe my spam filter got them or something, but I have never seen a 
>> JIRA number mentioned in the thread before this one.  Just looked back 
>> through again to make sure, and this is the first email I have with one.
>> 
>> -Jeremiah
>> 
>>> On Oct 22, 2018, at 9:37 PM, sankalp kohli  wrote:
>>> 
>>> Here are some of the JIRAs which are fixed but actually did not fix the
>>> issue. We have tried fixing this by several patches. May be it will be
>>> fixed when Gossip is rewritten(CASSANDRA-12345). I should find or create a
>>> new JIRA as this issue still exists.
>>> https://issues.apache.org/jira/browse/CASSANDRA-10366
>>> https://issues.apache.org/jira/browse/CASSANDRA-10089
>>>  (related to it)
>>> 
>>> Also the quote you are using was written as a follow on email. I have
>>> already said what the bug I was referring to.
>>> 
>>> "Say you restarted all instances in the cluster and status for some host
>>> goes missing. Now when you start a host replacement, the new host won’t
>>> learn about the host whose status is missing and the view of this host will
>>> be wrong."
>>> 
>>> - CASSANDRA-10366
>>> 
>>> 
>>> On Mon, Oct 22, 2018 at 7:22 PM Sankalp Kohli 
>>> wrote:
>>> 
>>>> I will send the JIRAs for the bug which we thought we had fixed but which
>>>> still exists.
>>>> 
>>>> Have you done any correctness testing after all these tests... Have you
>>>> done the tests for 1000-instance clusters?
>>>> 
>>>> It is great you have done these tests and I am hoping the gossiping snitch
>>>> is good. Also, were any Gossip bugs fixed post 3.0? Maybe I am seeing
>>>> a bug which is already fixed.
>>>> 
>>>>> On Oct 22, 2018, at 7:09 PM, J. D. Jordan 
>>>> wrote:
>>>>> 
>>>>> Do you have a specific gossip bug that you have seen recently which
>>>> caused a problem that would make this happen?  Do you have a specific JIRA
>>>> in mind?  “We can’t remove this because what if there is a bug” doesn’t
>>>> seem like a good enough reason to me. If that was a reason we would never
>>>> make any changes to anything.
>>>>> I think many people have seen PFS actually cause real problems, whereas
>>>> with GPFS the issue being talked about is predicated on some theoretical
>>>> gossip bug happening.
>>>>> In the past year at DataStax we have done a lot of testing on 3.0 and
>>>> 3.11 around adding nodes, adding DC’s, replacing nodes, replacing racks,
>>>> and replacing DC’s, all while using GPFS, and as far as I know we have not
>>>> seen any “lost” rack/DC information during such testing.
>>>>> 
>>>>> -Jeremiah
>>>>> 
>>>>>> On Oct 22, 2018, at 5:46 PM, sankalp kohli 
>>>> wrote:
>>>>>> 
>>>>>> We will have similar issues with Gossip, but this will create more issues
>>>>>> as more things will rely on Gossip.
>>>>>> 
>>>>>> I agree PFS should be removed, but I don't see how it can be with issues
>>>>>> like these, unless someone proves that it won't cause any issues.
>>>>>> 
>>>>>> On Mon, Oct 22, 2018 at 2:21 PM Paulo Motta 
>>>>>> wrote:
>>>>>> 
>>>>>>> I can understand keeping PFS for historical/compatibility reasons, but
>>>> if
>>>>>>> gossip is broken I think you will have simila

Re: Audit logging to tables.

2019-03-01 Thread Jeremiah D Jordan
AFAIK the Full Query Logging binary format was already made more general in 
order to support using that format for the audit logging.
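
For anyone who wants to poke at it, here is a rough sketch of exercising both
logs on a 4.0-era build (tool and flag names are from memory, so treat them as
assumptions and check the built-in help for your version):

    # enable the Chronicle-backed binary logs on a running node
    nodetool enablefullquerylog --path /var/lib/cassandra/fql
    nodetool enableauditlog

    # inspect the binary output offline with the bundled tools
    fqltool dump /var/lib/cassandra/fql
    auditlogviewer /path/to/audit/logs   # audit log dir defaults to a subdir of the log directory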

-Jeremiah

> On Mar 1, 2019, at 11:38 AM, Joshua McKenzie  wrote:
> 
> Is there a world in which a general purpose, side-channel file storage
> format for transient things like this (hints, batches, audit logs, etc)
> could be useful as a first class citizen in the codebase? i.e. a world in
> which we refactored some of the hints-specific reader/writer code to be
> used for things like this if/when they come up?
> 
> On Thu, Feb 28, 2019 at 12:04 PM Jonathan Haddad  > wrote:
> 
>> Agreed with Dinesh and Josh.  I would *never* put the audit log back in
>> Cassandra.
>> 
>> This is extendable, Sagar, so you're free to do as you want, but I'm very
>> opposed to putting a ticking time bomb in Cassandra proper.
>> 
>> Jon
>> 
>> 
>> On Thu, Feb 28, 2019 at 8:38 AM Dinesh Joshi 
>> wrote:
>> 
>>> I strongly echo Josh’s sentiment. Imagine losing audit entries because C*
>>> is overloaded? It’s fine if you don’t care about losing audit entries.
>>> 
>>> Dinesh
>>> 
 On Feb 28, 2019, at 6:41 AM, Joshua McKenzie 
>>> wrote:
 
 One of the things we've run into historically, on a *lot* of axes, is
>>> that
 "just put it in C*" for various functionality looks great from a user
>> and
 usability perspective, and proves to be something of a nightmare from
>> an
 admin / cluster behavior perspective.
 
 i.e. - cluster suffering so you're writing hints? Write them to C*
>> tables
 and watch the cluster suffer more! :)
 Same thing probably holds true for audit logging - at a time frame when
 things are getting hairy w/a cluster, if you're writing that audit logging
 into C* proper (and dealing with ser/deser, compaction pressure, flushing
 pressure, etc), there's a compounding effect of pressure and
>>> pain
 on the cluster.
 
 So the TL;DR we as a project kind of philosophically have been moving
 towards (I think that's valid to say?) is: use C* for the things it's
 absolutely great at, and try to side-channel other recovery operations
>> as
 much as you can (see: file-based hints) to stay out of its way.
 
 Same thing held true w/design of CDC - I debated "materialize in memory
>>> for
 consumer to take over socket", and "keep the data in another C* table",
>>> but
 the ramifications to perf and core I/O operations in C* the moment
>> things
 start to go badly were significant enough that the route we went was
>> "do
>>> no
 harm". For better or for worse, as there's obvious tradeoffs there.
 
> On Thu, Feb 28, 2019 at 7:46 AM Sagar 
>>> wrote:
> 
> Thanks all for the pointers.
> 
> @Joseph,
> 
> I have gone through the links shared by you. Also, I have been looking
>>> at
> the code base.
> 
> I understand the fact that pushing the logs to ES or Solr is a lot
>>> easier
> to do. Having said that, the only reason I thought having something
>> like
> this might help is, if I don't want to add more pieces and still
>>> provide a
> central piece of audit logging within Cassandra itself and still be
> queryable.
> 
> In terms of usages, one of them could definitely be CDC related use
>>> cases.
> With data being stored in tables and being queryable, it can become a lot
> easier to expose this data to external systems like Kafka
>> Connect,
> Debezium which have the ability to push data to Kafka for example.
>> Note
> that pushing data to Kafka is just an example, but what I mean is, if
>> we
> can have data in tables, then instead of everyone writing custom loggers,
> they can hook into this table info and take action.
> 
> Regarding the infinite loop question, I have done some analysis, and
>> in
>>> my
> opinion, instead of tweaking the behaviour of Binlog and the way it
> functions currently, we can actually spin up another tailer thread to
>>> the
> same Chronicle Queue which can do the needful. This way the config options
> etc. all remain the same (apart from the logger, of course).
> 
> Let me know if any of it makes sense :D
> 
> Thanks!
> Sagar.
> 
> 
> On Thu, Feb 28, 2019 at 1:09 AM Dinesh Joshi
>> >>> 
> wrote:
> 
>> 
>> 
>>> On Feb 27, 2019, at 10:41 AM, Joseph Lynch 
>> wrote:
>>> 
>>> Vinay can confirm, but as far as I am aware we have no current plans
>>> to
>>> implement audit logging to a table directly, but the implementation
>> is
>>> fully pluggable (like compaction, compression, etc ...). Check out
>> the
>> blog
>>> post [1] and documentation [2] Vinay wrote for more details, but the
>> short
>> 
>> +1. I am still curious as to why you'd want to store audit log
>> entries
>> back in Cassandra? Dependin

Re: [Discuss] num_tokens default in Cassandra 4.0

2020-02-18 Thread Jeremiah D Jordan
+1 for 8 + algorithm assignment being the default.

Why do we have to assume random assignment?  If someone turns off algorithm 
assignment they are changing away from the defaults, so they should also adjust 
num_tokens.
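
For concreteness, the pair of settings being discussed looks roughly like this
in cassandra.yaml (option names assume the 4.0-era token allocation work, so
verify them against your build before relying on this sketch):

    num_tokens: 8
    allocate_tokens_for_local_replication_factor: 3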

-Jeremiah

> On Feb 18, 2020, at 1:44 AM, Mick Semb Wever  wrote:
> 
> -1
> 
> Discussions here and on slack have brought up a number of important
> concerns. I think those concerns need to be summarised here before any
> informal vote.
> 
> It was my understanding that some of those concerns may even be blockers to
> a move to 16. That is, we have to presume the worst-case scenario where all
> tokens get randomly generated.
> 
> Can we ask for some analysis and data on the risks that different num_tokens
> choices present? We shouldn't rush into a new default, and such background
> information and data adds value for operators. Maybe I missed info/experiments
> that have already happened?
> 
> 
> 
> On Mon., 17 Feb. 2020, 11:14 pm Jeremy Hanna, 
> wrote:
> 
>> I just wanted to close the loop on this if possible.  After some discussion
>> in slack about various topics, I would like to see if people are okay with
>> num_tokens=8 by default (as it's not much different operationally than
>> 16).  Joey brought up a few small changes that I can put on the ticket.  It
>> also requires some documentation for things like decommission order and
>> skew.
>> 
>> Are people okay with this change moving forward like this?  If so, I'll
>> comment on the ticket and we can move forward.
>> 
>> Thanks,
>> 
>> Jeremy
>> 
>> On Tue, Feb 4, 2020 at 8:45 AM Jon Haddad  wrote:
>> 
>>> I think it's a good idea to take a step back and get a high level view of
>>> the problem we're trying to solve.
>>> 
>>> First, high token counts result in decreased availability as each node has
>>> data overlap with more nodes in the cluster.  Specifically, a node
>> can
>>> share data with RF-1 * 2 * num_tokens.  So a 256 token cluster at RF=3 is
>>> going to almost always share data with every other node in the cluster
>> that
>>> isn't in the same rack, unless you're doing something wild like using
>> more
>>> than a thousand nodes in a cluster.  We advertise
>>> 
>>> With 16 tokens, that is vastly improved, but you still have up to 64
>> nodes
>>> each node needs to query against, so you're again, hitting every node
>>> unless you go above ~96 nodes in the cluster (assuming 3 racks / AZs).  I
>>> wouldn't use 16 here, and I doubt any of you would either.  I've
>> advocated
>>> for 4 tokens because you'd have overlap with only 16 nodes, which works
>>> well for small clusters as well as large.  Assuming I was creating a new
>>> cluster for myself (in a hypothetical brand new application I'm
>> building) I
>>> would put this in production.  I have worked with several teams where I
>>> helped them put 4 token clusters in prod and it has worked very well.  We
>>> didn't see any wild imbalance issues.
>>> 
>>> As Mick's pointed out, our current method of using random token assignment
>>> for the default number of tokens is problematic for 4 tokens.  I fully agree with
>>> this, and I think if we were to try to use 4 tokens, we'd want to address
>>> this in tandem.  We can discuss how to better allocate tokens by default
>>> (something more predictable than random), but I'd like to avoid the
>>> specifics of that for the sake of this email.
>>> 
>>> To Alex's point, repairs are problematic with lower token counts due to
>>> over-streaming.  I think this is a pretty serious issue and we'd have to
>>> address it before going all the way down to 4.  This, in my opinion, is a
>>> more complex problem to solve and I think trying to fix it here could
>> make
>>> shipping 4.0 take even longer, something none of us want.
>>> 
>>> For the sake of shipping 4.0 without adding extra overhead and time, I'm
>> ok
>>> with moving to 16 tokens, and in the process adding extensive
>> documentation
>>> outlining what we recommend for production use.  I think we should also
>> try
>>> to figure out something better than random as the default to fix the data
>>> imbalance issues.  I've got a few ideas here I've been noodling on.
>>> 
>>> As long as folks are fine with potentially changing the default again in
>> C*
>>> 5.0 (after another discussion / debate), 16 is enough of an improvement
>>> that I'm OK with the change, and willing to author the docs to help
>> people
>>> set up their first cluster.  For folks that go into production with the
>>> defaults, we're at least not setting them up for total failure once their
>>> clusters get large like we are now.
>>> 
>>> In future versions, we'll probably want to address the issue of data
>>> imbalance by building something in that shifts individual tokens
>> around.  I
>>> don't think we should try to do this in 4.0 either.
>>> 
>>> Jon
>>> 
>>> 
>>> 
>>> On Fri, Jan 31, 2020 at 2:04 PM Jeremy Hanna >> 
>>> wrote:
>>> 
 I think Mick and Anthony make some valid operational and skew points
>> for

Re: Simplify voting rules for in-jvm-dtest-api releases

2020-04-15 Thread Jeremiah D Jordan
I think as long as we don’t publish the artifacts to Maven Central or some 
other location that is for “anyone”, we do not need a formal release.  Even then, 
since the artifact is only meant for use by people developing C*, that might be 
fine.

If artifacts are only for use by individuals actively participating in the 
development process, then no formal release is needed.  See the definition of 
“release” and “publication” found here:

http://www.apache.org/legal/release-policy.html#release-definition
> DEFINITION OF "RELEASE" 
> 
> Generically, a release is anything that is published beyond the group that 
> owns it. For an Apache project, that means any publication outside the 
> development community, defined as individuals actively participating in 
> development or following the dev list.
> 
> More narrowly, an official Apache release is one which has been endorsed as 
> an "act of the Foundation" by a PMC.
> 
> 

> PUBLICATION 
> Projects SHALL publish official releases and SHALL NOT publish unreleased 
> materials outside the development community.
> 
> During the process of developing software and preparing a release, various 
> packages are made available to the development community for testing 
> purposes. Projects MUST direct outsiders towards official releases rather 
> than raw source repositories, nightly builds, snapshots, release candidates, 
> or any other similar packages. The only people who are supposed to know about 
> such developer resources are individuals actively participating in 
> development or following the dev list and thus aware of the conditions placed 
> on unreleased materials.
> 


-Jeremiah

> On Apr 15, 2020, at 3:05 PM, Nate McCall  wrote:
> 
> Open an issue with the LEGAL jira project and ask there.
> 
> I'm like 62% sure they will say nope. The vote process and the time for
> such is to allow for PMC to review the release to give the ASF a reasonable
> degree of assurance for indemnification. However, we might have a fair
> degree of leeway so long as we do 'vote', it's test scope (as Mick pointed
> out) and the process for such is published somewhere?
> 
> Cheers,
> -Nate
> 
> On Thu, Apr 16, 2020 at 7:20 AM Oleksandr Petrov 
> wrote:
> 
>> The most important thing for the purposes of what we’re trying to achieve
>> is to have a unique, non-overridable version. In principle, a unique tag
>> with release timestamp should be enough, as long as we can uniquely
>> reference it.
>> 
>> However, even then, I’d say release frequency (establishing “base”) for
>> releases should be at least slightly relaxed compared to Cassandra itself.
>> 
>> I will investigate whether it is possible to avoid voting for test only
>> dependencies, since I’d much rather have it under Apache umbrella, as I was
>> previously skeptical of a dependency that I believed shouldn’t have been
>> locked under contributor’s GitHub.
>> 
>> If test only no-vote option isn’t possible due to legal reasons, we can
>> proceed with snapshot+timestamp and release-rebase with a 24h simplified
>> vote.
>> 
>> Thanks,
>> —Alex
>> 
>> On Wed 15. Apr 2020 at 19:24, Mick Semb Wever  wrote:
>> 
 Apache release rules were made for first-class projects. I would like
>> to
 propose simplifying voting rules for in-jvm-dtest-api project [1].
>>> 
>>> 
>>> I am not sure the PMC can simply vote away the ASF release rules here.
>>> But it should be possible to implement the proposal by stepping away
>>> from what the ASF considers a release and work with "nightlies" or
>>> snapshots. The purpose of an "ASF release" has little value to
>>> in-jvm-dtest-api, IIUC.
>>> 
>>> For example you can't put artifacts into a public maven repository
>>> without a formal release vote. AFAIK the vote process is there for the
>>> sake of the legal protections the ASF extends to all its projects,
>>> over any notion of technical quality of the release cut.
>>> 
>>> And we are not supposed to be including binaries in the source code
>>> artifacts, at least not for anything that runtime code depends on.
>>> 
>>> Solutions to this could be…
>>> - allowing snapshot versions of test scope dependencies*, downloaded
>>> from the ASF's snapshot repository⁽¹⁾
>>> - making an exception for a binary if only used in test scope (I
>>> believe this is ok),
>>> - move in-jvm-dtest-api out of ASF (just have it as a github project,
>>> and publish as needed to a maven repo)
>>> 
>>> You could also keep using `mvn release:prepare` to cut the versions,
>>> but just not deploy them to ASF's distribution channels.
>>> 
>>> This whole area of ASF procedures is quite extensive, so i'd
>>> definitely appreciate being correctly contradicted :-)
>>> 
>>> My vote would be to take the approach of using the snapshot
>>> repository. Semantic versioning has limited value here, and you would
>>> be able to have a jenkins build push the lat

Re: [DISCUSS] CASSANDRA-13994

2020-05-27 Thread Jeremiah D Jordan
+1, strongly agree.  If we aren’t going to let something go into 4.0.0 because 
it would “invalidate testing”, then we cannot let such a thing go into 4.0.1 
unless we plan to re-do said testing for the patch release.

> On May 27, 2020, at 1:31 PM, Benedict Elliott Smith  
> wrote:
> 
> I'm being told this still isn't clear, so let me try in a bullet-point 
> timeline:
> 
> * 4.0 Beta
> * 4.0 Verification Work
> * [Merge Window]
> * 4.0 GA
> * 4.0 Minor Releases 
> * ...
> * 5.0 Dev
> * ...
> * 5.0 Verification Work 
> * GA 5.0
> 
> I think that anything that is prohibited from "[Merge Window]" because it 
> invalidates "4.0 Verification Work" must also be prohibited until "5.0 Dev" 
> because the next equivalent work that can now validate it occurs only at "5.0 
> Verification Work"
> 
> On 27/05/2020, 19:05, "Benedict Elliott Smith"  wrote:
> 
>I'm not sure if I communicated my point very well.  I mean to say that if 
> the reason we are prohibiting a patch to land post-beta is because it 
> invalidates work we only perform pre-ga, then it probably should not be 
> permitted to land post-ga either, since it must also invalidate the same work?
> 
>That is to say, if we're comfortable with work landing post-ga because we 
> believe it to be safe to release without our pre-major-release verification, 
> we should be comfortable with it landing at any time pre-ga too.  Anything 
> else seems inconsistent to me, and we should examine what assumptions we're 
> making that permit this inconsistency to arise.
> 
> 
>On 27/05/2020, 18:49, "Joshua McKenzie"  wrote:
> 
>> 
>> because it invalidates our pre-release verification, then it should not
>> land
> 
>until we next perform pre-release verification
> 
>At least for me there's a little softness around our collective 
> alignment
>on when pre-release verification takes place. If it's between alpha-1 
> and
>ga we don't want changes that would invalidate those changes to land 
> during
>that time frame. Different for beta-1 to ga. We also risk invalidating
>testing if we do any of that testing before wherever that cutoff is, 
> and a
>lack of clarity on that cutoff further muddies those waters.
> 
>My very loosely held perspective is that beta-1 to ga is the window in
>which we apply the "don't do things that will invalidate 
> verification", and
>we plan to do that verification during the beta phase. I *think* this 
> is
>consistent w/the current framing of the lifecycle doc. That being 
> said, I
>don't have strong religion on this so if we collectively want to call 
> it
>"don't majorly disrupt from alpha-1 to ga", we can formalize that in 
> the
>docs and go ahead and triage current open scope for 4.0 and move 
> things out.
> 
> 
> 
>On Wed, May 27, 2020 at 12:59 PM Ekaterina Dimitrova <
>ekaterina.dimitr...@datastax.com> wrote:
> 
>> Thank you all for your input.
>> I think an important topic is again to revise the lifecycle and ensure we
>> really have the vision on what is left until beta. I will start a separate
>> thread on the flaky tests situation soon.
>> 
>> For this particular ticket I see a couple of things:
>> - There are a lot of deletions of already not used code
>> - I implemented it still in alpha as per our agreement that this will give
>> us enough time for testing. Probably Dinesh as a reviewer can give some
>> valuable feedback/opinion on the patch.
>> - It definitely touches around important places but the important thing is
>> to see how exactly it touches, I think
>> - Considering it for alpha before the major testing in beta sounds
>> reasonable to me but I guess it also depends on people availability to
>> review it in detail and the exact test plans afterwards
>> 
>> On Wed, 27 May 2020 at 7:14, Benedict Elliott Smith 
>> wrote:
>> 
>>> I think our "pre-beta" criteria should also be our "not in a major"
>>> criteria.
>>> 
>>> If work is prohibited because it invalidates our pre-release
>> verification,
>>> then it should not land until we next perform pre-release verification,
>>> which only currently happens once per major.
>>> 
>>> This could mean either landing less in a major, or permitting more in
>> beta
>>> etc.
>>> 
>>> On 26/05/2020, 19:24, "Joshua McKenzie"  wrote:
>>> 
>>>I think an interesting question that informs when to stop accepting
>>>specific changes in a release is when we expect any extensive
>>> pre-release
>>>testing to take place.
>>> 
>>>If we go by our release lifecycle, gutting deprecated code seems
>>> compatible
>>>w/Alpha but I wouldn't endorse merging it into Beta:
>>> 
>>> https://cwiki.apache.org/confluence/display/CASSANDRA/Release+Lifecycle.
>>>Since almost all of the 40_quality_testing epic stuff is also beta
>>> phase
>>>and hasn't really taken off yet, it also seems like there will be
>>> extensive
>>>testing after this ph

Re: [DISCUSS] CASSANDRA-13994

2020-05-27 Thread Jeremiah D Jordan
> A clear point to cut RC's doesn't surface from the above for me. Releasing
> an RC before broad verification seems wrong, and cutting an RC after the 4
> points above may as well be GA because it's all known scope.

Isn’t the whole point of an RC that it could be the GA?  It is a “release 
candidate”, meaning that if no one finds any issues with it, it can then become 
the release.  So that seems like exactly the right time to make RC releases?

> On May 27, 2020, at 2:45 PM, Joshua McKenzie  wrote:
> 
> I think we're all on the same page here; I was focusing more on the release
> lifecycles and sequencing than the entire version cycle. Good to broaden
> scope I think.
> 
> One thing we're not considering is the separation of API changes from major
> changes and how that intersects with release milestones.
> 
> Meaning:
> 1. alpha phase
> 2. Milestone: API freeze (all API changes pushed to next major)
> 3. beta phase
> 4. Verification phase (all major disruptive pushed to next major)
> 
> A clear point to cut RC's doesn't surface from the above for me. Releasing
> an RC before broad verification seems wrong, and cutting an RC after the 4
> points above may as well be GA because it's all known scope.
> 
> Thoughts?
> 
> On Wed, May 27, 2020 at 3:28 PM Scott Andreas  wrote:
> 
>> That makes sense to me, yep.
>> 
>> My hope and expectation is that the time required for "verification work"
>> will shrink dramatically in the not too distant future - ideally to a
>> period of less than a month. In this world, the cost of missing one train
>> is reduced to catching the next one.
>> 
>> One of the main goals in shifting focus from "testing" and "test plans" to
>> "test engineering" is automating as many aspects of release qualification
>> as possible, with an asymptotic ideal as a function of compute capacity and
>> time. While such automation will never be complete (it's likely that
>> development of new features will/must include qualification infra changes
>> to exercise them), if we're able to apply the same rigor to major releases
>> as we are to patchlevel builds with little incremental effort, I'd be
>> thrilled.
>> 
>> This is mostly a way of saying:
>> – I like the cadence/sequencing Benedict proposes below.
>> – I think improvements in test engineering can reduce/eliminate
>> invalidation and may increase the scope of what can be a candidate for
>> merge on a given branch
>> – And if not, the cost of missing the train is lower because we'll be able
>> to deliver major releases more often.
>> 
>> Scott
>> 
>> 
>> From: Jeremiah D Jordan 
>> Sent: Wednesday, May 27, 2020 11:54 AM
>> To: Cassandra DEV
>> Subject: Re: [DISCUSS] CASSANDRA-13994
>> 
>> +1, strongly agree.  If we aren’t going to let something go into 4.0.0
>> because it would “invalidate testing”, then we cannot let such a thing go
>> into 4.0.1 unless we plan to re-do said testing for the patch release.
>> 
>>> On May 27, 2020, at 1:31 PM, Benedict Elliott Smith 
>> wrote:
>>> 
>>> I'm being told this still isn't clear, so let me try in a bullet-point
>> timeline:
>>> 
>>> * 4.0 Beta
>>> * 4.0 Verification Work
>>> * [Merge Window]
>>> * 4.0 GA
>>> * 4.0 Minor Releases
>>> * ...
>>> * 5.0 Dev
>>> * ...
>>> * 5.0 Verification Work
>>> * GA 5.0
>>> 
>>> I think that anything that is prohibited from "[Merge Window]" because
>> it invalidates "4.0 Verification Work" must also be prohibited until "5.0
>> Dev" because the next equivalent work that can now validate it occurs only
>> at "5.0 Verification Work"
>>> 
>>> On 27/05/2020, 19:05, "Benedict Elliott Smith" 
>> wrote:
>>> 
>>>   I'm not sure if I communicated my point very well.  I mean to say
>> that if the reason we are prohibiting a patch to land post-beta is because
>> it invalidates work we only perform pre-ga, then it probably should not be
>> permitted to land post-ga either, since it must also invalidate the same
>> work?
>>> 
>>>   That is to say, if we're comfortable with work landing post-ga
>> because we believe it to be safe to release without our pre-major-release
>> verification, we should be comfortable with it landing at any time pre-ga
>> too.  Anything else seems inconsistent to me, and we should examine what
>> assumptions we're making that permit this inconsistency to arise.

Re: [VOTE] Project governance wiki doc

2020-06-16 Thread Jeremiah D Jordan
+1 non-binding.

Thanks for the work on this!

> On Jun 16, 2020, at 11:31 AM, Jeff Jirsa  wrote:
> 
> +1 (pmc, binding)
> 
> 
> On Tue, Jun 16, 2020 at 9:19 AM Joshua McKenzie 
> wrote:
> 
>> Added unratified draft to the wiki here:
>> 
>> https://cwiki.apache.org/confluence/display/CASSANDRA/Apache+Cassandra+Project+Governance
>> 
>> I propose the following:
>> 
>>   1. We leave the vote open for 1 week (close at end of day 6/23/20)
>>   unless there's a lot of feedback on the wiki we didn't get on gdoc
>>   2. pmc votes are considered binding
>>   3. committer and community votes are considered advisory / non-binding
>> 
>> Any objections / revisions to the above?
>> 
>> Thanks!
>> 
>> ~Josh
>> 





Re: [VOTE] Project governance wiki doc

2020-06-17 Thread Jeremiah D Jordan
I think we need to assume positive intent here.  If someone says they will 
participate, then we need to assume they are true to their word.  While the 
concerns are not unfounded, I think the doc as it is gives a good starting point 
for trying this out without being too complicated.  If this turns out to be a 
problem in the future, we can always revisit the governance document.

-Jeremiah

> On Jun 17, 2020, at 11:21 AM, Benedict Elliott Smith  
> wrote:
> 
> Sorry, I've been busy so not paid as close attention as I would like after 
> initial contributions to the formulation.  On the document I raised this as 
> an issue, and proposed lowering the "low watermark" to a simple majority of 
> the electorate - since if you have both a simple majority of the "active 
> electorate", and a super-majority of all voters, I think you can consider 
> that a strong consensus.
> 
> However it's worth noting that the active electorate is likely to undercount, 
> since some people won't nominate themselves in the roll call, but will still 
> vote.  So it might not in practice be a problem.  In fact it can be gamed by 
> people who want to pass a motion that fails to reach the low watermark all 
> collaborating to not count their vote at the roll call.  The only real 
> advantage of the roll call is that it's simple to administer.
> 
> On 17/06/2020, 17:12, "Jon Haddad"  wrote:
> 
>Looking at the doc again, I'm a bit concerned about this:
> 
>> PMC roll call will be taken every 6 months. This is an email to dev@
>w/the simple question to pmc members of “are you active on the project and
>plan to participate in voting over the next 6 months?”. This is strictly an
>exercise to get quorum count and in no way restricts ability to participate
>during this time window. A super-majority of this count becomes the
>low-watermark for votes in favour necessary to pass a motion, with new PMC
>members added to the calculation.
> 
>I imagine we'll see a lot of participation from folks in roll call, and
>less when it comes to votes.  It's very easy to say we'll do something,
>it's another to follow through.  A glance at any active community member's
>review board (including my own) will confirm that.
> 
>Just to provide a quick example with some rough numbers - it doesn't seem
>unreasonable to me that we'll get a roll call of 15-20 votes.  On the low
>end of that, we'd need 10 votes to pass anything and on the high end, 14.
>On the high end a vote with 13 +1 and one -1 would fail.
> 
>Just to be clear, I am 100% in favor of increased participation and a
>higher bar on voting, but I'd like to ensure we don't set the bar so high
>we can't get anything done.
> 
>Anyone else share this sentiment?
> 
>On Wed, Jun 17, 2020 at 8:37 AM David Capwell 
>wrote:
> 
>> +1 nb
>> 
>> Sent from my iPhone
>> 
>>> On Jun 17, 2020, at 7:27 AM, Andrés de la Peña 
>> wrote:
>>> 
>>> +1 nb
>>> 
 On Wed, 17 Jun 2020 at 15:06, Sylvain Lebresne 
>> wrote:
 
 +1 (binding)
 --
 Sylvain
 
 
 On Wed, Jun 17, 2020 at 1:58 PM Benjamin Lerer <
 benjamin.le...@datastax.com>
 wrote:
 
> +1 (binding)
> 
> On Wed, Jun 17, 2020 at 12:49 PM Marcus Eriksson 
> wrote:
> 
>> +1
>> 
>> 
>> On 17 June 2020 at 12:40:38, Sam Tunnicliffe (s...@beobal.com) wrote:
>>> +1 (binding)
>>> 
 On 17 Jun 2020, at 09:11, Jorge Bay Gondra wrote:
 
 +1 nb
 
 On Wed, Jun 17, 2020 at 7:41 AM Mick Semb Wever wrote:
 
> +1 (binding)
> 
> On Tue, 16 Jun 2020 at 18:19, Joshua McKenzie
> wrote:
> 
>> Added unratified draft to the wiki here:
>> 
>> 
> 
>> 
> 
 
>> https://cwiki.apache.org/confluence/display/CASSANDRA/Apache+Cassandra+Project+Governance
>> 
>> I propose the following:
>> 
>> 1. We leave the vote open for 1 week (close at end of day
 6/23/20)
>> unless there's a lot of feedback on the wiki we didn't get on
 gdoc
>> 2. pmc votes are considered binding
>> 3. committer and community votes are considered advisory /
>> non-binding
>> 
>> Any objections / revisions to the above?
>> 
>> Thanks!
>> 
>> ~Josh
>> 
> 
>>> 
>>> 

Re: [DISCUSS] Future of MVs

2020-06-30 Thread Jeremiah D Jordan
> So  from my PoV, I'm against us just voting to deprecate and remove without
> going into more depth into the current state of things and what options are
> on the table, since people will continue to build MV's at the client level
> which, in theory, should have worse correctness and performance
> characteristics than having a clean and well stabilized implementation in
> the coordinator.

I agree with Josh here.  Multiple people have put in effort to improve the 
stability of MVs since they were first put into the code base, and the reasons 
for having them in the DB have not changed.  Building MV-like tables at the 
client level is actually harder to get right than doing it in the server.
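
To make the comparison concrete, here is a rough sketch (via cqlsh, with made-up 
keyspace/table names, assuming a users table keyed by id with an email column; 
per the thread, the feature currently sits behind the experimental flag) of what 
the server-side feature gives you in one statement.  Keeping the view in sync on 
every base-table write is exactly the read-before-write bookkeeping a client-side 
implementation has to rebuild for itself:

    cqlsh -e "
      CREATE MATERIALIZED VIEW ks.users_by_email AS
        SELECT * FROM ks.users
        WHERE email IS NOT NULL AND id IS NOT NULL
        PRIMARY KEY (email, id);"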

-Jeremiah


> On Jun 30, 2020, at 3:45 PM, Joshua McKenzie  wrote:
> 
> We're just short of 98 tickets on the component since its original merge,
> so at least *some* work has been done to stabilize them. Not to say I'm
> endorsing running them at massive scale today without knowing what you're
> doing, to be clear. They are perhaps our largest loaded gun of a feature of
> self-foot-shooting atm. Zhao did a bunch of work on them internally and
> we've backported much of that to OSS; I've pinged him to chime in here.
> 
> The "data is orphaned in your view when you lose all base replicas" issue
> is more or less "unsolvable", since a scan of a view to confirm data in the
> base table is so slow you're talking weeks to process and it totally
> trashes your page cache. I think Paulo landed on a "you have to rebuild the
> view if you lose all base data" reality. There's also, I believe, the
> unresolved issue of modeling how much data a base table with one to many
> views will end up taking up in its final form when denormalized. This could
> be vastly improved with something like an "EXPLAIN ANALYZE" for a table
> with views, if you'll excuse the mapping, to show "N bytes in base will
> become M with base + views" or something.
> 
> Last but definitely not least in dumping the state in my head about this,
> there's a bunch of potential for guardrailing people away from self-harm
> with MV's if we decide to go the route of guardrails (link:
> https://cwiki.apache.org/confluence/display/CASSANDRA/%28DRAFT%29+-+CEP-3%3A+Guardrails
> ).
> 
> So  from my PoV, I'm against us just voting to deprecate and remove without
> going into more depth into the current state of things and what options are
> on the table, since people will continue to build MV's at the client level
> which, in theory, should have worse correctness and performance
> characteristics than having a clean and well stabilized implementation in
> the coordinator.
> 
> Having them flagged as experimental for now as we stabilize 4.0 and get
> things out the door *seems* sufficient to me, but if people are widely
> using these out in the wild and ignoring that status and the corresponding
> warning, maybe we consider raising the volume on that warning for 4.0 while
> we figure this out.
> 
> Just my .02.
> 
> ~Josh
> 
> On Tue, Jun 30, 2020 at 4:22 PM Dinesh Joshi  wrote:
> 
>>> On Jun 30, 2020, at 12:43 PM, Jon Haddad  wrote:
>>> 
>>> As we move forward with the 4.0 release, we should consider this an
>>> opportunity to deprecate materialized views, and remove them in 5.0.  We
>>> should take this opportunity to learn from the mistake and raise the bar
>>> for new features to undergo a much more thorough run the wringer before
>>> merging.
>> 
>> I'm in favor of marking them as deprecated and removing them in 5.0. If
>> someone steps up and can fix them in 5.0, then we always have the option of
>> accepting the fix.
>> 
>> Dinesh
>> 
>> 





Re: [DISCUSS] CEP-7 Storage Attached Index

2020-09-23 Thread Jeremiah D Jordan
> Short question: looking forward, how are we going to maintain three 2i
> implementations: SASI, SAI, and 2i?

I think one of the goals stated in the CEP is for SAI to have parity with 2i 
such that it could eventually replace it.
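
For anyone who has not read the CEP yet, the user-facing surface is the existing 
secondary index DDL with a different implementation behind it; a rough sketch 
(names here are invented, and the exact class name/alias may differ from what 
finally ships):

    cqlsh -e "
      CREATE CUSTOM INDEX users_email_sai ON ks.users (email)
        USING 'StorageAttachedIndex';"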


> On Sep 23, 2020, at 10:34 AM, Oleksandr Petrov  
> wrote:
> 
> Short question: looking forward, how are we going to maintain three 2i
> implementations: SASI, SAI, and 2i?
> 
> Another thing I think this CEP is missing is rationale and motivation
> about why trie-based indexes were chosen over, say, B-Tree. We did have a
> short discussion about this on Slack, but both arguments that I've heard
> (space-saving and keeping a small subset of nodes in memory) work only for
> the most primitive implementation of a B-Tree. Fully-occupied prefix B-Tree
> can have similar properties. There's been a lot of research on B-Trees and
> optimisations in those. Unfortunately, I do not have an
> implementation sitting around for a direct comparison, but I can imagine
> situations when B-Trees may perform better because of simpler construction.
> Maybe we should even consider prototyping a prefix B-Tree to have a more
> fair comparison.
> 
> Thank you,
> -- Alex
> 
> 
> 
> On Thu, Sep 10, 2020 at 9:12 AM Jasonstack Zhao Yang <
> jasonstack.z...@gmail.com> wrote:
> 
>> Thank you Patrick for hosting Cassandra Contributor Meeting for CEP-7 SAI.
>> 
>> The recorded video is available here:
>> 
>> https://cwiki.apache.org/confluence/display/CASSANDRA/2020-09-01+Apache+Cassandra+Contributor+Meeting
>> 
>> On Tue, 1 Sep 2020 at 14:34, Jasonstack Zhao Yang <
>> jasonstack.z...@gmail.com>
>> wrote:
>> 
>>> Thank you, Charles and Patrick
>>> 
>>> On Tue, 1 Sep 2020 at 04:56, Charles Cao  wrote:
>>> 
 Thank you, Patrick!
 
 On Mon, Aug 31, 2020 at 12:59 PM Patrick McFadin 
 wrote:
> 
> I just moved it to 8AM for this meeting to better accommodate APAC.
 Please
> see the update here:
> 
 
>> https://cwiki.apache.org/confluence/display/CASSANDRA/2020-08-01+Apache+Cassandra+Contributor+Meeting
> 
> Patrick
> 
> On Mon, Aug 31, 2020 at 10:04 AM Charles Cao 
 wrote:
> 
>> Patrick,
>> 
>> 11AM PST is a bad time for the people in the APAC timezone. Can we
>> move it to 7 or 8AM PST in the morning to accommodate their needs ?
>> 
>> ~Charles
>> 
>> On Fri, Aug 28, 2020 at 4:37 PM Patrick McFadin >> 
>> wrote:
>>> 
>>> Meeting scheduled.
>>> 
>> 
 
>> https://cwiki.apache.org/confluence/display/CASSANDRA/2020-08-01+Apache+Cassandra+Contributor+Meeting
>>> 
>>> Tuesday September 1st, 11AM PST. I added a basic bullet for the
 agenda
>> but
>>> if there is more, edit away.
>>> 
>>> Patrick
>>> 
>>> On Thu, Aug 27, 2020 at 11:31 AM Jasonstack Zhao Yang <
>>> jasonstack.z...@gmail.com> wrote:
>>> 
 +1
 
 On Thu, 27 Aug 2020 at 04:52, Ekaterina Dimitrova <
>> e.dimitr...@gmail.com>
 wrote:
 
> +1
> 
> On Wed, 26 Aug 2020 at 16:48, Caleb Rackliffe <
>> calebrackli...@gmail.com>
> wrote:
> 
>> +1
>> 
>> 
>> 
>> On Wed, Aug 26, 2020, 3:45 PM Patrick McFadin <
 pmcfa...@gmail.com>
> wrote:
>> 
>> 
>> 
>>> This is related to the discussion Jordan and I had about
>> the
> contributor
>> 
>>> Zoom call. Instead of open mic for any issue, call it
>> based
 on a
>> discussion
>> 
>>> thread or threads for higher bandwidth discussion.
>> 
>>> 
>> 
>>> I would be happy to schedule on for next week to
 specifically
>> discuss
>> 
>>> CEP-7. I can attach the recorded call to the CEP after.
>> 
>>> 
>> 
>>> +1 or -1?
>> 
>>> 
>> 
>>> Patrick
>> 
>>> 
>> 
>>> On Tue, Aug 25, 2020 at 7:03 AM Joshua McKenzie <
 jmcken...@apache.org>
>> 
>>> wrote:
>> 
>>> 
>> 
> 
>> 
> Does community plan to open another discussion or CEP
>> on
>> 
>>> modularization?
>> 
 
>> 
 We probably should have a discussion on the ML or
>> monthly
>> contrib
> call
>> 
 about it first to see how aligned the interested
 contributors
>> are.
>> Could
>> 
>>> do
>> 
 that through CEP as well but CEP's (at least thus far
 sans k8s
>> operator)
>> 
 tend to start with a strong, deeply thought out point of
 view
>> being
>> 
 expressed.
>> 
 
>> 
 On Tue, Aug 25, 2020 at 3:26 AM Jasonstack Zhao Yang <
>> 
>>>

Re: Welcome Jordan West, David Capwell, Zhao Yang and Ekaterina Dimitrova as Cassandra committers

2020-12-16 Thread Jeremiah D Jordan
Congratulations everyone!  Good to see the project getting new committers.

> On Dec 16, 2020, at 10:55 AM, Benjamin Lerer  
> wrote:
> 
> The PMC's members are pleased to announce that Jordan West, David Capwell,
> Zhao Yang and Ekaterina Dimitrova have accepted the invitations to become
> committers this year.
> 
> Jordan West accepted the invitation in April
> David Capwell accepted the invitation in July
> Zhao Yang accepted the invitation in September
> Ekaterina Dimitrova accepted the invitation in December
> 
> Thanks a lot for everything you have done.
> 
> Congratulations and welcome
> 
> The Apache Cassandra PMC members





Re: [DISCUSS] Releases after 4.0

2021-01-28 Thread Jeremiah D Jordan
I think we are confusing things between minor vs patch.  Can we talk about 
branch names?

I think we can agree on the following statements?

Releases made from stable maintenance branches, 
cassandra-3.0/cassandra-3.11/cassandra-4.0 (once created), will limit features 
being added to them and should be mostly bug fix only.

New features will be developed in trunk.

Now I think the thing under discussion here is “how often will we cut new 
maintenance branches from trunk” and also “how long will we maintain those 
branches"

I would definitely like to see the project able to release binaries from trunk 
more often than once a year.  As long as we keep our quality goals in line post 
4.0 GA, I think this is possible.

> I'd like to see us have three branches: life support (critical fixes), stable 
> (fixes), and development. Minors don't fit very well into that IMO.

If we want to go with semver ideas, then minors fit perfectly well.  Semver 
doesn’t mean you make patch releases for every version you have ever released; 
it is just a way of versioning the releases so people can understand what the 
upgrade semantics are for that release.  If you dropped support for some 
existing thing, you bump the major version; if you added something new, you bump 
the minor version; if you only fixed bugs with no user-visible changes, you bump 
the patch version.

> I suppose in practice all this wouldn't be too different to tick-tock, just 
> with a better state of QA, a higher bar to merge and (perhaps) no fixed 
> release cadence. This realisation makes me less keen on it, for some reason.

I was thinking something along these lines might be useful as well.

I could see a process where we cut new maintenance branches every X months (a 
year? six months?), and we would fix bugs and make patch releases from those 
maintenance branches.
We would also cut releases from the development branch (trunk) more often.  The 
version number in trunk would be updated based on what had changed since the 
last release made from trunk.  If we dropped support for something since the 
last release, bump major.  If we added new features (most likely thing), bump 
minor.

So when we release 4.0 we cut the cassandra-4.0 maintenance branch.  We make 
future 4.0.1 4.0.2 4.0.3 releases from this branch.

Trunk continues development and some new features are added there.  After a few 
months we release 4.1.0 from trunk; we do not cut a cassandra-4.1 branch.  
Development continues along on trunk, some new features get in, so we bump the 
version in the branch to 4.2.0.  A few months go by and we release 4.2.0 from 
trunk.  Some bug fixes then go into trunk with no new features, so the version on 
the branch bumps to 4.2.1; we decide to make a release from trunk, and since only 
fixes have gone in since the last release, we release 4.2.1 from trunk.

We continue on this way, releasing 4.3.0, 4.4.0, 4.4.1 …. Eventually we decide it 
is time for a new maintenance branch to be cut, so with the release of 4.5.0 we 
also cut the cassandra-4.5 branch.  This branch will get patch releases made from 
it: 4.5.1, 4.5.2, 4.5.3.

Trunk continues on as 4.6.0, 4.7.0, 4.8.0 …. At some point the project decides 
it wants to drop support for some deprecated feature, so trunk gets bumped to 
5.0.0.  More releases happen from trunk (5.0.0, 5.1.0, 5.2.0, 5.2.1) and 
development on trunk continues.  When it is time for a new maintenance branch, 
5.3.0 is released and cassandra-5.3 gets cut...
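
As a rough sketch of the mechanics (branch and version names are just the 
hypothetical ones from the walkthrough above):

    # at GA: cut the maintenance branch and tag the release
    git checkout -b cassandra-4.0 trunk
    git tag cassandra-4.0.0 cassandra-4.0

    # later feature releases are tagged straight off trunk, no new branch
    git tag cassandra-4.1.0 trunk
    git tag cassandra-4.2.0 trunk

    # bug-fix releases for the stable line come from the maintenance branch
    git checkout cassandra-4.0
    # ...commit fixes, then...
    git tag cassandra-4.0.1 cassandra-4.0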

This does kind of look like what we tried for tick/tock, but it is not the 
same.  If we wanted to name this something, I would call it something like 
"releasable trunk+periodic maintenance branching”.  This is what many projects 
that release from trunk look like.

-Jeremiah


> On Jan 28, 2021, at 10:31 AM, Benedict Elliott Smith  
> wrote:
> 
> But, as discussed, we previously agreed to limit features in a minor version, as 
> per the release lifecycle (and I continue to endorse this decision)
> 
> On 28/01/2021, 16:04, "Mick Semb Wever"  wrote:
> 
>> if there's no such features, or anything breaking compatibility
>> 
>> What do you envisage being delivered in such a release, besides bug
>> fixes?  Do we have the capacity as a project for releases dedicated to
>> whatever falls between those two gaps?
>> 
> 
> 
>All releases that don't break any compatibilities as our documented
>guidelines dictate (wrt. upgrades, api, cql, native protocol, etc).  Even
>new features can be introduced without compatibility breakages (and should
>be as often as possible).
> 
>Honouring semver does not imply more releases, to the contrary it is just
>that a number of those existing releases will be minor instead of major.
>That is, it is an opportunity cost to not recognise minor releases.
> 
> 
> 

Re: March 2015 QA retrospective

2015-04-09 Thread Jeremiah D Jordan
CASSANDRA-8687
> Jeremiah Jordan  "Keyspace should also check Config.isClientMode"  Is there a
> way to test for missing Config.isClientMode checks?

We should probably redo the client mode type stuff.  Code should assume we are 
in a tool until isServerMode or something similar is set.  In general, we have 
a lot of offline tools now, and we probably need to improve the testing of said 
tools.

Re: Staging Branches

2015-05-07 Thread Jeremiah D Jordan
"Our process is our own" <- always remember this.

> On May 7, 2015, at 9:25 AM, Ariel Weisberg  
> wrote:
> 
> Hi,
> 
> Whoah. Our process is our own. We don't have to subscribe to any cargo cult
> book buying seminar giving process.
> 
> And whatever we do we can iterate and change until it works for us and
> solves the problems we want solved.
> 
> Ariel
> 
> On Thu, May 7, 2015 at 10:13 AM, Aleksey Yeschenko 
> wrote:
> 
>> Strictly speaking, the train schedule does demand that trunk, and all
>> other branches, must be releasable at all times, whether you like it or not
>> (for the record - I *don’t* like it, but here we are).
>> 
>> This, and other annoying things, is what we subscribed to with the tick-tock vs.
>> supported branches experiment.
>> 
>>> We still need to run CI before we release. So what does this buy us?
>> 
>> Ideally (eventually?) we won’t have to run CI, including duration tests,
>> before we release, because we’ll never merge anything that hadn’t passed
>> the full suite, including duration tests.
>> 
>> That said, perhaps it’s too much change at once. We still have missing
>> pieces of infrastructure, and TE is busy with what’s already back-logged.
>> So let’s revisit this proposal in a few months, closer to 3.1 or 3.2, maybe?
>> 
>> --
>> AY
>> 
>> On May 7, 2015 at 16:56:07, Ariel Weisberg (ariel.weisb...@datastax.com)
>> wrote:
>> 
>> Hi,
>> 
>> I don't think this is necessary. If you merge with trunk, test, and someone
>> gets in ahead of you, just merge up and push to trunk anyway. Most of the
>> time the changes the other person made will be unrelated and they will
>> compose fine. If you actually conflict then yeah you test again but this
>> doesn't happen often.
>> 
>> The goal isn't to have trunk passing every single time it's to have it pass
>> almost all the time so the test history means something and when it fails
>> it fails because it's broken by the latest merge.
>> 
>> At this size I don't see the need for a staging branch to prevent trunk
>> from ever breaking. There is a size where it would be helpful I just don't
>> think we are there yet.
>> 
>> Ariel
>> 
>> On Thu, May 7, 2015 at 5:05 AM, Benedict Elliott Smith <
>> belliottsm...@datastax.com> wrote:
>> 
>>> A good practice as a committer applying a patch is to build and run the
>>> unit tests before updating the main repository, but to do this for every
>>> branch is infeasible and impacts local productivity. Alternatively,
>>> uploading the result to your development tree and waiting a few hours for
>>> CI to validate it is likely to result in a painful cycle of race-to-merge
>>> conflicts, rebasing and waiting again for the tests to run.
>>> 
>>> So I would like to propose a new strategy: staging branches.
>>> 
>>> Every major branch would have a parallel branch:
>>> 
>>> cassandra-2.0 <- cassandra-2.0_staging
>>> cassandra-2.1 <- cassandra-2.1_staging
>>> trunk <- trunk_staging
>>> 
>>> On commit, the idea would be to perform the normal merge process on the
>>> _staging branches only. CI would then run on every single git ref, and as
>>> these passed we would fast forward the main branch to the latest
>> validated
>>> staging git ref. If one of them breaks, we go and edit the _staging
>> branch
>>> in place to correct the problem, and let CI run again.
>>> 
>>> So, a commit would look something like:
>>> 
>>> patch -> cassandra-2.0_staging -> cassandra-2.1_staging -> trunk_staging
>>> 
>>> wait for CI, see 2.0, 2.1 are fine but trunk is failing, so
>>> 
>>> git rebase -i trunk_staging 
>>> fix the problem
>>> git rebase --continue
>>> 
>>> wait for CI; all clear
>>> 
>>> git checkout cassandra-2.0; git merge cassandra-2.0_staging
>>> git checkout cassandra-2.1; git merge cassandra-2.1_staging
>>> git checkout trunk; git merge trunk_staging
>>> 
>>> This does introduce some extra steps to the merge process, and we will
>> have
>>> branches we edit the history of, but the amount of edited history will be
>>> limited, and this will remain isolated from the main branches. I'm not
>> sure
>>> how averse to this people are. An alternative policy might be to enforce
>>> that we merge locally and push to our development branches then await CI
>>> approval before merging. We might only require this to be repeated if
>> there
>>> was a new merge conflict on final commit that could not automatically be
>>> resolved (although auto-merge can break stuff too).
>>> 
>>> Thoughts? It seems if we want an "always releasable" set of branches, we
>>> need something along these lines. I certainly break tests by mistake, or
>>> the build itself, with alarming regularity. Fixing with merges leaves a
>>> confusing git history, and leaves the build broken for everyone else in
>> the
>>> meantime, so patches applied after, and development branches based on
>> top,
>>> aren't sure if they broke anything themselves.
>>> 
>> 



Re: Requiring Java 8 for C* 3.0

2015-05-07 Thread Jeremiah D Jordan
With Java 7 being EOL for free versions I am +1 on this.  If you want to stick 
with 7, you can always keep running 2.1.

> On May 7, 2015, at 11:09 AM, Jonathan Ellis  wrote:
> 
> We discussed requiring Java 8 previously and decided to remain Java
> 7-compatible, but at the time we were planning to release 3.0 before Java 7
> EOL.  Now that 8099 and increased emphasis on QA have delayed us past Java
> 7 EOL, I think it's worth reopening this discussion.
> 
> If we require 8, then we can use lambdas, LongAdder, StampedLock, Streaming
> collections, default methods, etc.  Not just in 3.0 but over 3.x for the
> next year.
> 
> If we don't, then people can choose whether to deploy on 7 or 8 -- but the
> vast majority will deploy on 8 simply because 7 is no longer supported
> without a premium contract with Oracle.  8 also has a more advanced G1GC
> implementation (see CASSANDRA-7486).
> 
> I think that gaining access to the new features in 8 as we develop 3.x is
> worth losing the ability to run on a platform that will have been EOL for a
> couple months by the time we release.
> 
> -- 
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder, http://www.datastax.com
> @spyced



Re: Cassandra Java Driver and DataStax

2016-06-06 Thread Jeremiah D Jordan
The Apache Cassandra project has always left development of its drivers up to 
the community.  The DataStax Java Driver is not part of the Apache Cassandra 
project, it is an open source project created by DataStax.  You can find a 
large list of drivers for Cassandra here: 
https://wiki.apache.org/cassandra/ClientOptions some of them developed by 
DataStax, some developed by Netflix, and many others.

-Jeremiah

> On Jun 3, 2016, at 9:29 PM, Chris Mattmann  wrote:
> 
> Hi All,
> 
> I’m investigating something a few ASF members contacted
> me about and pointed out, so I’m hoping you can help 
> guide me here as a community. I have heard that a company,
> DataStax, whose marketing material mentions it as the only
> Cassandra vendor, “controls” the Java Driver for Apache 
> Cassandra. 
> 
> Of course, no company “controls” our projects or its code,
> so I told the folks that mentioned it to me that I’d investigate
> with my board hat on.
> 
> I’d like to hear the community’s thoughts here on this. Does
> anyone in the community see this “controlling” behavior going
> on? Please speak up, as I’d like to get to the bottom of it,
> and I’ll be around on the lists, doing some homework and reading
> up on the archives to see what’s up.
> 
> Thanks for any help you can provide in rooting this out.
> 
> Cheers,
> Chris
> 
> 
> 
> 



Re: Reminder: critical fixes only in 2.1

2016-07-18 Thread Jeremiah D Jordan
Looking at those tickets, in all three of them the “is this critical to fix” 
question came up in the JIRA discussion, and it was decided that they were 
indeed critical enough to commit to 2.1.

> On Jul 18, 2016, at 11:47 AM, Jonathan Ellis  wrote:
> 
> We're at the stage of the release cycle where we should be committing
> critical fixes only to the 2.1 branch.  Many people depend on 2.1 working
> reliably and it's not worth the risk of introducing regressions for (e.g.)
> performance improvements.
> 
> I think some of the patches committed so far for 2.1.16 do not meet this
> bar and should be reverted.  I include a summary of what people have to
> live with if we leave them unfixed:
> 
> https://issues.apache.org/jira/browse/CASSANDRA-11349
>  Repair suffers false-negative tree mismatches and overstreams data.
> 
> https://issues.apache.org/jira/browse/CASSANDRA-10433
>  Reduced performance on inserts (and reads?) (for Thrift clients only?)
> 
> https://issues.apache.org/jira/browse/CASSANDRA-12030
>  Reduced performance on reads for workloads with range tombstones
> 
> Anyone want to make a case that these are more critical than they appear
> and should not be reverted?
> 
> -- 
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder, http://www.datastax.com
> @spyced



Re: State of Unit tests (1 out of 100 passes in trunk)

2016-07-21 Thread Jeremiah D Jordan
> Josh, add me to the "test fixers" queue, as well. However, I think the
> authors of patches that break the build should also be on the hook for
> fixing problems, as well.

+1 I have always been a fan of “you broke it, you fix it"



Re: DSE vs Open Source

2016-07-21 Thread Jeremiah D Jordan
Hi John,

I work for the DSE team.  What you're seeing is the result of DSE having its 
own release schedule distinct from Apache Cassandra.  We'll start qualifying an 
Apache release to build on, such as 3.0.7 for DSE 5.0, but if 3.0.8 comes out 
while we're still working on 5.0.1 we won't necessarily restart that QA 
process.  Often it makes more sense to take selected changes from that release 
instead.

Or, sometimes our customers will find an issue before the Apache community, and 
we'll ship a hotfix version of DSE with a fix that goes into a subsequent 
Apache release.

Either way, all our fixes do get contributed to the community, and we publish a 
list of changes compared to the Apache release we built on, e.g. [1] and [2] 
for DSE 4.8 and 5.0.

We also include Apache Solr and Apache Spark in DSE and follow the same 
processes for those projects.

[1] 
https://docs.datastax.com/en/datastax_enterprise/4.8/datastax_enterprise/RNcassChanges.html?scroll=RNcassFixes__488_unique_1
[2] 
https://docs.datastax.com/en/latest-dse/datastax_enterprise/RNcassChanges.html?scroll=RNcassFixes__500_unique_1

Jeremiah Jordan
Lead Software Engineer DSE
DataStax, Inc.

> On Jul 21, 2016, at 10:22 AM, John John  wrote:
> 
> 
>  
> What is the difference between DataStax DSE Cassandra and open source? 
>  
> 1. Why is DataStax maintaining a fork of open source where they back-port 
> fixes which are not back-ported to the community for that version? Is it 
> because people running DSE want more stability? 
>  
> 2. I know the community moved to a new release process which will make it very 
> hard for companies to use open source. I have been told DSE will still have 
> long-term support for releases, similar to the old release process? 
>  
> I am very happy to use DSE since it gives more features, but I was confused 
> about why we are in this situation. I am sure I am missing something.



Re: A proposal to move away from Jira-centric development

2016-08-15 Thread Jeremiah D Jordan
I like keeping things in JIRA because then everything is in one place, and it 
is easy to refer someone to it in the future.
But I agree that JIRA tickets with a bunch of design discussion and POC’s and 
such in them can get pretty long and convoluted.

I don’t really like the idea of moving all of that discussion to email, which 
makes it harder to point someone to it.  Maybe a better idea would be to 
have a “design/POC” JIRA and an “implementation” JIRA.  That way we could still 
keep things in JIRA, but the final decision would be kept “clean”.

Though it would be nice if people would send an email to the dev list when 
proposing “design” JIRA’s, as not everyone has time to follow every JIRA ever 
made to see that a new design JIRA was created that they might be interested in 
participating on.

My 2c.

-Jeremiah


> On Aug 15, 2016, at 9:22 AM, Jonathan Ellis  wrote:
> 
> A long time ago, I was a proponent of keeping most development discussions
> on Jira, where tickets can be self contained and the threadless nature
> helps keep discussions from getting sidetracked.
> 
> But Cassandra was a lot smaller then, and as we've grown it has become
> necessary to separate out the signal (discussions of new features and major
> changes) from the noise of routine bug reports.
> 
> I propose that we take advantage of the dev list to perform that
> separation.  Major new features and architectural improvements should be
> discussed first here, then when consensus on design is achieved, moved to
> Jira for implementation and review.
> 
> I think this will also help with the problem when the initial idea proves
> to be unworkable and gets revised substantially later after much
> discussion.  It can be difficult to figure out what the conclusion was, as
> review comments start to pile up afterwards.  Having that discussion on the
> list, and summarizing on Jira, would mitigate this.
> 
> -- 
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder, http://www.datastax.com
> @spyced



Re: A proposal to move away from Jira-centric development

2016-08-15 Thread Jeremiah D Jordan
>  In fact, I don’t see JIRA sent to the dev list at all so you are basically
> forking the conversation to a high noise list by putting it all in JIRA.

This is why I proposed we send a link to the design JIRA’s to the dev list.

> Putting discussion in JIRA, is fine, but realize,
> there is a lot of noise in that signal and people may or may not be watching

I don’t see how a JIRA dedicated to a specific issue is “high noise”?  That 
single JIRA is much lower noise; it only has conversations around that specific 
ticket.  All conversations happening on the dev list at once seem much “higher 
noise” to me.

-Jeremiah

> On Aug 15, 2016, at 12:22 PM, Chris Mattmann  wrote:
> 
> Discussion belongs on the dev list. Putting discussion in JIRA, is fine, but 
> realize,
> there is a lot of noise in that signal and people may or may not be watching
> the JIRA list. In fact, I don’t see JIRA sent to the dev list at all so you 
> are basically
> forking the conversation to a high noise list by putting it all in JIRA.
> 
> 
> 
> 
> 
> On 8/15/16, 10:11 AM, "Aleksey Yeschenko"  wrote:
> 
>I too feel like it would be sufficient to announce those major JIRAs on 
> the dev@ list, but keep all discussion itself to JIRA, where it belongs.
> 
>You don’t need to follow every ticket this way, just subscribe to dev@ and 
> then start watching the select major JIRAs you care about.
> 
>-- 
>AY
> 
>On 15 August 2016 at 18:08:20, Jeremiah D Jordan 
> (jeremiah.jor...@gmail.com) wrote:
> 
>I like keeping things in JIRA because then everything is in one place, and 
> it is easy to refer someone to it in the future.  
>But I agree that JIRA tickets with a bunch of design discussion and POC’s 
> and such in them can get pretty long and convoluted.  
> 
>I don’t really like the idea of moving all of that discussion to email 
> which makes it has harder to point someone to it. Maybe a better idea would 
> be to have a “design/POC” JIRA and an “implementation” JIRA. That way we 
> could still keep things in JIRA, but the final decision would be kept 
> “clean”.  
> 
>Though it would be nice if people would send an email to the dev list when 
> proposing “design” JIRA’s, as not everyone has time to follow every JIRA ever 
> made to see that a new design JIRA was created that they might be interested 
> in participating on.  
> 
>My 2c.  
> 
>-Jeremiah  
> 
> 
>> On Aug 15, 2016, at 9:22 AM, Jonathan Ellis  wrote:  
>> 
>> A long time ago, I was a proponent of keeping most development discussions  
>> on Jira, where tickets can be self contained and the threadless nature  
>> helps keep discussions from getting sidetracked.  
>> 
>> But Cassandra was a lot smaller then, and as we've grown it has become  
>> necessary to separate out the signal (discussions of new features and major  
>> changes) from the noise of routine bug reports.  
>> 
>> I propose that we take advantage of the dev list to perform that  
>> separation. Major new features and architectural improvements should be  
>> discussed first here, then when consensus on design is achieved, moved to  
>> Jira for implementation and review.  
>> 
>> I think this will also help with the problem when the initial idea proves  
>> to be unworkable and gets revised substantially later after much  
>> discussion. It can be difficult to figure out what the conclusion was, as  
>> review comments start to pile up afterwards. Having that discussion on the  
>> list, and summarizing on Jira, would mitigate this.  
>> 
>> --  
>> Jonathan Ellis  
>> Project Chair, Apache Cassandra  
>> co-founder, http://www.datastax.com  
>> @spyced  
> 
> 
> 
> 



Re: A proposal to move away from Jira-centric development

2016-08-15 Thread Jeremiah D Jordan
> 1. I’d suggest setting up an iss...@cassandra.apache.org mailing list which 
> posts all changes to JIRA tickets (comments, issue reassignments, status 
> changes). This could be subscribed to like any other mailing list, and while 
> this list would be high volume it increases transparency of what’s happening 
> across the project.

For anyone who wants to follow that stream for Apache Cassandra, we have 
commits@ set up for this.  
https://lists.apache.org/list.html?comm...@cassandra.apache.org 
<https://lists.apache.org/list.html?comm...@cassandra.apache.org>

> On Aug 15, 2016, at 2:06 PM, Dave Lester  wrote:
> 
> For all Apache projects, mailing lists are the source of truth. See: "If it 
> didn't happen on a mailing list, it didn't happen." 
> https://community.apache.org/newbiefaq.html#is-there-a-code-of-conduct-for-apache-projects
>  
> <https://community.apache.org/newbiefaq.html#is-there-a-code-of-conduct-for-apache-projects>
> 
> In response to Jason’s question, here are two things I’ve seen work well in 
> the Apache Mesos community:
> 
> 1. I’d suggest setting up an iss...@cassandra.apache.org mailing list which 
> posts all changes to JIRA tickets (comments, issue reassignments, status 
> changes). This could be subscribed to like any other mailing list, and while 
> this list would be high volume it increases transparency of what’s happening 
> across the project.
> 
> For Apache Mesos, we have a issues@mesos list: 
> https://lists.apache.org/list.html?iss...@mesos.apache.org 
> <https://lists.apache.org/list.html?iss...@mesos.apache.org> for this 
> purpose. It can be hugely valuable for keeping tabs on what’s happening in 
> the project. If there’s interest in creating this for Cassandra, here’s a 
> link to the original INFRA ticket as a reference: 
> https://issues.apache.org/jira/browse/INFRA-7971 
> <https://issues.apache.org/jira/browse/INFRA-7971>
> 
> 2. Apache Mesos has formalized process of design documents / feature 
> development, to encourage community discussion prior to being committed — 
> this discussion takes place on the mailing list and often has less to do with 
> the merits of a particular patch as much as it does on an overall design, its 
> relationship to dependencies, its usage, or larger issues about the direction 
> of a feature. These discussions belong on the mailing list.
> 
> To keep these discussions / design documents connected to JIRA we attach 
> links to JIRA issues. For example: 
> https://cwiki.apache.org/confluence/display/MESOS/Design+docs+--+Shared+Links 
> <https://cwiki.apache.org/confluence/display/MESOS/Design+docs+--+Shared+Links>.
>  The design doc approach is more of a formalization of what Jonathan 
> originally proposed.
> 
> Dave
> 
>> On Aug 15, 2016, at 11:34 AM, Jason Brown  wrote:
>> 
>> Chris,
>> 
>> Can you give a few examples of other healthy Apache projects which you feel
>> would be good example? Note: I'm not trying to bait the conversation, but
>> am genuinely interested in what other successful projects do.
>> 
>> Thanks
>> 
>> Jason
>> 
>> On Monday, August 15, 2016, Chris Mattmann  wrote:
>> 
>>> s/dev list followers//
>>> 
>>> That’s (one of) the disconnect(s). It’s not *you the emboldened, powerful
>>> PMC*
>>> and then everyone else.
>>> 
>>> 
>>> On 8/15/16, 11:25 AM, "Jeremy Hanna" >> > wrote:
>>> 
>>>   Regarding high level linking, if I’m in irc or slack or hipchat or a
>>> mailing list thread, it’s easy to reference a Jira ID and chat programs can
>>> link to it and bots can bring up various details.  I don’t think a hash id
>>> for a mailing list is as simple or memorable.
>>> 
>>>   A feature of a mailing list thread is that it can go in different
>>> directions easily.  The burden is that it will be harder to follow in the
>>> future if you’re trying to sort out implementation details.  So for high
>>> level discussion, the mailing list is great.  When getting down to the
>>> actual work and discussion about that focused work, that’s where a tool
>>> like Jira comes in.  Then it is reference-able in the changes.txt and other
>>> things.
>>> 
>>>   I think the approach proposed by Jonathan is a nice way to keep dev
>>> list followers informed but keeping ticket details focused.
>>> 
>>>> On Aug 15, 2016, at 1:12 PM, Chris Mattmann >> > wrote:
>>>> 
>>>> How is it harder to point someone to mail?
>>>> 
>>>> Have you seen lists.apache

Re: A proposal to move away from Jira-centric development

2016-08-16 Thread Jeremiah D Jordan
Back to the topic at hand.  First, let us establish that all of this stuff will 
be happening “on the mailing lists”: all JIRA updates are sent to commits@ with 
the reply-to set to dev@, so “JIRA” is still “on the list”.

Now we just need to decide how we would like to best make use of these lists.  
I propose that we keep dev@ fairly low volume so that people don’t feel the 
need to filter it out of their inbox, and thus possibly miss important 
discussions.
If someone cares so much about the name of the list where stuff happens, then I 
propose we make dev-announce@ and if that happens we can replace commits@ with 
dev@ below and dev@ with dev-announce@ and start forwarding some JIRA stuff to 
dev@…

In order to keep dev@ low volume (but higher than it currently is, as it has 
mostly been “no volume” lately) I propose the following:

Someone has a major feature that they would like to discuss (again, this is 
just for major features, not everyday bug fixes, etc.):
1. Make a JIRA for the thing you want to discuss (aka post the thing to 
commits@)
2. Post link to JIRA with a short description to dev@
3. Have a discussion on the JIRA (aka commits@) about the new thing.
4. If there is some major change/question on the JIRA that people feel needs 
some extra discussion/involvement, email dev@ with the question and a link back 
to the JIRA.
5. Have more discussions on the JIRA (aka commits@) about the new thing.
6. If something else comes up, go back to step 4.
7. During this process of decision making keep the “Title” and “Description” 
fields of the JIRA (aka commits@) up to date with what is actually happening in 
the ticket.
8. Once things settle down make sub tasks or follow on tickets for actually 
implementing things linked to the initial ticket.

That would keep the dev@ list informed of what is going on in new feature 
proposals, and it will keep discussions on JIRA tickets where they are easily 
referenced and kept in one place, so it is easy to get to and easy to refer 
back to later.

-Jeremiah

> On Aug 15, 2016, at 9:22 AM, Jonathan Ellis  wrote:
> 
> A long time ago, I was a proponent of keeping most development discussions
> on Jira, where tickets can be self contained and the threadless nature
> helps keep discussions from getting sidetracked.
> 
> But Cassandra was a lot smaller then, and as we've grown it has become
> necessary to separate out the signal (discussions of new features and major
> changes) from the noise of routine bug reports.
> 
> I propose that we take advantage of the dev list to perform that
> separation.  Major new features and architectural improvements should be
> discussed first here, then when consensus on design is achieved, moved to
> Jira for implementation and review.
> 
> I think this will also help with the problem when the initial idea proves
> to be unworkable and gets revised substantially later after much
> discussion.  It can be difficult to figure out what the conclusion was, as
> review comments start to pile up afterwards.  Having that discussion on the
> list, and summarizing on Jira, would mitigate this.
> 
> -- 
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder, http://www.datastax.com
> @spyced



A proposal to get dev@ more involved

2016-08-17 Thread Jeremiah D Jordan
The original thread I posted this in got hijacked by other discussions, so 
I’m making a new thread for this.

People have expressed interest in having more information flow to the dev@ list 
about major features/decisions, so that those who don’t follow commits@ or JIRA 
can still keep up with big things that are going on.

I don’t think we want to increase the volume too much, as then many people 
would feel the need to filter it out of their inbox, and thus possibly miss 
important discussions.

So in order to keep dev@ low volume (but higher than it currently is) I propose 
we do something along the lines of the following:

When someone has a major feature that they would like to discuss (again, this 
is just for major features, not everyday bug fixes, etc.):
1. Make a JIRA for the thing you want to discuss (will automatically be sent to 
commits@)
2. Post link to JIRA with a short description to dev@
3. Have a discussion on the JIRA (will automatically be sent to commits@) about 
the new thing.
4. If there is some major change/question on the JIRA that people feel needs 
some extra discussion/involvement, email dev@ with the question and a link back 
to the JIRA.
5. Have more discussions on the JIRA (will automatically be sent to commits@) 
about the new thing.
6. If something else comes up, go back to step 4.
7. When discussions do happen on dev@ about a ticket, those should be 
summarized on the JIRA as well so that everything is kept in one place.
8. During this process of decision making keep the “Title” and “Description” 
fields of the JIRA (aka commits@) up to date with what is actually happening in 
the ticket.
9. Once things settle down make sub tasks or follow on tickets for actually 
implementing things linked to the initial ticket.

That would keep the dev@ list informed of what is going on in new feature 
proposals, and it will keep discussions on JIRA tickets where they are easily 
referenced and kept in one place.

Keeping the title and description of the ticket up to date is an important part 
of this, so that when someone new looks at the JIRA they don’t need to read 
through 5 pages of comments to see what the current state of things is.

-Jeremiah

Re: Github pull requests

2016-08-26 Thread Jeremiah D Jordan
+1 for PR’s, but if we start using them I think we should get them sent to 
commits@ instead of dev@, where they are currently sent.

-Jeremiah

> On Aug 26, 2016, at 1:38 PM, Andres de la Peña  wrote:
> 
> +1 to GitHub PRs, I think it will make things easier.
> 
> On Friday, August 26, 2016, Jason Brown 
> wrote:
> 
>> D'oh, forgot to explicitly state that I am +1 one on the github PR proposal
>> :)
>> 
>> On Fri, Aug 26, 2016 at 11:07 AM, Jason Brown > > wrote:
>> 
>>> It seems to me we might get more contributions if we can lower the
>> barrier
>>> to participation. (see Jeff Beck's statement above)
>>> 
>>> +1 to Aleksey's sentiment about the Docs contributions.
>>> 
>>> On Fri, Aug 26, 2016 at 9:48 AM, Mark Thomas > > wrote:
>>> 
 On 26/08/2016 17:11, Aleksey Yeschenko wrote:
> Mark, I, for one, will be happy with the level of GitHub integration
 that Spark has, formal or otherwise.
 
 If Cassandra doesn't already have it, that should be a simple request to
 infra.
 
> As it stands right now, none of the committers/PMC members have any
 control over Cassandra Github mirror.
> 
> Which, among other things, means that we cannot even close the
 erroneously opened PRs ourselves,
> they just accumulate unless the PR authors is kind enough to close
 them. That’s really frustrating.
 
 No PMC currently has the ability to directly close PRs on GitHub. This
 is one of the things on the infra TODO list that is being looked at. You
 can close them via a commit comment that the ASF GitHub tooling picks
>> up.
 
 Mark
 
 
> 
> --
> AY
> 
> On 26 August 2016 at 17:07:29, Mark Thomas (ma...@apache.org
>> ) wrote:
> 
> On 26/08/2016 16:33, Jonathan Ellis wrote:
>> Hi all,
>> 
>> Historically we've insisted that people go through the process of
 creating
>> a Jira issue and attaching a patch or linking a branch to demonstrate
>> intent-to-contribute and to make sure we have a unified record of
 changes
>> in Jira.
>> 
>> But I understand that other Apache projects are now recognizing a
 github
>> pull request as intent-to-contribute [1] and some are even making
 github
>> the official repo, with an Apache mirror, rather than the other way
>> around. (Maybe this is required to accept pull requests, I am not
 sure.)
>> 
>> Should we revisit our policy here?
> 
> At the moment, the ASF Git repo is always the master, with GitHub as a
> mirror. That allows push requests to be made via GitHub.
> 
> Infra is exploring options for giving PMCs greater control over GitHub
> config (including allowing GitHub to be the master with a golden copy
> held at the ASF) but that is a work in progress.
> 
> As far as intent to contribute goes, there does appear to be a trend
> that the newer a project is to the ASF, the more formal the project
> makes process around recording intent to contribute. (The same can be
> said for other processes as well like Jira config.)
> 
> It is worth noting that all the ASF requires is that there is an
>> intent
> to contribute. Anything that can be reasonably read that way is fine.
> Many PMCs happily accept patches sent to the dev list (although they
>> may
> ask them to be attached to issues more so they don't get forgotten
>> than
> anything else). Pull requests are certainly acceptable.
> 
> My personal recommendation is don't put more formal process in place
> than you actually need. Social controls are a lot more flexible than
> technical ones and generally have a much lower overhead.
> 
> Mark
> 
>> 
>> [1] e.g. https://github.com/apache/spark/pulls?q=is%3Apr+is%3Aclosed
> 
> 
 
 
>>> 
>> 
> 
> 
> -- 
> Andrés de la Peña
> 
> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
> 28224 Pozuelo de Alarcón, Madrid
> Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd
> *



Re: #cassandra-dev IRC logging

2016-08-30 Thread Jeremiah D Jordan
Also just to make sure, this is logging for #cassandra-dev not #cassandra right?

-Jeremiah

> On Aug 30, 2016, at 3:11 PM, Jeff Jirsa  wrote:
> 
> http://wilderness.apache.org/channels/ 
> 
> 
> 
> On 8/30/16, 1:04 PM, "Jonathan Ellis"  > wrote:
> 
>> What is the process to access asfbot logs?
>> 
>> On Tue, Aug 30, 2016 at 3:03 PM, Jake Farrell  wrote:
>> 
>>> If there are no objections then, I am going to enable ASFBot and logging in
>>> #cassandra on freenode
>>> 
>>> -Jake
>>> 
>>> On Fri, Aug 26, 2016 at 6:59 PM, Dave Brosius 
>>> wrote:
>>> 
 If you wish to unsubscribe, send an email to
 
 mailto://dev-unsubscr...@cassandra.apache.org
 
 
 On 08/26/2016 04:49 PM, Gvb Subrahmanyam wrote:
 
> Please remove me from - dev@cassandra.apache.org
> 
> -Original Message-
> From: Jake Farrell [mailto:jfarr...@apache.org]
> Sent: Friday, August 26, 2016 4:36 PM
> To: dev@cassandra.apache.org
> Subject: Re: #cassandra-dev IRC logging
> 
> asfbot can log to wilderness for backup, but it does not send out
>>> digests.
> I've seen a couple of projects starting to test out and use
>>> slack/hipchat
> and then use sameroom to connect irc so conversations are not separated
>>> and
> people can use their favorite client of choice
> 
> -Jake
> 
> On Fri, Aug 26, 2016 at 4:20 PM, Edward Capriolo >>> 
> wrote:
> 
> Yes. I did. My bad.
>> 
>> On Fri, Aug 26, 2016 at 4:07 PM, Jason Brown 
>> wrote:
>> 
>> Ed, did you mean this to post this to the other active thread today,
>>> the one about github pull requests? (just want to make sure I'm
>>> understanding correctly :) )
>>> 
>>> On Fri, Aug 26, 2016 at 12:28 PM, Edward Capriolo
>>> >> 
>>> wrote:
>>> 
>>> One thing to watch out for. The way apache-gossip is setup the
 PR's get sent to the dev list. However the address is not part of
 the list so
 
>>> the
>> 
>>> project owners get an email asking to approve/reject every PR and
 
>>> comment
>> 
>>> on the PR.
 
 This is ok because we have a small quite group but you probably do
 not
 
>>> want
>>> 
 that with the number of SCM changes in the cassandra project.
 
 On Fri, Aug 26, 2016 at 3:05 PM, Jeff Jirsa <
 
>>> jeff.ji...@crowdstrike.com>
>> 
>>> wrote:
 
 +1 to both as well
> 
> On 8/26/16, 11:59 AM, "Tyler Hobbs"  wrote:
> 
> +1 on doing this and using ASFBot in particular.
>> 
>> On Fri, Aug 26, 2016 at 1:40 PM, Jason Brown
>> 
>> 
> wrote:
> 
>> @Dave ASFBot looks like a winner. If others are on board with
>>> 
>> this,
>> 
>>> I
>>> 
 can
> 
>> work on getting it up and going.
>>> 
>>> On Fri, Aug 26, 2016 at 11:27 AM, Dave Lester <
>>> 
>> dave_les...@apple.com>
>>> 
 wrote:
>>> 
>>> +1. Check out ASFBot for logging IRC, along with other
 
>>> integrations.[1]
> 
> 
> 
> 
> 
> 

Re: Gossip 2.0

2016-09-01 Thread Jeremiah D Jordan
He denied it when I asked him that earlier.  But we know he did.
http://wilderness.apache.org/channels/?f=cassandra-dev/2016-09-01#1472732219 


> On Sep 1, 2016, at 11:02 AM, Eric Evans  wrote:
> 
> On Thu, Sep 1, 2016 at 7:02 AM, Jason Brown  wrote:
>> have opened up CASSANDRA-12345...
> 
> Nice; What did you do, camp on the "create" button until after 12344
> was submitted? :)
> 
> -- 
> Eric Evans
> john.eric.ev...@gmail.com



Re: Proposal - 3.5.1

2016-09-15 Thread Jeremiah D Jordan
I’m with Jeff on this, 3.7 (bug fixes on 3.6) has already been released with 
the fix.  Since the fix applies cleanly anyone is free to put it on top of 3.5 
on their own if they like, but I see no reason to put out a 3.5.1 right now and 
confuse people further.

-Jeremiah


> On Sep 15, 2016, at 9:07 AM, Jonathan Haddad  wrote:
> 
> As I follow up, I suppose I'm only advocating for a fix to the odd
> releases.  Sadly, Tick Tock versioning is misleading.
> 
> If tick tock were to continue (and I'm very much against how it currently
> works) the whole even-features odd-fixes thing needs to stop ASAP, all it
> does it confuse people.
> 
> The follow up to 3.4 (3.5) should have been 3.4.1, following semver, so
> people know it's bug fixes only to 3.4.
> 
> Jon
> 
> On Wed, Sep 14, 2016 at 10:37 PM Jonathan Haddad  wrote:
> 
>> In this particular case, I'd say adding a bug fix release for every
>> version that's affected would be the right thing.  The issue is so easily
>> reproducible and will likely result in massive data loss for anyone on 3.X
>> WHERE X < 6 and uses the "date" type.
>> 
>> This is how easy it is to reproduce:
>> 
>> 1. Start Cassandra 3.5
>> 2. create KEYSPACE test WITH replication = {'class': 'SimpleStrategy',
>> 'replication_factor': 1};
>> 3. use test;
>> 4. create table fail (id int primary key, d date);
>> 5. delete d from fail where id = 1;
>> 6. Stop Cassandra
>> 7. Start Cassandra
>> 
>> You will get this, and startup will fail:
>> 
>> ERROR 05:32:09 Exiting due to error while processing commit log during
>> initialization.
>> org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException:
>> Unexpected error deserializing mutation; saved to
>> /var/folders/0l/g2p6cnyd5kx_1wkl83nd3y4rgn/T/mutation6313332720566971713dat.
>> This may be caused by replaying a mutation against a table with the same
>> name but incompatible schema.  Exception follows:
>> org.apache.cassandra.serializers.MarshalException: Expected 4 byte long for
>> date (0)
>> 
>> I mean.. come on.  It's an easy fix.  It cleanly merges against 3.5 (and
>> probably the other releases) and requires very little investment from
>> anyone.
>> 
>> 
>> On Wed, Sep 14, 2016 at 9:40 PM Jeff Jirsa 
>> wrote:
>> 
>>> We did 3.1.1 and 3.2.1, so there’s SOME precedent for emergency fixes,
>>> but we certainly didn’t/won’t go back and cut new releases from every
>>> branch for every critical bug in future releases, so I think we need to
>>> draw the line somewhere. If it’s fixed in 3.7 and 3.0.x (x >= 6), it seems
>>> like you’ve got options (either stay on the tick and go up to 3.7, or bail
>>> down to 3.0.x)
>>> 
>>> Perhaps, though, this highlights the fact that tick/tock may not be the
>>> best option long term. We’ve tried it for a year, perhaps we should instead
>>> discuss whether or not it should continue, or if there’s another process
>>> that gives us a better way to get useful patches into versions people are
>>> willing to run in production.
>>> 
>>> 
>>> 
>>> On 9/14/16, 8:55 PM, "Jonathan Haddad"  wrote:
>>> 
 Common sense is what prevents someone from upgrading to yet another
 completely unknown version with new features which have probably broken
 even more stuff that nobody is aware of.  The folks I'm helping right
 deployed 3.5 when they got started because
>>> http://cassandra.apache.org
>>> suggests
 it's acceptable for production.  It turns out using 4 of the built in
 datatypes of the database result in the server being unable to restart
 without clearing out the commit logs and running a repair.  That screams
 critical to me.  You shouldn't even be able to install 3.5 without the
 patch I've supplied - that bug is a ticking time bomb for anyone that
 installs it.
 
 On Wed, Sep 14, 2016 at 8:12 PM Michael Shuler 
 wrote:
 
> What's preventing the use of the 3.6 or 3.7 releases where this bug is
> already fixed? This is also fixed in the 3.0.6/7/8 releases.
> 
> Michael
> 
> On 09/14/2016 08:30 PM, Jonathan Haddad wrote:
>> Unfortunately CASSANDRA-11618 was fixed in 3.6 but was not back
>>> ported to
>> 3.5 as well, and it makes Cassandra effectively unusable if someone
>>> is
>> using any of the 4 types affected in any of their schema.
>> 
>> I have cherry picked & merged the patch back to here and will put it
>>> in a
>> JIRA as well tonight, I just wanted to get the ball rolling asap on
>>> this.
>> 
>> 
> 
>>> https://github.com/rustyrazorblade/cassandra/tree/fix_commitlog_exception

Re: Proposal - 3.5.1

2016-09-15 Thread Jeremiah D Jordan
>>> https://en.wikipedia.org/wiki/Tick-Tock_model
>>> 
>>> The intention was to allow new features into 3.even releases (3.0, 3.2,
>>> 3.4, 3.6, and so on), with bugfixes in 3.odd releases (3.1, … ). The hope
>>> was to allow more frequent releases to address the first big negative
>>> (flood of new features that blocked releases), while also helping to
>>> address the second – with fewer major features in a release, they better
>>> get more/better test coverage.
>>> 
>>> In the tick/tock model, anyone running 3.odd (like 3.5) should be looking
>>> for bugfixes in 3.7. It’s certainly true that 3.5 is horribly broken (as
>> is
>>> 3.3, and 3.4, etc), but with this release model, the bugfix SHOULD BE in
>>> 3.7. As I mentioned previously, we have precedent for backporting
>> critical
>>> fixes, but we don’t have a well defined bar (that I see) for what’s
>>> critical enough for a backport.
>>> 
>>> Jon is noting (and what many of us who run Cassandra in production have
>>> really known for a very long time) is that nobody wants to run 3.newest
>>> (even or odd), because 3.newest is likely broken (because it’s a complex
>>> distributed database, and testing is hard, and it takes time and complex
>>> workloads to find bugs). In the tick/tock model, because new features
>> went
>>> into 3.6, there are new features that may not be adequately
>>> tested/validated in 3.7 a user of 3.5 doesn’t want, and isn’t willing to
>>> accept the risk.
>>> 
>>> The bottom line here is that tick/tock is probably a well intentioned but
>>> failed attempt to bring stability to Cassandra’s releases. The problems
>>> tick/tock was meant to solve are real problems, but tick/tock doesn’t
>> seem
>>> to be addressing them – new features invalidate old testing, which makes
>> it
>>> difficult/impossible for real users to sit on the 3.odd versions.
>>> 
>>> We’re due for cutting 3.9 and 3.0.9, and we have limited RE manpower to
>>> get those out. Only after those are out would I be +1 on a 3.5.1, and
>> then
>>> only because if I were running 3.5, and I hit this bug, I wouldn’t want
>> to
>>> spend the ~$100k it would cost my organization to validate 3.7 prior to
>>> upgrading, and I don’t think it’s reasonable to ask users to recompile a
>>> release for a ~10 line fix for a very nasty bug.
>>> 
>>> I’m also very strongly recommend we (committers/PMC) reconsider tick/tock
>>> for 4.x releases, because this is exactly the type of problem that will
>>> continue to happen as we move forward. I suggest that we either need to
>> go
>>> back to the old model and do a better job of dealing with feature creep
>> and
>>> testing, or we need to better define what gets backported, because the
>>> community needs a stable version to run, and running latest odd release
>> of
>>> tick/tock isn’t it.
>>> 
>>> - Jeff
>>> 
>>> 
>>> On 9/15/16, 10:31 AM, "dave_les...@apple.com on behalf of Dave Lester" <
>>> dave_les...@apple.com> wrote:
>>> 
>>>> How would cutting a 3.5.1 release possibly confuse users of the
>> software?
>>> It would be easy to document the change and to send release notes.
>>>> 
>>>> Given the bug’s critical nature and that it's a minor fix, I’m +1
>>> (non-binding) to a new release.
>>>> 
>>>> Dave
>>>> 
>>>>> On Sep 15, 2016, at 7:18 AM, Jeremiah D Jordan  wrote:
>>>>> 
>>>>> I’m with Jeff on this, 3.7 (bug fixes on 3.6) has already been
>> released
>>> with the fix.  Since the fix applies cleanly anyone is free to put it on
>>> top of 3.5 on their own if they like, but I see no reason to put out a
>>> 3.5.1 right now and confuse people further.
>>>>> 
>>>>> -Jeremiah
>>>>> 
>>>>> 
>>>>>> On Sep 15, 2016, at 9:07 AM, Jonathan Haddad 
>>> wrote:
>>>>>> 
>>>>>> As I follow up, I suppose I'm only advocating for a fix to the odd
>>

Re: sstableloader from java

2016-09-21 Thread Jeremiah D Jordan
Yes using the SSTableLoader class.  You can see the CqlBulkRecordWriter class 
for an example of writing out sstables to disk and then using the SSTableLoader 
class to stream them to a cluster.
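
For example, something along these lines (just a sketch: the keyspace/table, 
insert statement, and paths are made up, and the SSTableLoader step is left as 
a comment because its Client/connection plumbing differs between versions):

    import java.io.File;
    import org.apache.cassandra.io.sstable.CQLSSTableWriter;

    public class BulkLoadSketch
    {
        public static void main(String[] args) throws Exception
        {
            // Hypothetical keyspace/table and paths, purely for illustration.
            String schema = "CREATE TABLE ks.t (id int PRIMARY KEY, val text)";
            String insert = "INSERT INTO ks.t (id, val) VALUES (?, ?)";

            File dir = new File("/tmp/bulkload/ks/t");
            dir.mkdirs();

            // Step 1: write sstables locally with the public CQLSSTableWriter API.
            try (CQLSSTableWriter writer = CQLSSTableWriter.builder()
                                                           .inDirectory(dir)
                                                           .forTable(schema)
                                                           .using(insert)
                                                           .build())
            {
                writer.addRow(1, "one");
                writer.addRow(2, "two");
            }

            // Step 2: stream the generated sstables with SSTableLoader, roughly
            // what CqlBulkRecordWriter does:
            //   new SSTableLoader(dir, client, outputHandler).stream().get();
            // where 'client' supplies contact points/credentials and is
            // version specific, so see CqlBulkRecordWriter for the details.
        }
    }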

-Jeremiah

> On Sep 21, 2016, at 7:18 PM, Paul Weiss  wrote:
> 
> Hi,
> 
> Is it possible to call the sstableloader from java instead using the
> command line program? I have a process that uses the CQLSSTableWriter and
> generates the sstable files but am looking for an end to end process that
> bulk loads without any manual intervention.
> 
> Ideally would like to avoid forking another process so I can properly check
> for errors.
> 
> Thanks,
> -paul



Re: Proprietary Replication Strategies: Cassandra Driver Support

2016-10-07 Thread Jeremiah D Jordan
What kind of support are you thinking of?  All drivers should support them 
already; drivers shouldn’t care about replication strategy except when trying 
to do token aware routing.
But since anyone can make a custom replication strategy, drivers that do token 
aware routing just need to handle falling back to not doing token aware routing 
if a replication strategy they don’t know about is in use.
All the open source drivers I know of do this, so they should all “support” 
those strategies already.
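
Roughly, the fallback amounts to something like this (illustrative toy code 
only; none of these class or method names come from a real driver):

    import java.util.Arrays;
    import java.util.Collections;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    // Compute replicas for strategies the driver understands, otherwise
    // return nothing so the caller uses its default, non token-aware plan.
    public class TokenAwareFallbackSketch
    {
        private static final Set<String> KNOWN_STRATEGIES = new HashSet<>(
            Arrays.asList("SimpleStrategy", "NetworkTopologyStrategy"));

        static List<String> replicaHosts(String strategyClass, int token)
        {
            if (!KNOWN_STRATEGIES.contains(strategyClass))
                return Collections.emptyList(); // unknown/custom strategy -> no token awareness

            // Stand-in replica math, just for the demo.
            return Collections.singletonList("host-" + (token % 3));
        }

        public static void main(String[] args)
        {
            System.out.println(replicaHosts("SimpleStrategy", 42));     // token aware
            System.out.println(replicaHosts("EverywhereStrategy", 42)); // falls back
        }
    }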

-Jeremiah

> On Oct 7, 2016, at 1:02 PM, Prasenjit Sarkar  
> wrote:
> 
> Hi everyone,
> 
> To the best of my understanding that Datastax has proprietary replication
> strategies: Local and Everywhere which are not part of the open source
> Apache Cassandra project.
> 
> Do we know of any plans in the open source Cassandra driver community to
> support these two replication strategies? Would Datastax have a licensing
> concern if the open source driver community supported these strategies? I'm
> fairly new here and would like to understand the dynamics.
> 
> Thanks,
> Prasenjit



Re: [jira] [Commented] (CASSANDRA-12490) Add sequence distribution type to cassandra stress

2016-10-13 Thread Jeremiah D Jordan
I would guess Jake replied to the JIRA message that was sent to commits@, 
expecting the reply to end up going back to the ticket (which happens if you 
reply to something JIRA sends directly to you because you watched the ticket), 
but it instead went to dev@ because the emails sent to commits@ from JIRA have 
a reply-to of dev@, not back to the ticket.

-Jeremiah

> On Oct 13, 2016, at 7:40 PM, Ben Slater  wrote:
> 
> OK, I think it’s pretty unlikely to be this change as I didn’t change the
> existing code (certainly nothing near what is used by -pop) and also I just
> noticed you said you had the issue in 3.9 and CASS-12490 is destined for
> 3.10.
> 
> Also, last time I looked, I thought stress didn’t validate returned results
> for YAML specs. Did I miss something or did that get added recently? Can
> you add your actual command, etc to the ticket?
> 
> Anyway, I will try to do some more digging over the weekend as I still
> suspect there is something wrong (or at least unexpected) going on aside
> from this change.
> 
> (BTW - I noticed you moved the discussion from JIRA to the dev list. What’s
> the etiquette there?)
> 
> Cheers
> Ben
> 
> 
> 
> On Fri, 14 Oct 2016 at 09:02 Jake Luciani  wrote:
> 
>> No I'm not using a seq anywhere else then the command line
>> 
>> On Oct 13, 2016 4:40 PM, "Ben Slater (JIRA)"  wrote:
>> 
>>> 
>>>[ https://issues.apache.org/jira/browse/CASSANDRA-12490?
>>> page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&
>>> focusedCommentId=15573119#comment-15573119 ]
>>> 
>>> Ben Slater commented on CASSANDRA-12490:
>>> 
>>> 
>>> Just to check [~tjake] when you say "this also breaks validation", I
>>> assume you mean it breaks validation when you use the sequence
>> distribution
>>> type, not in the case where you don't use seq()?
>>> 
 Add sequence distribution type to cassandra stress
 --
 
Key: CASSANDRA-12490
URL: https://issues.apache.org/
>>> jira/browse/CASSANDRA-12490
Project: Cassandra
 Issue Type: Improvement
 Components: Tools
   Reporter: Ben Slater
   Assignee: Ben Slater
   Priority: Minor
Fix For: 3.10
 
Attachments: 12490-trunk.patch, 12490.yaml,
>>> cqlstress-seq-example.yaml
 
 
 When using the write command, cassandra stress sequentially generates
>>> seeds. This ensures generated values don't overlap (unless the sequence
>>> wraps) providing more predictable number of inserted records (and
>>> generating a base set of data without wasted writes).
 When using a yaml stress spec there is no sequenced distribution
>>> available. It think it would be useful to have this for doing initial
>> load
>>> of data for testing
>>> 
>>> 
>>> 
>>> --
>>> This message was sent by Atlassian JIRA
>>> (v6.3.4#6332)
>>> 
>> 
> -- 
> 
> Ben Slater
> Chief Product Officer
> Instaclustr: Cassandra + Spark - Managed | Consulting | Support
> +61 437 929 798



Re: Low hanging fruit crew

2016-10-19 Thread Jeremiah D Jordan
Unless the reviewer reviews the tests for content, you don’t know if they do or 
not.

-Jeremiah

> On Oct 19, 2016, at 10:52 AM, Jonathan Haddad  wrote:
> 
> Shouldn't the tests test the code for correctness?
> 
> On Wed, Oct 19, 2016 at 8:34 AM Jonathan Ellis  wrote:
> 
>> On Wed, Oct 19, 2016 at 8:27 AM, Benjamin Lerer <
>> benjamin.le...@datastax.com
>>> wrote:
>> 
>>> Having the test passing does not mean that a patch is fine. Which is why
>> we
>>> have a review check list.
>>> I never put a patch available without having the tests passing but most
>> of
>>> my patches never pass on the first try. We always make mistakes no matter
>>> how hard we try.
>>> The reviewer job is to catch those mistakes by looking at the patch from
>>> another angle. Of course, sometime, both of them fail.
>>> 
>> 
>> Agreed.  Review should not just be a "tests pass, +1" rubber stamp, but
>> actually checking the code for correctness.  The former is just process;
>> the latter actually catches problems that the tests would not.  (And this
>> is true even if the tests are much much better than ours.)
>> 
>> --
>> Jonathan Ellis
>> co-founder, http://www.datastax.com
>> @spyced
>> 



Re: Proposal - 3.5.1

2016-10-20 Thread Jeremiah D Jordan
My thinking was we keep doing tick/tock for 4.x.  Basically continue on for 
4.0.x / 4.x like we have been with 3.0.x / 3.x, just with some added guidance 
to people that the 4.x line is “development releases”.  The main problem I hear 
with the tick/tock stuff is that we won’t ever have “LTS” branches any more.  
So let’s change that and make the .0 releases LTS branches.

-Jeremiah

> On Oct 20, 2016, at 4:42 PM, Jeff Jirsa  wrote:
> 
> 
> 
> On 2016-10-20 14:21 (-0700), Jeremiah Jordan  wrote: 
>> In the original tick tock plan we would not have kept 4.0.x around.  So I am 
>> proposing a change for that and then we label the 3.x and 4.x releases as 
>> "development releases" or some other thing and have "yearly" LTS releases 
>> with .0.x.
>> Those are similar to the previous 1.2/2.0/2.1/2.2 and we are adding semi 
>> stable development releases as well which give people an easier way to try 
>> out new stuff than "build it yourself", which was the only way to do that in 
>> between the previous Big Bang releases.
>> 
> 
> This sounds reasonable to me. Would 4.(even) still be features and 4.(odd) 
> still be stability fixes? Or everything in 4.x is features and/or stability? 
> 



Re: Proposals for releases - 4.0 and beyond

2016-11-18 Thread Jeremiah D Jordan
I think the monthly releases are important; otherwise releases become an 
“event”.  The monthly cadence means releases are just a normal thing that 
happens.  So I like any of 3/4/5.

Sylvain's proposal sounds interesting to me.  My only concern would be with 
making sure we label things very clearly so that users understand which branch 
is the current “stable” branch.  The switch from “testing” to “stable” seems 
like a place that could cause confusion, but as long as we label everything 
well I think we can handle it.

That being said, I think it would be a good addition to the current model to 
make the “wait for .6 for stable” practice more explicit in the release plan, 
since this is what many users already do on their own.

So of those, I think I like Option 3 the most.  It keeps a monthly cadence, it makes 
explicit which branch we think is stable, and it keeps the number of active 
branches manageable.

-Jeremiah


> On Nov 18, 2016, at 5:49 PM, Jeff Jirsa  wrote:
> 
> With 3.10 voting in progress (take 3), 3.11 in December/January (probably?), 
> we should solidify the plan for 4.0.
> 
> I went through the archives and found a number of proposals. We (PMC) also 
> had a very brief chat in private to make sure we hadn’t missed any, and here 
> are the proposals that we’ve seen suggested. 
> 
> Option #1: Jon proposed [1] a feature release every 3 months and bugfixes for 
> 6 months after that.
> Option #2: Mick proposed [2] bimonthly feature, semver, labelling release 
> with stability/quality during voting, 3 GA branches at a time. 
> Option #3: Sylvain proposed [3] feature / testing / stable branches, Y 
> cadence for releases, X month rotation from feature -> testing -> stable -> 
> EOL (X to be determined). This is similar to an Ubuntu/Debian like release 
> schedule – I asked Sylvain for an example just to make sure I understood it, 
> and I’ve copied that to github at [4].
> Option #4: Jeremiah proposed [5] keeping monthly cadence, and every 12 months 
> break off X.0.Y which becomes LTS (same as 3.0.x now). This explicitly 
> excludes alternating tick/tock feature/bugfix for the monthly cadence on the 
> newest/feature/4.x branch. 
> Option #5: Jason proposed a revision to Jeremiah’s proposal such that 
> releases to the LTS branches are NOT tied to a monthly cadence, but are 
> released “as needed”, and the LTS branches are also “as needed”, not tied to 
> a fixed (annual/semi-annual/etc) schedule. 
> 
> Please use this thread as an opportunity to discuss these proposals or feel 
> free to make your own proposals. I think it makes sense to treat this like a 
> nomination phase of an election – let’s allow at least 72 hours for 
> submitting and discussing proposals, and then we’ll open a vote after that.
>   
> - Jeff
> 
> [1]: 
> https://lists.apache.org/thread.html/0b2ca82eb8c1235a4e44a406080729be78fb539e1c0cbca638cfff52@%3Cdev.cassandra.apache.org%3E
> [2]: 
> https://lists.apache.org/thread.html/674ef1c02997041af4b8950023b07b2f48bce3b197010ef7d7088662@%3Cdev.cassandra.apache.org%3E
> [3]: 
> https://lists.apache.org/thread.html/fcc4180b7872be4db86eae12b538eef34c77dcdb5b13987235c8f2bd@%3Cdev.cassandra.apache.org%3E
> [4]: https://gist.github.com/jeffjirsa/9bee187246ca045689c52ce9caed47bf
> [5]: 
> https://lists.apache.org/thread.html/0a3372b2f2b30fbeac04f7d5a214b203b18f3d69223e7ec9efb64776@%3Cdev.cassandra.apache.org%3E
> 
> 
> 
> 



Re: Collecting slow queries

2016-12-06 Thread Jeremiah D Jordan
Per the Fix Version on the ticket it is going to be in 3.10 when that is 
released.  Probably in the next week provided we don’t find any more show 
stopper bugs.

-Jeremiah

> On Dec 6, 2016, at 10:38 AM, Jan  wrote:
> 
> Hello Yoshi-san; 
> is this fix rolled into  Cassandra 3.7.0 ?  I do not see it in the 
> cassandra.yaml file.
> Is there anything special that needs to be downloaded for this feature to 
> show up that I am missing in  Cassandra 3.7.0.
> Thank you for your prompt response earlier, Jan
> 
> 
>On Monday, December 5, 2016 3:42 PM, Yoshi Kimoto  
> wrote:
> 
> 
> This? : https://issues.apache.org/jira/browse/CASSANDRA-12403
> 
> 2016-12-06 6:36 GMT+09:00 Jeff Jirsa :
> 
>> Should we reopen 6226? Tracing 0.1% doesn’t help find the outliers that
>> are slow but don’t time out (slow query log could help find large
>> partitions for users with infrequent but painful large partitions, far
>> easier than dumping sstables to json to identify them).
>> 
>> 
>> On 12/5/16, 1:28 PM, "sankalp kohli"  wrote:
>> 
>>> This is duped by a JIRA which is fixed in 3.2
>>> 
>>> https://issues.apache.org/jira/browse/CASSANDRA-6226
>>> 
>>> On Mon, Dec 5, 2016 at 12:15 PM, Jan  wrote:
>>> 
 HI Folks;
 is there a way for 'Collecting slow queries'  in the Apache Cassandra.
>> ?I
 am aware of the DSE product offering such an option, but need the
>> solution
 on Apache Cassandra.
 ThanksJan
>> 
> 



Re: Perf regression between 2.2.5 and 3.11

2017-01-19 Thread Jeremiah D Jordan
You may be getting perf issues from message coalescing depending on what CL you 
are testing with:
https://issues.apache.org/jira/browse/CASSANDRA-12676

Try your tests with:
otc_coalescing_strategy: DISABLED

> On Jan 19, 2017, at 4:28 PM, Andrew Whang  wrote:
> 
> Hi,
> 
> I'm seeing perf regressions (using cassandra-stress) between 2.2.5 and
> 3.11. I understand these versions are quite far apart, but just wondering
> if there are stress results publicly available that compare 2.x to 3.x?
> Thanks.



Re: Way to unsubscribe from mailing lists

2017-04-27 Thread Jeremiah D Jordan
It already sets those:

List-Help: 
List-Unsubscribe: 
List-Post: 
List-Id: 


> On Apr 27, 2017, at 10:43 AM, Eric Evans  wrote:
> 
> On Wed, Apr 26, 2017 at 11:16 AM, Jake Luciani  wrote:
>> Another option would be to add a unsubscribe header, not sure if we already
>> do but I think that causes gmail/outlook to add a unsubscribe button
>> 
>> http://www.list-unsubscribe.com/
> 
> Brilliant.  Sad that it would come to this, but brilliant.
> 
> -- 
> Eric Evans
> john.eric.ev...@gmail.com
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
> 



Re: Integrating vendor-specific code and developing plugins

2017-05-15 Thread Jeremiah D Jordan

> On May 15, 2017, at 6:57 PM, 大平怜  wrote:
> 
> Thanks for the discussion, all,
> 
>>> * What's included when shipped in tree?
>>> 
>>> Does every idea get merged in? Do we need 30 different Seed providers?  Who
>>> judges what's valuable enough to add?  Who maintains it when it needs
>>> updating?  If the maintainer can't be found, is it removed?  Shipped
>>> broken?  Does the contributed plugins go through the same review process?
>>> Do the contributors need to be committers?  Would CASSANDRA-12627 be merged
>>> in even if nobody saw the value?
> 
> If the rule is to never merge a feature that is pluggable, then it
> would be easy to make a decision, but if not, then we must anyway ask
> ourselves these questions every time a feature is proposed, and I
> think most of the questions are not specific to plugins but generic
> for any new proposals.

Agreed.  Include or not is the same decision we make for any JIRA.

> 
> 
>> In accordance with the idea that the codebase should be better tested, it
>> seems to me like things shouldn't be added that aren't testable.  If
>> there's a million unit tests that are insanely comprehensive but for some
>> reason can never be run, they serve exactly the same value as no tests.
> 
> I think we need to define what the "testable" means.  Does it mean at
> least one of the committers has full access to an environment where
> she can run the tests?  Also, does it mean the environment is
> integrated to the CI?

To me testable means that we can run the tests at the very least for every 
release, but ideally they would be run more often than that.  Especially with 
the push to not release unless the test board is all passing, we should not be 
releasing features that we don’t have a test board for.  Ideally that means we 
have it in ASF CI.  If there is someone that can commit to posting results of 
runs from an outside CI somewhere, then I think that could work as well, but 
that gets pretty cumbersome if we have to check 10 different CI dashboards at 
different locations before every release.

-Jeremiah
-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Potential block issue for 3.0.13: schema version id mismatch while upgrading

2017-05-30 Thread Jeremiah D Jordan
If 3.0.13 causes schema mismatch on upgrade, then maybe we should pull that and 
release 3.0.14 once 13559 is fixed, as that is a pretty bad place to get into.


> On May 30, 2017, at 6:39 PM, Jay Zhuang  wrote:
> 
> Seems the mail is marked as spam. So try forwarding with another email
> account.
> 
> Thanks,
> Jay
> 
> -- Forwarded message --
> From: Jay Zhuang 
> Date: Tue, May 30, 2017 at 2:22 PM
> Subject: Potential block issue for 3.0.13: schema version id mismatch while
> upgrading
> To: dev@cassandra.apache.org
> 
> 
> Hi,
> 
> While upgrading to 3.0.13 we found that the schema id is changed for the
> same schema. Which could cause cassandra unable to start and other issues
> related to UnknownColumnFamilyException. Ticket: CASSANDRA-13559
> 
> The problem is because the order of SchemaKeyspace tables is changed. Then
> the digest for the same schema is also changed:
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/
> java/org/apache/cassandra/schema/SchemaKeyspace.java#L311
> 
> I would suggest to have the older list back for digest calculation. But it
> also means 3.0.13 -> 3.0.14 upgrade will have the same problem. Any
> suggestion on that?
> 
> Thanks,
> Jay
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Is concurrent_batchlog_writes option used/implemented?

2017-06-15 Thread Jeremiah D Jordan
The project hosted docs can be found here:
http://cassandra.apache.org/doc/latest/configuration/cassandra_config_file.html 


If you find something wrong in those open a JIRA.

DataStax has a documentation feedback page here if you want to contact their 
documentation team: 
http://docs.datastax.com/en/landing_page/doc/landing_page/contact.html 


-Jeremiah


> On Jun 15, 2017, at 11:22 AM, Jason Brown  wrote:
> 
> Hey Tomas,
> 
> Thanks for finding these errors. Unfortunately, those are problems on the
> Datastax-hosted documentation, not the docs hosted by the Apache project.
> To fix those problems you should contact Datastax (I don't have a URL handy
> rn, but if one of the DS folks who follow this list can add one that would
> be great).
> 
> I can't look right now, but do we have similar documentation on the Apache
> docs?
> 
> Thanks,
> 
> Jason
> 
> On Thu, Jun 15, 2017 at 01:46 Tomas Repik  wrote:
> 
>> And yet another glitch in the doc at:
>> https://docs.datastax.com/en/cassandra/3.0/cassandra/configuration/configCassandra_yaml.html#configCassandra_yaml__cqlTruncateequest_timeout_in_ms
>> 
>> I guess it should be truncate_timeout_in_ms instead.
>> 
>> Is there a more proper way I should use to report these kind of issues? If
>> yes, thanks for giving any directions.
>> 
>> Tomas
>> 
>> - Original Message -
>>> Thanks for information I thought this would be the case ...
>>> 
>>> I found another option that is not documented properly:
>>> allocate_tokens_for_local_replication_factor [1] option is not found in
>> any
>>> config file instead the allocate_tokens_for_keyspace option is present. I
>>> guess it is the replacement for the former but I can't see it documented
>>> anywhere. Thanks for clarification.
>>> 
>>> Tomas
>>> 
>>> [1]
>>> 
>> https://docs.datastax.com/en/cassandra/3.0/cassandra/configuration/configCassandra_yaml.html#configCassandra_yaml__allocate_tokens_for_local_replication_factor
>>> 
>> 
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>> 
>> 



Re: CASSANDRA-9472 Reintroduce off heap memtables - patch to 3.0

2017-07-31 Thread Jeremiah D Jordan

> On Jul 31, 2017, at 12:17 PM, Jeff Jirsa  wrote:
> On 2017-07-29 10:02 (-0700), Jay Zhuang  wrote: 
>> Should we consider back-porting it to 3.0 for the community? I think
>> this is a performance regression instead of new feature. And we have the
>> feature in 2.1, 2.2.
>> 
> 
> Personally / individually, I'd much rather see 3.0 stabilize.

+1.  The feature is there in 3.11.x if you are running one of the use cases 
where this helps, and for most existing things 3.0 and 3.11 are about the same 
stability, so you can go to 3.11.x if you want to keep using the off heap stuff.

-Jeremiah
-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Compact Storage and SuperColumn Tables in 4.0/trunk

2017-09-19 Thread Jeremiah D Jordan
I think that all the work to support Compact Storage tables from CQL seems like 
wasted effort if we are going to tell people “just kidding, you have to migrate 
all your data”.  I do not think supporting “COMPACT STORAGE” as a table option 
matters one way or the other.  But I do think being able to read the data that 
was in a table created that way is something we need to have a path forward for.

> since thrift is not supported on trunk/4.0, it makes it much less appealing 
> or even necessary

I think that the fact thrift is not supported on trunk/4.0 makes accessing said 
data from CQL *MORE* necessary and appealing.

> possibility drop a Compact Storage flag and expose them as “normal" tables, 
> there was an idea of removing the Compact Tables from 4.x altogether. 

If we provide a way to drop the flag, but still access the data, I think that 
is fine and perfectly reasonable.  If the proposal here is that users who have 
data in COMPACT STORAGE tables have no way to upgrade to 4.0 and still access 
that data without exporting it to a brand new table, then I am against it.  Can 
you clarify which thing is being proposed?  It is not clear to me.

-Jeremiah


> On Sep 19, 2017, at 7:10 AM, Oleksandr Petrov  
> wrote:
> 
> As you may know, SuperColumn Tables did not work in 3.x the way they worked 
> in 2.x. In order to provide everyone with a reasonable upgrade path, we've 
> been working on CASSANDRA-12373[1], that brings in support for SuperColumn 
> tables as close to 2.x as possible. The patch is planned to land 
> cassandra-3.0 and cassandra-3.11 branches only, since the patch for trunk 
> will require even more work and, since thrift is not supported on trunk/4.0, 
> it makes it much less appealing or even necessary. The idea behind the 
> support for SuperColumns was always only to allow people to smoothly migrate 
> off them in 3.0/3.11 world, not to have them as a primary feature.
> 
> SuperColumns are not the only type of Compact Table, there are more. After 
> CASSANDRA-8099[2], Compact Tables are special-cased and have special schema 
> layout with some columns hidden from CQL, that allows them to be used from 
> Thrift. But, except for the fact they’re accessible from Thrift, there are no 
> advantages to use them with the new storage. In order to allow people to 
> “expose” the internal structure of the compact tables to make them fully 
> accessible in CQL, CASSANDRA-10857[3] was created.
> 
> In the light of the fact that 4.0 will not have reasonable SuperColumn 
> support (due to related complexity and amount of special-cases required to 
> support it in 4.0) and a possibility drop a Compact Storage flag and expose 
> them as “normal" tables, there was an idea of removing the Compact Tables 
> from 4.x altogether. 
> 
> 
> Leaving Compact Storage in 3.x only will make the table metadata a bit 
> lighter and allow us to remove some special cases required for their support. 
> Doing it during the major release, provided with a reasonable upgrade path 
> (same functionality from both Thrift and CQL for all compact tables, 
> including Super Column ones) through 3.x/3.11, sounds like the best option 
> that we have right now.
> 
> It’d be good if you could voice your support of this idea (or raise possible 
> concerns, if there are any).
> 
> 
> There will be additional discussion and a proposal on how to allow “online” 
> COMPACT STORAGE flag drop in CASSANDRA-10857 later this (or the following 
> week).
> 
> Best Regards, 
> Alex
> 
> [1] https://issues.apache.org/jira/browse/CASSANDRA-12373 
> 
> [2] https://issues.apache.org/jira/browse/CASSANDRA-8099 
> 
> [3] https://issues.apache.org/jira/browse/CASSANDRA-10857 
> 
> 





Re: Proposal to retroactively mark materialized views experimental

2017-10-02 Thread Jeremiah D Jordan
Hindsight is 20/20.  For 8099 this is the reason we cut the 2.2 release before 
8099 got merged.

But moving forward with where we are now, if we are going to start adding some 
experimental flags to things, then I would definitely put SASI on this list as 
well.

For both SASI and MV I don't know that adding a flag in the cassandra.yaml 
which prevents their use is the right way to go.  I would propose that we emit 
a WARN from the native protocol mechanism when a user runs an ALTER/CREATE 
statement that tries to use an experimental feature, and probably log it in the 
system.log as well.  So someone who is starting new development using them will 
get a warning showing up in cqlsh ("hey, the thing you just used is experimental, 
proceed with caution") and also in their logs.
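
To make this concrete, here is a minimal sketch (not code from any release) of what
such a warning could look like in a statement's validation path, using the existing
ClientWarn native-protocol warning mechanism; the class name, feature name, and call
site below are illustrative only.

// Sketch only: emit a client-visible warning when an experimental feature is used.
// ClientWarn is the existing native protocol warning channel; everything else here
// is made up for illustration.
import org.apache.cassandra.service.ClientWarn;

public final class ExperimentalFeatureWarnings
{
    private ExperimentalFeatureWarnings() {}

    // Imagined to be called from e.g. CreateViewStatement#validate().
    public static void warnOnUse(String featureName)
    {
        String msg = featureName + " is an experimental feature and is not recommended "
                   + "for production use; proceed with caution.";
        // Returned to cqlsh/drivers as a native protocol warning on the response...
        ClientWarn.instance.warn(msg);
        // ...and it could also be logged to the system.log here (logger omitted).
    }
}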

These things are live on clusters right now, and I would not want someone to 
upgrade their cluster to a new *patch* release and suddenly something that may 
have been working for them now does not function.  Anyway, we need to be 
careful about how this gets put into practice if we are going to do it 
retroactively.

-Jeremiah


> On Oct 1, 2017, at 5:36 PM, Josh McKenzie  wrote:
> 
>> 
>> I think committing 8099, or at the very least, parts of it, behind an
>> experimental flag would have been the right thing to do.
> 
> With a major refactor like that, it's a staggering amount of extra work to
> have a parallel re-write of core components of a storage engine accessible
> in parallel to the major based on an experimental flag in the same branch.
> I think the complexity in the code-base of having two such channels in
> parallel would be an altogether different kind of burden along with making
> the work take considerably longer. The argument of modularizing a change
> like that, however, is something I can get behind as a matter of general
> principle. As we discussed at NGCC, the amount of static state in the C*
> code-base makes this an aspirational goal rather than a reality all too
> often, unfortunately.
> 
> Not looking to get into the discussion of the appropriateness of 8099 and
> other major refactors like it (nio MessagingService for instance) - but
> there's a difference between building out new features and shielding the
> code-base and users from their complexity and reliability and refactoring
> core components of the code-base to keep it relevant.
> 
> On Sun, Oct 1, 2017 at 5:01 PM, Dave Brosius  wrote:
> 
>> triggers
>> 
>> 
>> On 10/01/2017 11:25 AM, Jeff Jirsa wrote:
>> 
>>> Historical examples are anything that you wouldn’t bet your job on for
>>> the first release:
>>> 
>>> Udf/uda in 2.2
>>> Incremental repair - would have yanked the flag following 9143
>>> SASI - probably still experimental
>>> Counters - all sorts of correctness issues originally, no longer true
>>> since the rewrite in 2.1
>>> Vnodes - or at least shuffle
>>> CDC - is the API going to change or is it good as-is?
>>> CQL - we’re on v3, what’s that say about v1?
>>> 
>>> Basically anything where we can’t definitively say “this feature is going
>>> to work for you, build your product on it” because companies around the
>>> world are trying to make that determination on their own, and they don’t
>>> have the same insight that the active committers have.
>>> 
>>> The transition out we could define as a fixed number of releases or a dev@
>>> vote, I don’t think you’ll find something that applies to all experimental
>>> features, so being flexible is probably the best bet there
>>> 
>>> 
>>> 
>> 
>> 
>> 





Re: Proposal to retroactively mark materialized views experimental

2017-10-02 Thread Jeremiah D Jordan
> Only emitting a warning really reduces visibility where we need it: in the 
> development process.

How does emitting a native protocol warning reduce visibility during the 
development process?  If you run CREATE MV and cqlsh then prints out a giant 
warning statement about how it is an experimental feature, I think that is 
pretty visible during development.

I guess I can see just blocking new ones unless a flag is set, but we need to be 
careful here.  We need to make sure we don't cause a problem for someone who 
is using them currently, even with all the edge case issues they have now.

-Jeremiah


> On Oct 2, 2017, at 2:01 PM, Blake Eggleston  wrote:
> 
> Yeah, I'm not proposing that we disable MVs in existing clusters.
> 
> 
> On October 2, 2017 at 10:58:11 AM, Aleksey Yeshchenko (alek...@apple.com) 
> wrote:
> 
> The idea is to check the flag in CreateViewStatement, so creation of new MVs 
> doesn’t succeed without that flag flipped.  
> 
> Obviously, just disabling existing MVs working in a minor would be silly.  
> 
> As for the warning - yes, that should also be emitted. Unconditionally.  
> 
> —  
> AY  
> 
> On 2 October 2017 at 18:18:52, Jeremiah D Jordan (jeremiah.jor...@gmail.com) 
> wrote:  
> 
> These things are live on clusters right now, and I would not want someone to 
> upgrade their cluster to a new *patch* release and suddenly something that 
> may have been working for them now does not function. Anyway, we need to be 
> careful about how this gets put into practice if we are going to do it 
> retroactively. 





Re: [VOTE] Release Apache Cassandra 3.11.1

2017-10-02 Thread Jeremiah D Jordan
Jeff,

TL;DR 3.11.0 shows it for them as well.  See 
https://issues.apache.org/jira/browse/CASSANDRA-13900 
 for the rest of the 
story.

-Jeremiah

> On Oct 2, 2017, at 4:47 PM, Jeff Jirsa  wrote:
> 
> Thomas, did you see this on 3.11.0 as well, or have you not tried 3.11.0 (I
> know you probably want fixes from 3.11.1, but let's just clarify that this
> is or is not a regression).
> 
> If it's not a regression, we should ship this and then hopefully we'll spin
> a 3.11.2 as soon as this is fixed.
> 
> If it is a regression, I'll flip my vote to -1.
> 
> 
> 
> On Mon, Oct 2, 2017 at 1:29 PM, Steinmaurer, Thomas <
> thomas.steinmau...@dynatrace.com> wrote:
> 
>> Jon,
>> 
>> please see my latest comment + attached screen from our monitoring here:
>> https://issues.apache.org/jira/browse/CASSANDRA-13754?
>> focusedCommentId=16188758&page=com.atlassian.jira.
>> plugin.system.issuetabpanels:comment-tabpanel#comment-16188758
>> 
>> Thanks,
>> Thomas
>> 
>> -Original Message-
>> From: Jon Haddad [mailto:jonathan.had...@gmail.com] On Behalf Of Jon
>> Haddad
>> Sent: Montag, 02. Oktober 2017 22:09
>> To: dev@cassandra.apache.org
>> Subject: Re: [VOTE] Release Apache Cassandra 3.11.1
>> 
>> You’re saying the same memory leak happens under 3.11?
>> 
>>> On Oct 2, 2017, at 1:04 PM, Aleksey Yeshchenko 
>> wrote:
>>> 
>>> Thomas,
>>> 
>>> I would maybe agree with waiting for a while because of it, if we had a
>> proper fix at least under review - or in progress by someone.
>>> 
>>> But this is not a regression, and there’s been a lot of fixes
>> accumulated and not released yet. Arguable worse to hold them back :\
>>> 
>>> —
>>> AY
>>> 
>>> On 2 October 2017 at 20:54:38, Steinmaurer, Thomas (
>> thomas.steinmau...@dynatrace.com) wrote:
>>> 
>>> Jeff,
>>> 
>>> even if it is not a strict regression, this currently forces us to do a
>> rolling restart every ~ 72hrs to be on the safe-side with -Xmx8G. Luckily
>> this is just a loadtest environment. We don't have 3.11 in production yet.
>>> 
>>> I can offer to immediately deploy a snapshot build into our loadtest
>> environment, in case this issue gets attention and a fix needs verification
>> at constant load.
>>> 
>>> Thanks,
>>> Thomas
>>> 
>>> -Original Message-
>>> From: Jeff Jirsa [mailto:jji...@gmail.com]
>>> Sent: Montag, 02. Oktober 2017 20:04
>>> To: Cassandra DEV 
>>> Subject: Re: [VOTE] Release Apache Cassandra 3.11.1
>>> 
>>> +1
>>> 
>>> ( Somewhat concerned that
>>> https://issues.apache.org/jira/browse/CASSANDRA-13754 may not be fixed,
>> but it's not a strict regression ? )
>>> 
>>> 
>>> 
>>> On Mon, Oct 2, 2017 at 10:58 AM, Michael Shuler 
>>> wrote:
>>> 
 I propose the following artifacts for release as 3.11.1.
 
 sha1: 983c72a84ab6628e09a78ead9e20a0c323a005af
 Git:
 http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=
 shortlog;h=refs/tags/3.11.1-tentative
 Artifacts:
 https://repository.apache.org/content/repositories/
 orgapachecassandra-1151/org/apache/cassandra/apache-cassandra/3.11.1/
 Staging repository:
 https://repository.apache.org/content/repositories/
 orgapachecassandra-1151/
 
 The Debian packages are available here:
 http://people.apache.org/~mshuler
 
 The vote will be open for 72 hours (longer if needed).
 
 [1]: (CHANGES.txt) https://goo.gl/dZCRk8
 [2]: (NEWS.txt) https://goo.gl/rh24MX
 
 
 
>>> 
>>> 
>> 
>> 
>> 

Re: Proposal to retroactively mark materialized views experimental

2017-10-02 Thread Jeremiah D Jordan
>>>>
>>>> --> I find this pretty extreme. Now we have an existing feature sitting  
>>>> there in the base code but forbidden from version xxx onward.  
>>>> 
>>>> Since when do we start removing feature in a patch release ?  
>> (forbidding  
>>> to  
>>>> create new MV == removing the feature, defacto)  
>>>> 
>>>> Even the Thrift protocol has gone through a long process of deprecation  
>>> and  
>>>> will be removed on 4.0  
>>>> 
>>>> And if we start opening the Pandora box like this, what's next ?  
>>> Forbidding  
>>>> to create SASI index too ? Removing Vnodes ?  
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Mon, Oct 2, 2017 at 8:16 PM, Jeremiah D Jordan <  
>>> jeremiah.jor...@gmail.com  
>>>>> wrote:  
>>>> 
>>>>>> Only emitting a warning really reduces visibility where we need it:  
>> in  
>>>>> the development process.  
>>>>> 
>>>>> How does emitting a native protocol warning reduce visibility during  
>> the  
>>>>> development process? If you run CREATE MV and cqlsh then prints out a  
>>>>> giant warning statement about how it is an experimental feature I  
>> think  
>>>>> that is pretty visible during development?  
>>>>> 
>>>>> I guess I can see just blocking new ones without a flag set, but we  
>> need  
>>>>> to be careful here. We need to make sure we don’t cause a problem for  
>>>>> someone that is using them currently, even with all the edge cases  
>>> issues  
>>>>> they have now.  
>>>>> 
>>>>> -Jeremiah  
>>>>> 
>>>>> 
>>>>>> On Oct 2, 2017, at 2:01 PM, Blake Eggleston   
>>>>> wrote:  
>>>>>> 
>>>>>> Yeah, I'm not proposing that we disable MVs in existing clusters.  
>>>>>> 
>>>>>> 
>>>>>> On October 2, 2017 at 10:58:11 AM, Aleksey Yeshchenko (  
>>> alek...@apple.com)  
>>>>> wrote:  
>>>>>> 
>>>>>> The idea is to check the flag in CreateViewStatement, so creation of  
>>> new  
>>>>> MVs doesn’t succeed without that flag flipped.  
>>>>>> 
>>>>>> Obviously, just disabling existing MVs working in a minor would be  
>>> silly.  
>>>>>> 
>>>>>> As for the warning - yes, that should also be emitted.  
>> Unconditionally.  
>>>>>> 
>>>>>> —  
>>>>>> AY  
>>>>>> 
>>>>>> On 2 October 2017 at 18:18:52, Jeremiah D Jordan (  
>>>>> jeremiah.jor...@gmail.com) wrote:  
>>>>>> 
>>>>>> These things are live on clusters right now, and I would not want  
>>>>> someone to upgrade their cluster to a new *patch* release and suddenly  
>>>>> something that may have been working for them now does not function.  
>>>>> Anyway, we need to be careful about how this gets put into practice if  
>>> we  
>>>>> are going to do it retroactively.  
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>> 
>>> 
>>> 
>>> 
>> 



Re: Proposal to retroactively mark materialized views experimental

2017-10-03 Thread Jeremiah D Jordan
Thanks for bringing this up Kurt, it is a fair point.  Given the work that 
Paulo and Zhao have done to get MV’s in shape, what are the outstanding issues 
that would warrant making them experimental?



> On Oct 3, 2017, at 5:56 AM, kurt greaves  wrote:
> 
> And finally, back onto the original topic. I'm not convinced that MV's need
> this treatment now. Zhao and Paulo (and others+reviewers) have made quite a
> lot of fixes, granted there are still some outstanding bugs but the
> majority of bad ones have been fixed in 3.11.1 and 3.0.15, the remaining
> bugs mostly only affect views with a poor data model. Plus we've already
> required the known broken components require a flag to be turned on. Also
> at this point it's not worth making them experimental because a lot of
> users are already using them, it's a bit late to go and do that. We should
> just continue to try and fix them, or where not possible clearly document
> use cases that should be avoided.



Re: Proposal to retroactively mark materialized views experimental

2017-10-03 Thread Jeremiah D Jordan
So for some perspective here, how do users who do not get the guarantees of 
MV’s implement this on their own?  They used logged batches.

Pseudo CQL here, but you should get the picture:

If they don’t ever update data, they do it like so, and it is pretty safe:
BEGIN BATCH
  INSERT INTO tablea (...) VALUES (blah, ...);
  INSERT INTO tableb (...) VALUES (blahview, ...);
APPLY BATCH;

If they do update data, they likely do it like so, and get it wrong in the face 
of concurrency:
SELECT * FROM tablea WHERE key = blah;

BEGIN BATCH
  INSERT INTO tablea (...) VALUES (blah, ...);
  INSERT INTO tableb (...) VALUES (blahview, ...);
  DELETE FROM tableb WHERE key = oldblahview;
APPLY BATCH;

A sophisticated user that understands the concurrency issues may well try to 
implement it like so:

SELECT key, col1, col2 FROM tablea WHERE key=blah;

BEGIN BATCH
  UPDATE tablea SET col1 = new1, col2 = new2 WHERE key = blah IF col1 = old1 AND col2 = old2;
  UPDATE tableb SET viewc1 = new2, viewc2 = blah WHERE key = new1;
  DELETE FROM tableb WHERE key = old1;
APPLY BATCH;

And it wouldn’t work because you can only use LWT in a BATCH if all updates 
have the same partition key value, and the whole point of a view most of the 
time is that it doesn't (and there are other issues with this, like most likely 
needing to use UUIDs or something else to distinguish between concurrent 
updates, that are not realized until it is too late).

A user who does not dig in and understand how MV's work most likely also does 
not dig in to understand the trade-offs and drawbacks of logged batches to 
multiple tables across different partition keys.  Or even necessarily of 
read-before-writes, and concurrent updates and the races inherent in them.  I 
would guess that using MV's, even as they are today, is *safer* for these users 
than rolling their own.  I have seen these patterns implemented by people many 
times, including the "broken in the face of concurrency" version.  So let's 
please not try to argue that a casual user who does not dig in to the 
specifics of feature A is going to dig in and understand the specifics of any 
other feature.  So yes, I would prefer my bank to use MV's as they are today 
over rolling their own, and getting it even more wrong.

Now, even given all that, if we want to warn users of the pitfalls of using 
MV's, then let's do that.  But let's keep some perspective on how things 
actually get used.

-Jeremiah

> On Oct 3, 2017, at 8:12 PM, Benedict Elliott Smith <_...@belliottsmith.com> 
> wrote:
> 
> While many users may apparently be using MVs successfully, the problem is how 
> few (if any) know what guarantees they are getting.  Since we aren’t even 
> absolutely certain ourselves, it cannot be many.  Most of the shortcomings we 
> are aware of are complicated, concern failure scenarios and aren’t fully 
> explained; i.e. if you’re lucky they’ll never be a problem, but some users 
> must surely be bitten, and they won’t have had fair warning.  The same goes 
> for as-yet undiscovered edge cases.
> 
> It is my humble opinion that averting problems like this for just a handful 
> of users, that cannot readily tolerate corruption, offsets any inconvenience 
> we might cause to those who can.
> 
> For the record, while it’s true that detecting inconsistencies is as much of 
> a problem for user-rolled solutions, it’s worth remembering that the 
> inconsistencies themselves are not equally likely:
> 
> In cases where C* is not the database of record, it is quite easy to provide 
> very good consistency guarantees when rolling your own
> Conversely, a global-CAS with synchronous QUORUM updates that are retried 
> until success, while much slower, also doesn’t easily suffer these 
> consistency problems, and is the naive approach a user might take if C* were 
> the database of record
> 
> Given our approach isn’t uniformly superior, I think we should be very 
> cautious about how it is made available until we’re very confident in it, and 
> we and the community fully understand it.
> 
> 
>> On 3 Oct 2017, at 18:51, kurt greaves  wrote:
>> 
>> Lots of users are already using MV's, believe it or not in some cases quite
>> effectively and also on older versions which were still exposed to a lot of
>> the bugs that cause inconsistencies. 3.11.1 has come a long way since then
>> and I think with a bit more documentation around the current issues marking
>> MV's as experimental is unnecessary and likely annoying for current users.
>> On that note we've already had complaints about changing defaults and
>> behaviours willy nilly across majors and minors, I can't see this helping
>> our cause. Sure, you can make it "seamless" from an upgrade perspective,
>> but that doesn't account for every single way operators do things. I'm sure
>> someone will express surprise when they run up a new cluster or datacenter
>> for testing with default config and find out that they have to enable MV's.
>> Meanwhile they've been using them the whole time and haven't had any major
>> issues because they didn't touch the edge cases.
>> 
>> I'd like to point out that introducing "experimental" features sets a
>> precedent f

Re: V5 as a protocol beta version in 3.11

2017-11-07 Thread Jeremiah D Jordan
My 2 cents.  When we added V5 to 3.x, wasn't it added as a beta protocol for 
the tick/tock stuff, with the understanding that when a new version came out it 
could well break the older releases' V5 beta stuff, or at the very least add new 
things to V5?  So I see no reason to add more new features to 3.11's v5.

-Jeremiah

> On Nov 7, 2017, at 9:41 AM, Oleksandr Petrov  
> wrote:
> 
> Hi everyone,
> 
> Currently, 3.11 supports V5 as a protocol version. However, all new
> features are now going to 4.0, which is going to be a new feature release.
> 
> Right now we have two v5 features:
> 
>   - CASSANDRA-10786 
>   - CASSANDRA-12838 
> 
> 
> #12838 is adding duration type, which is a nice addition. #10786 is also
> useful, but is more of an edge cases for users with huge clusters and/or
> frequent schema changes.
> 
> If we leave v5 in 3.11, we'll have to always backport all v5 features to
> 3.11. This is something that hasn't been done in #10786. So the question
> is: are we ready to commit to and support v5 in 3.11 "forever", or should we
> stop before it goes too far and remove v5 from 3.11, since it's still in beta
> there.
> 
> Looking forward to hear your opinion,
> 
> 
> -- 
> Alex Petrov





Re: URGENT: CASSANDRA-14092 causes Data Loss

2018-01-25 Thread Jeremiah D Jordan
If you aren't getting an error, then I agree, that is very bad.  Looking at the 
3.0 code it looks like the assertion checking for overflow was dropped 
somewhere along the way; I had only been looking at 2.1, where you get an 
assertion error that fails the query.
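
For anyone following along, here is a tiny self-contained illustration (not the
server code) of the arithmetic behind CASSANDRA-14092: the local deletion time of
an expiring cell is kept as an int of seconds since the epoch, so the maximum
20-year TTL applied to a write made after January 2018 pushes it past 2^31 - 1
(the year-2038 limit) and wraps negative if nothing asserts on it.

// Illustration only of the int overflow; table/column handling is not shown.
public class TtlOverflowDemo
{
    static final int MAX_TTL_SECONDS = 20 * 365 * 24 * 60 * 60; // 630,720,000 (20 years)

    public static void main(String[] args)
    {
        int nowInSeconds = (int) (System.currentTimeMillis() / 1000L);
        int localDeletionTime = nowInSeconds + MAX_TTL_SECONDS; // plain int math, can wrap

        System.out.println("now                = " + nowInSeconds);
        System.out.println("now + 20 year TTL  = " + localDeletionTime);
        System.out.println("wrapped/overflowed = " + (localDeletionTime < nowInSeconds));
        // A wrapped (negative) deletion time makes the cell look like it expired long
        // ago, which is how the data silently disappears without the write failing.
    }
}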

-Jeremiah

> On Jan 25, 2018, at 2:21 PM, Anuj Wadehra  
> wrote:
> 
> 
> Hi Jeremiah,
> Validation is on the TTL value, not on (system_time + TTL). You can test it 
> with the example below. The insert is successful, the overflow happens silently, 
> and the data is lost:
> 
> create table test(name text primary key, age int);
> insert into test(name, age) values('test_20yrs', 30) USING TTL 630720000;
> select * from test where name='test_20yrs';
> 
>  name | age
> ------+-----
> 
> (0 rows)
> 
> insert into test(name, age) values('test_20yr_plus_1', 30) USING TTL 630720001;
> InvalidRequest: Error from server: code=2200 [Invalid query] 
> message="ttl is too large. requested (630720001) maximum (630720000)"
> 
> Thanks
> Anuj
>On Friday 26 January 2018, 12:11:03 AM IST, J. D. Jordan 
>  wrote:  
> 
> Where is the data loss?  Does the INSERT operation return successfully to the 
> client in this case?  From reading the linked issues it sounds like you get 
> an error client-side.
> 
> -Jeremiah
> 
>> On Jan 25, 2018, at 1:24 PM, Anuj Wadehra  
>> wrote:
>> 
>> Hi,
>> 
>> For all those people who use MAX TTL=20 years for inserting/updating data in 
>> production, https://issues.apache.org/jira/browse/CASSANDRA-14092 can 
>> silently cause irrecoverable Data Loss. This seems like a certain TOP MOST 
>> BLOCKER to me. I think the category of the JIRA must be raised to BLOCKER 
>> from Major. Unfortunately, the JIRA is still "Unassigned" and no one seems 
>> to be actively working on it. Just like any other critical vulnerability, 
>> this vulnerability demands immediate attention from some very experienced 
>> folks to bring out an Urgent Fast Track Patch for all currently Supported 
>> Cassandra versions 2.1,2.2 and 3.x. As per my understanding of the JIRA 
>> comments, the changes may not be that trivial for older releases. So, 
>> community support on the patch is very much appreciated. 
>> 
>> Thanks
>> Anuj
> 





Re: CASSANDRA-14183 review request -> logback upgrade to fix CVE

2018-02-13 Thread Jeremiah D Jordan
I don’t think we need to stop the vote.  This CVE has been around for a while 
(3/13/2017), and does affect any install I have ever seen.  It affects users 
who manually enable some specific logback features using the SocketServer or 
ServerSocketReceiver component which are not used in our default settings (or 
by any install I have ever seen).

-Jeremiah

> On Feb 13, 2018, at 11:48 AM, Jason Brown  wrote:
> 
> Ariel,
> 
> If this is a legit CVE, then we would want to patch all the current
> versions we support - which is 2.1 and higher.
> 
> Also, is this worth stopping the current open vote for this patch? (Not in
> a place to look at the patch and affects to impacted branches right now).
> 
> Jason
> 
> On Tue, Feb 13, 2018 at 08:43 Ariel Weisberg  wrote:
> 
>> Hi,
>> 
>> Seems like users could conceivably be using the vulnerable component. Also
>> seems like we potentially need to do this as far back as 2.1?
>> 
>> Anyone else have an opinion before I commit this? What version to start
>> from?
>> 
>> Ariel
>> 
>> On Tue, Feb 13, 2018, at 5:59 AM, Thiago Veronezi wrote:
>>> Hi dev team,
>>> 
>>> Sorry to keep bothering you.
>>> 
>>> This is just a friendly reminder that I would like to contribute to this
>>> project starting with a fix for CASSANDRA-14183
>>> .
>>> 
>>> []s,
>>> Thiago.
>>> 
>>> 
>>> 
>>> On Tue, Jan 30, 2018 at 8:05 AM, Thiago Veronezi 
>>> wrote:
>>> 
 Hi dev team,
 
 Can one of you guys take a look on this jira ticket?
 https://issues.apache.org/jira/browse/CASSANDRA-14183
 
>>>> It has a patch available for a known security issue with one of the
>>>> dependencies. It involves only trivial code changes. It should be
 straightforward to review it. Any feedback is very welcome.
 
 Thanks,
 Thiago
 
>> 
>> 
>> 





Re: CASSANDRA-14183 review request -> logback upgrade to fix CVE

2018-02-13 Thread Jeremiah D Jordan
s/does affect/does not affect/

> On Feb 13, 2018, at 11:57 AM, Jeremiah D Jordan  
> wrote:
> 
> I don’t think we need to stop the vote.  This CVE has been around for a while 
> (3/13/2017), and does affect any install I have ever seen.  It affects users 
> who manually enable some specific logback features using the SocketServer or 
> ServerSocketReceiver component which are not used in our default settings (or 
> by any install I have ever seen).
> 
> -Jeremiah
> 
>> On Feb 13, 2018, at 11:48 AM, Jason Brown  wrote:
>> 
>> Ariel,
>> 
>> If this is a legit CVE, then we would want to patch all the current
>> versions we support - which is 2.1 and higher.
>> 
>> Also, is this worth stopping the current open vote for this patch? (Not in
>> a place to look at the patch and affects to impacted branches right now).
>> 
>> Jason
>> 
>> On Tue, Feb 13, 2018 at 08:43 Ariel Weisberg  wrote:
>> 
>>> Hi,
>>> 
>>> Seems like users could conceivably be using the vulnerable component. Also
>>> seems like we potentially need to do this as far back as 2.1?
>>> 
>>> Anyone else have an opinion before I commit this? What version to start
>>> from?
>>> 
>>> Ariel
>>> 
>>> On Tue, Feb 13, 2018, at 5:59 AM, Thiago Veronezi wrote:
>>>> Hi dev team,
>>>> 
>>>> Sorry to keep bothering you.
>>>> 
>>>> This is just a friendly reminder that I would like to contribute to this
>>>> project starting with a fix for CASSANDRA-14183
>>>> <https://issues.apache.org/jira/browse/CASSANDRA-14183>.
>>>> 
>>>> []s,
>>>> Thiago.
>>>> 
>>>> 
>>>> 
>>>> On Tue, Jan 30, 2018 at 8:05 AM, Thiago Veronezi 
>>>> wrote:
>>>> 
>>>>> Hi dev team,
>>>>> 
>>>>> Can one of you guys take a look on this jira ticket?
>>>>> https://issues.apache.org/jira/browse/CASSANDRA-14183
>>>>> 
>>>>> It has a patch available for a known security issue with one of the
>>>>> dependencies. It involves only trivial code changes. It should be
>>>>> straightforward to review it. Any feedback is very welcome.
>>>>> 
>>>>> Thanks,
>>>>> Thiago
>>>>> 
>>> 
>>> 
>>> 
> 
> 
> 





Re: Expensive metrics?

2018-02-22 Thread Jeremiah D Jordan
Re: nanoTime vs currentTimeMillis, there is a good blog post here about the 
timing of both and how your choice of Linux clock source can drastically affect 
the speed of the calls; it also shows that in general on Linux there is no 
perf improvement for one over the other.
http://pzemtsov.github.io/2017/07/23/the-slow-currenttimemillis.html
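
For anyone who wants to see this on their own box, here is a naive, self-contained
sketch (not the JMH benchmark referenced in this thread) comparing the per-call cost
of the two clocks; treat the numbers as rough, since a proper benchmark needs
JMH-style warmup and blackholing, and the result depends heavily on the clocksource.

// Rough timing of System.nanoTime() vs System.currentTimeMillis().
public class ClockCallCost
{
    public static void main(String[] args)
    {
        final int iterations = 10_000_000;
        long blackhole = 0; // keep the JIT from dropping the calls entirely

        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++)
            blackhole += System.nanoTime();
        long nanoTimeNsPerCall = (System.nanoTime() - start) / iterations;

        start = System.nanoTime();
        for (int i = 0; i < iterations; i++)
            blackhole += System.currentTimeMillis();
        long currentTimeMillisNsPerCall = (System.nanoTime() - start) / iterations;

        System.out.println("nanoTime          ~" + nanoTimeNsPerCall + " ns/call");
        System.out.println("currentTimeMillis ~" + currentTimeMillisNsPerCall + " ns/call");
        System.out.println("(blackhole=" + blackhole + ")");
    }
}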

> On Feb 22, 2018, at 11:01 AM, Blake Eggleston  wrote:
> 
> Hi Micke,
> 
> This is really cool, thanks for taking the time to investigate this. I 
> believe the metrics around memtable insert time come in handy in identifying 
> high partition contention in the memtable. I know I've been involved in a 
> situation over the past year where we got actionable info from this metric. 
> Reducing resolution to milliseconds is probably a no go since most things in 
> this path should complete in less than a millisecond. 
> 
> Revisiting the use of the codahale metrics in the hot path like this 
> definitely seems like a good idea though. I don't think it's been something 
> we've talked about a lot, and it definitely looks like we could benefit from 
> using something more specialized here. I think it's worth doing, especially 
> since there won't be any major changes to how we do threading in 4.0. It's 
> probably also worth opening a JIRA and investigating the calls to nano time. 
> We at least need microsecond resolution here, and there could be something we 
> haven't thought of? It's worth a look at least.
> 
> Thanks,
> 
> Blake
> 
> On 2/22/18, 6:10 AM, "Michael Burman"  wrote:
> 
>Hi,
> 
>I wanted to get some input from the mailing list before making a JIRA 
>and potential fixes. I'll touch the performance more on latter part, but 
>there's one important question regarding the write latency metric 
>recording place. Currently we measure the writeLatency (and metric write 
>sampler..) in ColumnFamilyStore.apply() and this is also the metric we 
>then replicate to Keyspace metrics etc.
> 
>This is an odd place for writeLatency. Not to mention it is in a 
>hot-path of Memtable-modifications, but it also does not measure the 
>real write latency, since it completely ignores the CommitLog latency in 
>that same process. Is the intention really to measure 
>Memtable-modification latency only or the actual write latencies?
> 
>Then the real issue.. this single metric is a cause of huge overhead in 
>Memtable processing. There are several metrics / events in the CFS apply 
>method, including metric sampler, storageHook reportWrite, 
>colUpdateTimeDeltaHistogram and metric.writeLatency. These are not free 
>at all when it comes to the processing. I made a small JMH benchmark 
>here: https://gist.github.com/burmanm/b5b284bc9f1d410b1d635f6d3dac3ade 
>that I'll be referring to.
> 
>The most offending of all these metrics is the writeLatency metric. What 
>it does is update the latency in codahale's timer, doing a histogram 
>update and then going through all the parent metrics also which update 
>the keyspace writeLatency and globalWriteLatency. When measuring the 
>performance of Memtable.put with parameter of 1 partition (to reduce the 
>ConcurrentSkipListMap search speed impact - that's separate issue and 
>takes a little bit longer to solve although I've started to prototype 
>something..) on my machine I see 1.3M/s performance with the metric and 
>when it is disabled the performance climbs to 4M/s. So the overhead for 
>this single metric is ~2/3 of total performance. That's insane. My perf 
>stats indicate that the CPU is starved as it can't get enough data in.
> 
>Removing the replication from TableMetrics to the Keyspace & global 
>latencies in the write time (and doing this when metrics are requested 
>instead) improves the performance to 2.1M/s on my machine. It's an 
>improvement, but it's still huge amount. Even when we pressure the 
>ConcurrentSkipListMap with 100 000 partitions in one active Memtable, 
>the performance drops by about ~40% due to this metric, so it's never free.
> 
>i did not find any discussion replacing the metric processing with 
>something faster, so has this been considered before? At least for these 
>performance sensitive ones. The other issue is obviously the use of 
>System.nanotime() which by itself is very slow (two System.nanotime() 
>calls eat another ~1M/s from the performance)
> 
>My personal quick fix would be to move writeLatency to Keyspace.apply, 
>change write time aggregates to read time processing (metrics are read 
>less often than we write data) and maybe even reduce the nanotime -> 
>currentTimeMillis (even given it's relative lack of precision). That is 
>- if these metrics make any sense at all at CFS level? Maybe these 
>should be measured from the network processing time (including all the 
>deserializations and such) ? Especia

Re: Debug logging enabled by default since 2.2

2018-03-19 Thread Jeremiah D Jordan
People seem hung up on DEBUG here.  The goal of CASSANDRA-10241 was to clean up 
the system.log so that it had a very high "signal" in terms of what was logged to 
it synchronously, but without reducing the ability of the logs to allow people to 
solve problems and perform post-mortem analysis of issues.  We have informational 
log messages that are very useful for understanding the state of things, like 
compaction status, repair status, flushing, or the state of gossip in the system, 
but if they are all in the system.log they make said log file harder to look over 
for issues.  In 10241 the method chosen to keep these log messages around by 
default, but get them out of the system.log, was to change them from INFO to 
DEBUG and create the new debug.log.

From the discussion here it seems that many would like to change how this works.
Rather than just turning off the debug.log, I would propose that we switch to
using the SLF4J marker[1] facility to move the messages back to INFO but tag them
as belonging to the asynchronous_system.log rather than the normal system.log.

[1] https://logback.qos.ch/manual/layouts.html#marker 

https://www.slf4j.org/faq.html#fatal 
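
As a minimal sketch of what that could look like at a call site (the class and
marker names are made up, and the logback side that routes the marker to a separate
asynchronously written file is only described in the comment):

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.Marker;
import org.slf4j.MarkerFactory;

public class AsyncSystemLogExample
{
    private static final Logger logger = LoggerFactory.getLogger(AsyncSystemLogExample.class);
    private static final Marker ASYNC_SYSTEM = MarkerFactory.getMarker("ASYNC_SYSTEM");

    public void logFlush(String table, long bytes)
    {
        // Still INFO, so nothing is lost by default, but a logback filter/appender
        // matching the marker could send this line to asynchronous_system.log
        // instead of the synchronous system.log.
        logger.info(ASYNC_SYSTEM, "Flushed {} ({} bytes)", table, bytes);
    }
}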


> On Mar 19, 2018, at 9:01 AM, Stefan Podkowinski  wrote:
> 
> I'd agree that INFO should be the default. Turning on the DEBUG logging
> can cause notable performance issues and I would not enable it on
> production systems unless I really have to. That's why I created 
> CASSANDRA-12696 for 4.0, so you'll be able to at least only partially
> enable DEBUG based on what's relevant to look at, e.g. `nodetool
> setlogginglevel bootstrap DEBUG`.
> 
> But small improvements like that won't change the fact that log files
> suck in general for more complex analysis, except for trivial tailing
> and grepping. You have to make sure that logging is enabled and old
> records you're interested in will not be rotated out. Then you have to
> gather log files from individual nodes somehow. Eventually I end up with
> a local tarball with logs in that situation and the fun starts creating
> hacky, regex loaded Python scripts to parse them. As each log message is
> limited to a single line of text, it's often missing out relevant
> details. You also got to create different parsers for different messages
> of course. It's just inefficient and too time consuming to gather
> information that way. Let alone implementing more advanced monitoring
> solutions on top of that.
> 
> That's exactly why I started working on the "diagnostic events"
> (CASSANDRA-12944) idea more than a year ago. There's also support for
> persistency (CASSANDRA-13460) that would implement storing important but
> infrequent events as rich json objects in a local keyspace and allows
> retrieving them by using CQL. I still like the idea and think it's worth
> pursuing.
> 
> 
> On 19.03.18 09:53, Alain RODRIGUEZ wrote:
>> Hello,
>> 
>> I am not developing Cassandra, but I am using it actively and helping
>> people to work with it. My perspective might be missing some code
>> considerations and history as I did not go through the ticket where this
>> 'debug' level was added by default. But here is a feedback after upgrading
>> a few clusters to Cassandra 2.2:
>> 
>> When upgrading a cluster to Cassandra 2.2, 'disable the debug logs' is in
>> my runbook. I mean, very often, when some cluster is upgraded to Cassandra
>> 2.2 and has problems with performances, the 2 most frequent issues are:
>> 
>> - DEBUG level being turned on
>> - and / or dynamic snitching being enabled
>> 
>> This is especially true for high percentile (very clear on p99). Let's put
>> the dynamic snitch aside as it is not our topic here.
>> 
>> From an operational perspective, I prefer to set the debug level to 'DEBUG'
>> when I need it than having, out of the box, something that is unexpected
>> and impact performances. Plus the debug level can be changed without
>> restarting the node, through 'JMX' or even using 'nodetool' now.
>> 
>> Also in most cases, the 'INFO' level is good enough for me to detect most
>> of the issues. I was even able to recreate a detailed history of events for
>> a customer recently, 'INFO' logs are already very powerful and complete I
>> believe (nice work on this by the way). Then monitoring is helping a lot
>> too. I did not have to use debug logs for a long time. It might happen, but
>> I will find my way to enable them.
>> 
>> Even though it feels great to be able to help people with that easily
>> because the cause is often the same and turning off the logs is a
>> low hanging fruit in C*2.2 clusters that have very nice results and is easy
>> to achieve, I would prefer people not to fall into these performances traps
>> in the first place. In my head, 'Debug' logs should be for debug purposes
>> (by opposition to 'always on'). It seems legit. I am surprise

Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-20 Thread Jeremiah D Jordan
> Stefan's elastic search link is rather interesting. Looks like they are
> compiling for both a LTS version as well as the current OpenJDK. They
> assume some of their users will stick to a LTS version and some will run
> the current version of OpenJDK.
> 
> While it's extra work to add JDK version as yet another matrix variable in
> addition to our branching, is that something we should consider? Or are we
> going to burden maintainers even more? Do we have a choice? Note: I think
> this is similar to what Jeremiah proposed.

Yes, this is basically what I was proposing for trunk.





Re: Paying off tech debt and correctly naming things

2018-03-21 Thread Jeremiah D Jordan
+1 if you are willing to take it on.  As the person who performed the 
Table->Keyspace rename of 2.0, I say good luck!  With the hindsight of doing that, 
as others suggested, I would come at this in multiple tickets.
I would suggest a simple class rename with the IntelliJ refactoring tools or 
something similar as the first ticket.  This is going to touch the most files at 
once, but will be mechanical, and for the most part if it compiles it was right :).
After you have done that you can take on other renamings with a smaller scope.
Also, as others have said, the main thing to be wary of is the naming of things 
in JMX metrics.  Ideally we would keep around deprecated aliases of the old JMX 
names for a release before removing them.  The other thing is to watch out for 
class names in byteman scripts in dtests.
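
On the JMX point, a rough sketch of keeping a deprecated alias around (with made-up
ObjectNames, and not how the in-tree metrics registration is actually wired):

import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.StandardMBean;

public final class DeprecatedJmxAlias
{
    private DeprecatedJmxAlias() {}

    // Register the bean under its new name plus a wrapper under the old name, so
    // existing dashboards and tools keep working for one deprecation release.
    public static <T> void registerWithAlias(T impl, Class<T> mbeanInterface) throws Exception
    {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName newName = new ObjectName("org.apache.cassandra.db:type=TableStore,keyspace=ks,table=t");
        ObjectName oldName = new ObjectName("org.apache.cassandra.db:type=ColumnFamilyStore,keyspace=ks,columnfamily=t");

        server.registerMBean(new StandardMBean(impl, mbeanInterface), newName);
        // JMX will not register the exact same MBean instance twice, so the
        // deprecated alias gets its own thin wrapper around the same implementation.
        server.registerMBean(new StandardMBean(impl, mbeanInterface), oldName);
    }
}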

-Jeremiah

> On Mar 21, 2018, at 4:48 AM, Sylvain Lebresne  wrote:
> 
> I really don't think anyone has been recently against such renaming, and in
> fact, a _lot_ of renaming *has* already happen over time. The problem, as
> you carefully noted, is that it's such a big task that there is still a lot
> to do. Anyway, I've yet to see a patch renaming things to match the CQL
> naming scheme be rejected, so I'd personally encourage such submission. But
> maybe with a few caveats (already mentioned largely, so repeating here to
> signify my personal agreement with them):
> - renaming with large surface area can be painful for ongoing patches or
> even future merge. That's not a reason for not doing them, but that's imo a
> good enough reason to do things incrementally/in as-small-as-reasonable
> steps. Making sure a renaming commit only does renaming and doesn't change
> the logic is also pretty nice when you rebase such things.
> - breaking hundreds of tests is obviously not ok :)
> - pure code renaming is one reasonably simple aspect, but quite a few
> renaming may have user visible impact. Particularly around JMX where many
> things are named based on their class, and to a lesser extent some of our
> tools still use "old" naming. We can't and shouldn't ignore those impacts:
> such user visible changes should imo be documented, and we should make sure
> we have a reasonably painless (and thus incremental) upgrade path. My hunch
> is the latter isn't as simple as it seems.
> 
> 
> --
> Sylvain
> 
> 
> On Wed, Mar 21, 2018 at 9:06 AM kurt greaves  wrote:
> 
>> As someone who came to the codebase post CQL but prior to thrift being
>> removed, +1 to refactor. The current mixing of terminology is a complete
>> nightmare. This would also give a good opportunity document a lot of code
>> that simply isn't documented (or incorrect). I'd say it's worth doing it in
>> multiple steps though, such as refactor of a single class at a time, then
>> followed by refactor of variable names. We've already done one pretty big
>> refactor (InetAddressAndPort) for 4.0, I don't see how a few more could
>> make it any worse (lol).
>> 
>> Row vs partition vs key vs PK is killing me
>> 
>> On 20 March 2018 at 22:04, Jon Haddad  wrote:
>> 
>>> Whenever I hop around in the codebase, one thing that always manages to
>>> slow me down is needing to understand the context of the variable names
>>> that I’m looking at.  We’ve now removed thrift the transport, but the
>>> variables, classes and comments still remain.  Personally, I’d like to go
>>> in and pay off as much technical debt as possible by refactoring the code
>>> to be as close to CQL as possible.  Rows should be rows, not partitions,
>>> I’d love to see the term column family removed forever in favor of always
>>> using tables.  That said, it’s a big task.  I did a quick refactor in a
>>> branch, simply changing the ColumnFamilyStore class to TableStore, and
>>> pushed it up to GitHub. [1]
>>> 
>>> Didn’t click on the link?  That’s ok.  The TL;DR is that it’s almost 2K
>>> LOC changed across 275 files.  I’ll note that my branch doesn’t change
>> any
>>> of the almost 1000 search results of “columnfamilystore” found in the
>>> codebase and hundreds of tests failed on my branch in CircleCI, so that
>> 2K
>>> LOC change would probably be quite a bit bigger.  There is, of course, a
>>> lot more than just renaming this one class, there’s thousands of variable
>>> names using any manner of “cf”, “cfs”, “columnfamily”, names plus
>> comments
>>> and who knows what else.  There’s lots of references in probably every
>> file
>>> that would have to get updated.
>>> 
>>> What are people’s thoughts on this?  We should be honest with ourselves
>>> and know this isn’t going to get any easier over time.  It’s only going
>> to
>>> get more confusing for new people to the project, and having to figure
>> out
>>> “what kind of row am i even looking at” is a waste of time.  There’s
>>> obviously a much bigger impact than just renaming a bunch of files,
>> there’s
>>> any number of patches and branches that would become outdated, plus
>> anyone
>>> pulling in Cassandra as a dependency would be af

Re: Submit enhancements via pull requests?

2013-12-05 Thread Jeremiah D Jordan
JIRA + patch or link to git branch

-Jeremiah

On Dec 5, 2013, at 9:44 AM, Brian O'Neill  wrote:

> 
> Sorry guys, it's been a while since I submitted a patch.
> 
> I see there are a number of outstanding pull requests:
> https://github.com/apache/cassandra/pulls
> 
> Are we able to submit enhancements via pull requests on github now?
> Or are we still using JIRA + patches?
> 
> (I have a very minor change to an error message that I'd like to get in
> there)
> 
> thanks,
> brian
> 
> ---
> Brian O'Neill
> Chief Architect
> Health Market Science
> The Science of Better Results
> 2700 Horizon Drive • King of Prussia, PA • 19406
> M: 215.588.6024 • @boneill42 •
> healthmarketscience.com
> 
> 
> 
> 
> 



Re: CQL unit tests vs dtests

2014-05-22 Thread Jeremiah D Jordan
The only thing I worry about here is that the unit tests don't come into the 
system the same way user queries will.  So we still need the system-level 
dtests.  So I don't think all CQL tests should be unit tests, but I am all for 
there being unit-level CQL tests.
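
For context, a unit-level CQL test in the style of the in-tree CQLTester harness
looks roughly like the sketch below; the helper method names are from memory, so
double-check them against the class itself.

import org.junit.Test;
import org.apache.cassandra.cql3.CQLTester;

public class SelectLimitTest extends CQLTester
{
    @Test
    public void testLimit() throws Throwable
    {
        createTable("CREATE TABLE %s (k int, c int, v int, PRIMARY KEY (k, c))");
        for (int c = 0; c < 10; c++)
            execute("INSERT INTO %s (k, c, v) VALUES (0, ?, ?)", c, c);

        // Exercises the internal query path directly, not the native protocol,
        // which is exactly the caveat above about still needing dtests.
        assertRows(execute("SELECT c FROM %s WHERE k = 0 LIMIT 2"),
                   row(0), row(1));
    }
}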

On May 22, 2014, at 10:58 AM, Sylvain Lebresne  wrote:

> On Wed, May 21, 2014 at 10:46 PM, Jonathan Ellis  wrote:
> 
>> I do think that CQL tests in general make more sense as unit tests,
>> but I'm not so anal that I'm going to insist on rewriting existing
>> ones.  But in theory, if I had an infinite army of interns, sure. I'd
>> have one of them do that. :)
>> 
>> But in the real world, compared to saying "we don't have any cql unit
>> tests, so we should always write them as dtests to be consistent" I
>> think having mixed unit + dtests is the lesser of evils.
>> 
> 
> Fair enough. I'll try to make CQL tests as unit tests from now on as much
> as possible (I can't promise a few misfire in the short term). Let's hope
> you
> find your infinite intern army someday.
> 
> --
> Sylvain



Re: [VOTE] Release Apache Cassandra 2.0.10

2014-08-08 Thread Jeremiah D Jordan
I'm -1 on this until we get CqlRecordReader fixed (which will also fix the 
newly added in 2.0.10 Pig CqlNativeStorage):
https://issues.apache.org/jira/browse/CASSANDRA-7725
https://issues.apache.org/jira/browse/CASSANDRA-7726

Without those two things anyone using CqlStorage previously (which was removed 
with the removal of CPRR) who updates to using CqlNativeStorage will have broken 
scripts unless they are very, very careful.


-Jeremiah

On Aug 8, 2014, at 5:03 AM, Sylvain Lebresne  wrote:

> I propose the following artifacts for release as 2.0.10.
> 
> sha1: cd37d07baf5394d9bac6763de4556249e9837bb0
> Git:
> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/2.0.10-tentative
> Artifacts:
> https://repository.apache.org/content/repositories/orgapachecassandra-1023/org/apache/cassandra/apache-cassandra/2.0.10/
> Staging repository:
> https://repository.apache.org/content/repositories/orgapachecassandra-1023/
> 
> The artifacts as well as the debian package are also available here:
> http://people.apache.org/~slebresne/
> 
> The vote will be open for 72 hours (longer if needed).
> 
> [1]: http://goo.gl/xzb9ky (CHANGES.txt)
> [2]: http://goo.gl/nBI37B (NEWS.txt)



Re: [DISCUSS] Semantic versioning after 4.0

2021-04-30 Thread Jeremiah D Jordan
+1 for doing this (or something similar).  It will give more clarity to 
downstream users about the compatibility of a given release.

-Jeremiah

> On Apr 30, 2021, at 12:45 PM, Mick Semb Wever  wrote:
> 
> *** Proposal ***
> Aligned to the agreed-upon annual cadence of supported releases, let's
> use semantic versioning for better ecosystem operatibility, and to
> promote API awareness and compatibility support from documentation to
> tests.
> 
> 
> *** Background ***
> The recent¹ dev ML thread 'Releases after 4.0' landed on an annual
> release cadence, and for promoting an always shippable trunk (repeated
> again in the roadmap thread²).
> 
> A digression that occurred in the thread was around the use of
> semantic versioning, and the possible role of properly using major and
> minor versions within the annual release cycle. This proposal is an
> attempt to take those points of view and build them on everything else
> we have agreed upon so far.
> 
> 
> *** Ecosystem Operability ***
> The Cassandra codebase has an ecosystem around it. From downstream
> projects to vendors providing support for versions to managed DBaaS.
> 
> We can help them out with semver, and by providing unreleased minor
> versions through the year. Unreleased means we don’t do a formal
> Apache release approval, we just bump the version in `build.xml`.
> Downstream projects face overhead when either trying to keep up with
> trunk through each annual development cycle, or trying to rebase
> against a whole year's worth of development once each year.
> Unreleased versions will provide safe points for the ecosystem to plug
> into and keep up with. Vendors are also free to support and provide
> hot-fixes and back ports on these unreleased versions, outside of the
> community's efforts or concerns. And of course semver provides a lot
> of value to downstream codebases.
> 
> 
> *** API and Compatibility Awareness ***
> The idea here is to provide awareness and improved documentation to
> our APIs, their audience, and to what compatibility is required on
> them. Personally, I still struggle getting my head around all the ways
> Cassandra can break its APIs and what to think about and to test when
> coding.
> 
> This is important for ensuring availability during upgrades
> (mix-version clusters), and again important if we want to introduce
> data-safe downgrades. This stuff doesn't get (battle-) tested enough.
> The native protocol bump to v6 was an example for the need to be
> better at documenting and testing what's involved (across the
> ecosystem).
> 
> The consequences of breaking compatibility range from documentation,
> and tests, to mixed versioned clusters, upgrade and rollback
> operations. Semantic versioning is a way of foreseeing and preparing
> for such changes. In practice this can be done
>  a) using different fixVersions in jira ticket, and
>  b) lazy-incrementing the major version in trunk when the first
> breaking change lands in the development cycle.
> 
> For example, we enter the next development cycle with Jira fixVersions
> of "4.X" and "5.X", and an initial trunk version of "4.1". Then when a
> committer merges the first "5.X" ticket they bump trunk's version up
> to "5.0".
> 
> This approach incentivises patches to be aware and to better document
> the breakage, and comes with the added benefit for the ecosystem of
> identifying where in the development cycle the compatibility first
> broke.
> 
> Some examples of compatibility areas are CQL, Native Protocol, gossip,
> JMX, Metrics, Virtual Tables, SSTable, CDC, Commitlog, FQL, and
> Auditing. Many of these don't have enough documentation of how they
> are versioned and compatibility. As we add pluggability (i.e. SPIs)
> both the need to document this, and to be closer with the ecosystem
> increases.
> 
> 
> *** Example for 2021-2022 ***
> Illustrating this in action, with a cadence of a minor version every quarter,
> 
> - today, we branch `cassandra-4.0` and increment trunk to 4.1
> - commits roll into trunk, no "5.X" tickets have landed yet,
> - in July we increment the version to 4.2, no release is made or announced,
> - commits continue to roll into trunk, still no "5.X" tickets have landed yet,
> - in October we increment the version to 4.3
> - commits continue to roll into trunk, a "5.X" patch lands, trunk is
> incremented to 5.0
> - in January 2022 we increment the value to 5.1, no release is made or
> announced,
> - commits continue to roll into trunk,
> - in April 2022 we formally release 5.1 and branch `cassandra-5.1`
> 
> 
> The cadence of those minor versions could be anything, quarterly,
> monthly or on-demand. This practice will force us to organise and
> automate dealing with version changes, creating our release branches,
> organising our test upgrade version paths. I'm gathering that process
> currently in CASSANDRA-16642.
> 
> Jeremiah originally (and in more depth) illustrated this here:
> https://lists.apache.org/thread.html/r9b53342e6992cf98e8

Re: [DISCUSSION] Should we mark DROP COMPACT STORAGE as experimental

2021-06-07 Thread Jeremiah D Jordan
We had many discussions around this back when this was added.  There is a 
transition mechanism in place.  Users can set a native protocol flag to have the 
server return results as if DROP COMPACT STORAGE had already been run.  In this 
way you can update your applications to support the new way results are returned 
before the change has been made server-side.  In the face of multiple 
applications you can update them one at a time, switching each over with the 
protocol flag.  Once all of your applications are updated and running with the 
protocol flag set, so they now deal with how data is returned once DROP COMPACT 
STORAGE has been run, you can then finally run DROP COMPACT STORAGE on the 
server itself to update the schema.

This is not the case where someone needs to run DROP COMPACT STORAGE and then 
deal with the fallout.
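
For illustration, here is roughly what opting a client in looks like with the
DataStax Java driver 3.x. The withNoCompact() builder method is my recollection of
how the driver exposes the NO_COMPACT startup option, so treat it (and the keyspace
and table names) as assumptions to verify against the driver documentation.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;

public class NoCompactClient
{
    public static void main(String[] args)
    {
        try (Cluster cluster = Cluster.builder()
                                      .addContactPoint("127.0.0.1")
                                      .withNoCompact() // assumed API: asks for NO_COMPACT at STARTUP
                                      .build();
             Session session = cluster.connect("ks"))
        {
            // This client now sees results as if DROP COMPACT STORAGE had been run,
            // so it can be updated and verified app by app before the schema change.
            ResultSet rs = session.execute("SELECT * FROM legacy_compact_table LIMIT 10");
            rs.forEach(row -> System.out.println(row));
        }
    }
}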

-Jeremiah

> On Jun 7, 2021, at 3:06 AM, Oleksandr Petrov  
> wrote:
> 
> Thank you for bringing this subject up.
> 
>> not ready for production use unless users fully understand what they are
> doing.
> 
> Thing is, there's no easy way around dropping compact storage.  At the
> moment of writing of 10857, we have collectively decided that we'll
> document that the new columns are going to be shown, and have added a
> client protocol option that would hide/show columns depending on the mode
> we're running it in for anyone who upgrades. This makes it harder to make
> a transition for anyone who controls only the server side, since you have
> to account for how clients would behave whenever they see a new column. We
> did try to patch around the shown columns, but because of ColumnFilter this
> also turned out to be non-trivial, or at least not worth the effort for the
> moment.
> 
> One of the things mentioned in this list (primary key liveness) is also
> existing as a difference between UPDATE and INSERT, but I'm not sure if
> it's properly documented. Similar to some other nuances, such as nulls in
> clustering keys on partitions that only have a static row. We did recently
> discuss some of these not-commonly-known cases with Benjamin and some other
> folks. So it might be worth documenting those, too.
> 
> Problem with compact storage is that very few people want to touch it, and
> it requires a non-trivial amount of "institutional" knowledge and
> remembering things about Thrift. I think it's OK to mark the feature as
> experimental, but remembering how we haven't made significant improvements
> to things we have previously marked as experimental, this one may not
> materialise into something final, too.
> 
> What would a complete, non-foot-gun solution for dropping compact storage
> entail? If we're talking about avoiding showing columns to users, there are
> ways to achieve this without rewriting sstables, for example by introducing
> "hidden" columns in table metadata. However, if we want to preserve deletion
> semantics, I'm not sure we're doing it right at all: we would just trade one
> source of difference in partition liveness for insert queries for another. So
> I'd say that, by executing the ALTER TABLE statement, you're accepting that
> after it propagates there will be at least some difference in behaviour and
> semantics. We did discuss this in C-16069, and my thesis back then was that
> replacing special-casing for compact tables with special-casing for tables
> that "used to be compact" doesn't bring us closer to the final solution.
> 
> To summarise, I don't mind if we mark this feature experimental, but if we
> ever want to make it complete, we have to discuss what we do with each of the
> special cases. It may very well be that we just need to add explicit hidden
> columns to metadata, allow nulls for clusterings, and make maybe several more
> small changes. Unless we define what it would take to get this feature out of
> the experimental state, and actually make an effort to resolve these issues,
> I'd just put a huge warning on it and call it a power-user feature.
> 
> 
> On Fri, Jun 4, 2021 at 5:01 PM Joshua McKenzie  wrote:
> 
>>> 
>>> not ready for production use unless users fully understand what they are
>>> doing.
>> 
>> This statement stood out to me - in my opinion we should think carefully
>> about the surface area of the user interfaces on new features before we add
>> more cognitive burden to our users. We already have plenty of "foot-guns"
>> in the project and should only add more if absolutely necessary.
>> 
>> Further, marking this as experimental would be another feature we've
>> released and then retroactively marked as experimental; that's a habit we
>> should not get into.
>> 
>> On balance, my .02 is that the benefits to our end users and operators of
>> getting 4.0 to GA outweigh the costs of flagging this as experimental now,
>> so I'm a +1 to the flagging idea, but I think there are some valuable lessons
>> for us to learn in retrospect, not just from this feature but from others
>> like it in the past.
>> 
>> Curious to hear Alex's thoughts about this situation.

Re: [DISCUSS] CEP-10: Cluster and Code Simulations

2021-07-13 Thread Jeremiah D Jordan
I tend to agree with Paulo that a major refactoring of some internal interfaces 
sounds like something to be explicitly avoided in a patch release.  I thought 
this was the type of change we all agreed we should stop letting into patch 
releases, and that we would attempt to release more often (once a year) so that 
changes that only go to trunk would get out faster?  Do we really want to break 
that promise to ourselves before we even release 4.0?  To me, “I think we need 
this feature released faster” is not a reason to put it in 4.0; it could be a 
reason to release 4.1 sooner.  This is where having a releasable trunk helps: if 
we decided as a project that some change was worth releasing a new major early, 
the effort of doing that release is much smaller when trunk is releasable.

Any fix we make in 4.0 would be merged forward into trunk and could be fully 
verified there?  Probably not ideal, but it would give more confidence in a fix 
than we would otherwise have, without adding other major changes to 4.0?

-Jeremiah

> On Jul 13, 2021, at 7:59 AM, Benjamin Lerer  wrote:
> 
>> 
>> Furthermore, we introduced a significant performance regression in all
>> lines of the software by increasing the number of LWT round-trips. Unless
>> we intend to leave this regression for a further year without _any_ release
>> offering a solution, we will need suitable verification mechanisms for
>> whatever fixes we deliver.
>> 
>> My view is that it is unacceptable to leave such a significant regression
>> unaddressed in all lines of software we intend to release for the
>> foreseeable future.
> 
> 
> I would like to expand a bit on this as I believe it might be important for
> people to have the full picture. The fix for CASSANDRA-12126 introduced a
> regression by increasing the number of LWT round-trips. Nevertheless, the
> patch introduced a flag to allow users to revert to the previous behavior
> (previous performance + consistency issue).
> 
> Also the patch did not address all paxos consistency issues. There are
> still some issues during topology changes (and maybe in some other scenarios).
> 
> My understanding of Benedict's proposal is to fix paxos once and for all
> without any performance regression.
> 
> That goal makes total sense to me. "Where do we do that?" is a more tricky
> question.
> 
> On Tue, Jul 13, 2021 at 14:46, bened...@apache.org  wrote:
> 
>> Hmm. It occurs to me I’m not entirely sure how our new release process is
>> going to work.
>> 
>> Will we be releasing 4.1 builds immediately, as part of shippable trunk?
>> Or will 4.0 be our only active line of software for the next year?
>> 
>> Either way, I bet my bottom dollar there will come some regret if we
>> introduce such divergence between the two most active branches we maintain,
>> so early in their lifecycles. If we invest significant resources in
>> improved testing using this framework (which I very much expect) then
>> branches that are not compatible will not benefit, likely reducing their
>> quality; and the risk of backports will increase, due to divergence.
>> 
>> Altogether, I think it would be a huge mistake. But if we will be shipping
>> releases soon that can fix these aforementioned regressions, I won’t
>> campaign for it.
>> 
>> 
>> 
>> From: bened...@apache.org 
>> Date: Tuesday, 13 July 2021 at 13:31
>> To: dev@cassandra.apache.org 
>> Subject: Re: [DISCUSS] CEP-10: Cluster and Code Simulations
>> No change is without risk; we have introduced serious regressions with bug
>> fixes to patch releases. The overall risk to the release lifecycle is
>> reduced significantly in my opinion, as we reduce the likelihood of
>> introducing regressions, and can use the same test infrastructure across
>> all of the actively developed releases, increasing our confidence in 4.0.x
>> releases.
>> 
>> Furthermore, we introduced a significant performance regression in all
>> lines of the software by increasing the number of LWT round-trips. Unless
>> we intend to leave this regression for a further year without _any_ release
>> offering a solution, we will need suitable verification mechanisms for
>> whatever fixes we deliver.
>> 
>> My view is that it is unacceptable to leave such a significant regression
>> unaddressed in all lines of software we intend to release for the
>> foreseeable future.
>> 
>> 
>> From: Paulo Motta 
>> Date: Tuesday, 13 July 2021 at 13:21
>> To: Cassandra DEV 
>> Subject: Re: [DISCUSS] CEP-10: Cluster and Code Simulations
>>> No, in my opinion the target should be 4.0.x. We are reaching for a
>> shippable trunk and this has no public API impacts. This work is IMO
>> central to achieving a shippable trunk, either way. The only reason I do
>> not target 3.x is that it would be too burdensome.
>> 
>> In my limited view of the proposal, a major refactor of internal
>> concurrency APIs to support the testing facility potentially risks the
>> stability of a minor rel

Re: [DISCUSS] CEP-10: Cluster and Code Simulations

2021-07-13 Thread Jeremiah D Jordan
I do not think fixing CASSANDRA-12126 is not a new feature.  I do think adding 
the ability to do “Cluster and Code Simulations” is a new feature.

-Jeremiah

> On Jul 13, 2021, at 8:37 AM, bened...@apache.org wrote:
> 
> Nothing we’re discussing constitutes a feature. We’re discussing stability 
> enhancements, and important bug fixes.
> 
> I think this disagreement is to some extent founded on our different premises 
> about what a patch release should contain, and this seems to be the fault of 
> incompletely specified documentation.
> 
> 1. The release lifecycle only forbids feature work from being developed in a 
> patch release, and only expressly includes bug fixes. Note that the document 
> even has a comment by the author suggesting that features may be backported 
> to a patch release from trunk (not something I agree with, but it 
> demonstrates the ambiguity of the definition).
> 2. There seems to be some conflation of size-of-change with the admissibility 
> wrt release lifecycle – I don’t think there’s any criteria here, and it’s 
> open to the community’s case-by-case assessment. Whatever we do to fix the 
> bug in question will necessarily be a very significant piece of work itself, 
> for instance.
> 
> My interpretation of the release lifecycle document is that it is acceptable 
> to include this work in a patch release. My belief about its impact is that 
> it would contribute positively to the stability of the project’s 4.0 releases 
> over the lifecycle, and improve project velocity.
> 
> With respect to whether we can ship a fix to 12126 without validation, I 
> would be strongly opposed to this, and certainly would not produce a patch 
> myself in this way. Not only would it be burdensome (given the divergences in 
> the codebase), but I would not consider it acceptably safe (given the 
> divergence).
> 
> 
> From: Jeremiah D Jordan 
> Date: Tuesday, 13 July 2021 at 14:15
> To: Cassandra DEV 
> Subject: Re: [DISCUSS] CEP-10: Cluster and Code Simulations
> I tend to agree with Paulo that a major refactoring of some internal 
> interfaces sounds like something to be explicitly avoided in a patch release. 
>  I thought this was the type of change we all agreed we should stop letting 
> into patch releases, and that we would attempt to release more often (once a 
> year) so changes that only go to trunk would get out faster?  Are we really 
> wanting to break that promise to ourselves before we even release 4.0?  To me 
> “I think we need this feature released faster” is not a reason to put it in 
> 4.0, it could be a reason to release 4.1 sooner.  This is where having a 
> releasable trunk helps, as if we decided as a project that some change was 
> worth a new major being released early the effort of doing that release is 
> much smaller when trunk is releasable.
> 
> Any fix we make in 4.0 would be merged forward into trunk and could be fully 
> verified there?  Probably not the best, but would give more confidence in a 
> fix than otherwise without adding other major changes to 4.0?
> 
> -Jeremiah
> 
>> On Jul 13, 2021, at 7:59 AM, Benjamin Lerer  wrote:
>> 
>>> 
>>> Furthermore, we introduced a significant performance regression in all
>>> lines of the software by increasing the number of LWT round-trips. Unless
>>> we intend to leave this regression for a further year without _any_ release
>>> offering a solution, we will need suitable verification mechanisms for
>>> whatever fixes we deliver.
>>> 
>>> My view is that it is unacceptable to leave such a significant regression
>>> unaddressed in all lines of software we intend to release for the
>>> foreseeable future.
>> 
>> 
>> I would like to expand a bit on this as I believe it might be important for
>> people to have the full picture. The fix for  CASSANDRA-12126
>> <https://issues.apache.org/jira/browse/CASSANDRA-12126> introduced a
>> regression by increasing the number of LWT round-trips. Nevertheless, the
>> patch introduced a flag to allow users to revert to the previous behavior
>> (previous performance + consistency issue).
>> 
>> Also the patch did not address all paxos consistency issues. There are
>> still some issues during topology changes (and maybe in some other scenarios).
>> 
>> My understanding of Benedict's proposal is to fix paxos once and for all
>> without any performance regression.
>> 
>> That goal makes total sense to me. "Where do we do that?" is a more tricky
>> question.
>> 
>> On Tue, Jul 13, 2021 at 14:46, bened...@apache.org  wrote:
>> 
>>> Hmm. It occurs to me I’m not e

Re: [DISCUSS] CEP-10: Cluster and Code Simulations

2021-07-13 Thread Jeremiah D Jordan
Too many nots.  I do not think fixing 12126 is a new feature.

> On Jul 13, 2021, at 8:40 AM, Jeremiah D Jordan  wrote:
> 
> I do not think fixing CASSANDRA-12126 is not a new feature.  I do think 
> adding the ability to do “Cluster and Code Simulations” is a new feature.
> 
> -Jeremiah
> 
>> On Jul 13, 2021, at 8:37 AM, bened...@apache.org wrote:
>> 
>> Nothing we’re discussing constitutes a feature. We’re discussing stability 
>> enhancements, and important bug fixes.
>> 
>> I think this disagreement is to some extent founded on our different 
>> premises about what a patch release should contain, and this seems to be the 
>> fault of incompletely specified documentation.
>> 
>> 1. The release lifecycle only forbids feature work from being developed in a 
>> patch release, and only expressly includes bug fixes. Note that the 
>> document even has a comment by the author suggesting that features may be 
>> backported to a patch release from trunk (not something I agree with, but it 
>> demonstrates the ambiguity of the definition).
>> 2. There seems to be some conflation of size-of-change with the 
>> admissibility wrt release lifecycle – I don’t think there’s any criteria 
>> here, and it’s open to the community’s case-by-case assessment. Whatever we 
>> do to fix the bug in question will necessarily be a very significant piece 
>> of work itself, for instance.
>> 
>> My interpretation of the release lifecycle document is that it is acceptable 
>> to include this work in a patch release. My belief about its impact is that 
>> it would contribute positively to the stability of the project’s 4.0 
>> releases over the lifecycle, and improve project velocity.
>> 
>> With respect to whether we can ship a fix to 12126 without validation, I 
>> would be strongly opposed to this, and certainly would not produce a patch 
>> myself in this way. Not only would it be burdensome (given the divergences 
>> in the codebase), but I would not consider it acceptably safe (given the 
>> divergence).
>> 
>> 
>> From: Jeremiah D Jordan 
>> Date: Tuesday, 13 July 2021 at 14:15
>> To: Cassandra DEV 
>> Subject: Re: [DISCUSS] CEP-10: Cluster and Code Simulations
>> I tend to agree with Paulo that a major refactoring of some internal 
>> interfaces sounds like something to be explicitly avoided in a patch 
>> release.  I thought this was the type of change we all agreed we should stop 
>> letting into patch releases, and that we would attempt to release more 
>> often (once a year) so changes that only go to trunk would get out faster?  
>> Are we really wanting to break that promise to ourselves before we even 
>> release 4.0?  To me “I think we need this feature released faster” is not a 
>> reason to put it in 4.0, it could be a reason to release 4.1 sooner.  This 
>> is where having a releasable trunk helps, as if we decided as a project that 
>> some change was worth a new major being released early the effort of doing 
>> that release is much smaller when trunk is releasable.
>> 
>> Any fix we make in 4.0 would be merged forward into trunk and could be fully 
>> verified there?  Probably not the best, but would give more confidence in a 
>> fix than otherwise without adding other major changes to 4.0?
>> 
>> -Jeremiah
>> 
>>> On Jul 13, 2021, at 7:59 AM, Benjamin Lerer  wrote:
>>> 
>>>> 
>>>> Furthermore, we introduced a significant performance regression in all
>>>> lines of the software by increasing the number of LWT round-trips. Unless
>>>> we intend to leave this regression for a further year without _any_ release
>>>> offering a solution, we will need suitable verification mechanisms for
>>>> whatever fixes we deliver.
>>>> 
>>>> My view is that it is unacceptable to leave such a significant regression
>>>> unaddressed in all lines of software we intend to release for the
>>>> foreseeable future.
>>> 
>>> 
>>> I would like to expand a bit on this as I believe it might be important for
>>> people to have the full picture. The fix for  CASSANDRA-12126
>>> <https://issues.apache.org/jira/browse/CASSANDRA-12126> introduced a
>>> regression by increasing the number of LWT round-trips. Nevertheless, the
>>> patch introduced a flag to allow users to revert to the previous behavior
>>> (previous performance + consistency issue).
>>> 
>>> Also the patch did not address all paxos consistency issues. There are
>>> still some issues during topolog
