Re: [DISCUSSION] New dependencies for SAI CEP-7

2022-12-13 Thread Josh McKenzie
Whatever we decide on, let's make sure we document it so newcomers on the project (or really anyone new to property based testing) can better discover those things. https://cassandra.apache.org/_/development/testing.html On Tue, Dec 13, 2022, at 1:08 PM, David Capwell wrote: > Speaking to Caleb

Re: [VOTE] CEP-25: Trie-indexed SSTable format

2022-12-19 Thread Josh McKenzie
+1 On Mon, Dec 19, 2022, at 11:54 AM, SAURABH VERMA wrote: > +1 > > On Mon, Dec 19, 2022 at 9:36 PM Benjamin Lerer wrote: >> +1 >> >> Le lun. 19 déc. 2022 à 16:31, Andrés de la Peña a >> écrit : >>> +1 >>> >>> On Mon, 19 Dec 2022 at 15:11, Aleksey Yeshchenko wrote: +1 > On 19

Cassandra project status, Year in Review Holiday Edition

2022-12-19 Thread Josh McKenzie
As you may have heard, Cassandra-4.1 is GA! Congrats to everyone that worked hard to get this release out the door. I'm certain users of Cassandra are going to appreciate the new functionality in the release combined with the robust testing and validation we've all done on this so keep the high

[DISCUSS] Taking another(other(other)) stab at performance testing

2022-12-30 Thread Josh McKenzie
There was a really interesting presentation from the Lucene folks at ApacheCon about how they're doing perf regression testing. That combined with some recent contributors wanting to get involved on some performance work and not having much direction or clarity on how to get involved led some of

Re: [EXTERNAL] [DISCUSS] Taking another(other(other)) stab at performance testing

2023-01-03 Thread Josh McKenzie
test should be checked in as well - also we > should encourage people not to change too much from the reference test. > Different hardware, different cassandra.yaml, and different tests will just > create numbers which are hard to make sense of. > > Really excited about this - tha

Re: Cassandra CI Status 2023-01-07

2023-01-10 Thread Josh McKenzie
> I don't believe it warrants a CEP, speak up if you disagree. I agree with this but I'm also biased having been working w/you on this for a bit. My instinct is that most folks on the project want CI that works consistently, quickly, and is minimally complex to modify. So the less disruptive an

Re: Should we change 4.1 to G1 and offheap_objects ?

2023-01-12 Thread Josh McKenzie
Potential compromise: We change it in trunk, and we NEWS.txt in the minor about that change in trunk, why, and recommend users consider qualifying the same change on their 4.1 release. In case it's not clear from me: +1 to changing on trunk for 5.0 here -1 to changing on minor release given how

Re: Intra-project dependencies

2023-01-16 Thread Josh McKenzie
> - permanence from a git SHA no longer exists With the caveat that I haven't worked w/submodules before and only know about them from a cursory search, it looks like git-submodule status would show us the sha for submodules and we could have parent projects reference specific shas to pull for

Re: Merging CEP-15 to trunk

2023-01-16 Thread Josh McKenzie
Did we document this or is it in an email thread somewhere? I don't see it on the confluence wiki nor does a cursory search of ponymail turn it up. What was it for something flagged experimental? 1. Same tests pass on the branch as to the root it's merging back to 2. 2 committers eyes on (author

Re: Intra-project dependencies

2023-01-17 Thread Josh McKenzie
Is there any reason we couldn't "bundle" a release vote to include both an Accord release and ASF C* in one voting round as a combined release? My reading of the release process w/the ASF doesn't speak to that (if anything it implies this might be a valid approach): https://www.apache.org/legal

Re: Intra-project dependencies

2023-01-17 Thread Josh McKenzie
> Josh, bundling releases gets tricky in that you need to include the library > sources, because the cassandra release is essentially being voted on (because > it has been built) with non-released dependencies. Arguably, one shouldn't vote on a release of Accord unless there's something that's i

Re: Merging CEP-15 to trunk

2023-01-24 Thread Josh McKenzie
Zooming out a bit, I think Accord is the first large body of work we've done post introduction of the CEP system with multiple people collaborating on a feature branch like this. This discussion seems to have surfaced a few sentiments: 1. Some contributors seem to feel that work on a feature br

Re: Merging CEP-15 to trunk

2023-01-24 Thread Josh McKenzie
2 and 3, I certainly observe an assumption that contributors have > expected to review after a rebase. But I don't see this as a significant > topic to argue about. If indeed the rebase is as easy as Benedict advertised, > then we should just do the rebase because apparently it can

[ANNOUNCE] Evolving governance in the Cassandra Ecosystem

2023-01-26 Thread Josh McKenzie
The Cassandra PMC is pleased to announce that we're evolving our governance procedures to better foster subprojects under the Cassandra Ecosystem's umbrella. Astute observers among you may have noticed that the Cassandra Sidecar is already a subproject of Apache Cassandra as of CEP-1 (https://c

Cassandra project status, 2023-01-26

2023-01-26 Thread Josh McKenzie
After a bit of time away, I'm ready to regale you with tales of things you've already seen on the dev list and JIRA. ;) Let's start with calling out that registrations for the Cassandra Summit are open. Patrick did a better job than I ever could summarizing this in his email poetically titled "

Re: [ANNOUNCE] Evolving governance in the Cassandra Ecosystem

2023-01-27 Thread Josh McKenzie
your work on this! > > Supportive of the changes and grateful to have scaffolding in place to > accommodate current/incoming subprojects. > > – Scott > >> On Jan 26, 2023, at 1:21 PM, Josh McKenzie wrote: >> >> >> The Cassandra PMC is pleased to announce

Re: Merging CEP-15 to trunk

2023-01-27 Thread Josh McKenzie
een the case also for the Accord >>>>> branch at some point. After all, if it had been ready to merge to trunk >>>>> already a year ago, why wasn't it? It's kind of the point of using a >>>>> feature branch that the code in it is NOT ready

Re: [DISCUSSION] Framework for Internal Collection Exposure and Monitoring API Alignment

2023-01-28 Thread Josh McKenzie
First off - thanks so much for putting in this effort Maxim! This is excellent work. Some thoughts on the CEP and responses in thread: > *Considering that JMX is usually not used and disabled in production > environments for various performance and security reasons, the operator may > not see

Re: Merging CEP-15 to trunk

2023-01-31 Thread Josh McKenzie
> Don't we follow a principle of always shippable trunk? This was actually a > reason why I sidelined the talk about post-merge review, because it implies > that the code wasn't "good enough"/perfect when it was first merged. We follow a principle of "always shippable trunk according to circleci"

Re: Merging CEP-15 to trunk

2023-01-31 Thread Josh McKenzie
ng to suss out subtle timing issues at this time. So calling what we're doing on the ASF side "always shippable trunk" is leaving a lot of lift and toil up to folks working on these tags on their own infra that goes into each release. Which vexes me. On Tue, Jan 31, 2023, at 11:20 AM

Re: [DISCUSS] API modifications and when to raise a thread on the dev ML

2023-02-02 Thread Josh McKenzie
Things I think of as API's: 1. nodetool output (user tooling couples with this) 2. CQL syntax 3. JMX 4. VTables 5. Potential future refactored and deliberately exposed API interfaces (SSTables, custom indexes, etc) API's persist; I don't think lazy consensus to favor velocity is the right tradeo

Re: [DISCUSS] API modifications and when to raise a thread on the dev ML

2023-02-02 Thread Josh McKenzie
> if a patch adds, say, a single JMX method to expose the > metric, having an ML thread for it may seem redundant My fear is someone missing that there's an idiom or pattern within the codebase for metrics they miss then we end up with inconsistent metric names / groups exposed to users. Especia

Re: Welcome Patrick McFadin as Cassandra Committer

2023-02-02 Thread Josh McKenzie
Congrats Patrick! Well deserved. On Thu, Feb 2, 2023, at 5:25 PM, Molly Monroy wrote: > Congrats, Patrick... much deserved! > > On Thu, Feb 2, 2023 at 1:59 PM Derek Chen-Becker > wrote: >> Congrats! >> >> On Thu, Feb 2, 2023 at 10:58 AM Benjamin Lerer wrote: >>> The PMC members are pleased to

[DISCUSS] Merging incremental feature work

2023-02-03 Thread Josh McKenzie
The topic of how we handle merging large complex bodies of work came up recently with the CEP-15 merge and JDK17, and we've faced this question in the past as well (CASSANDRA-8099 comes to mind). The times we've done large bodies of work separately from trunk and then merged them in have their

Re: [DISCUSS] Merging incremental feature work

2023-02-03 Thread Josh McKenzie
streamline the consensus building portion of this work given our history with it. We haven't taken steps to optimize the tactical execution of it yet. On Fri, Feb 3, 2023, at 7:09 AM, Brandon Williams wrote: > On Fri, Feb 3, 2023 at 6:06 AM Josh McKenzie wrote: > > > > My curre

Re: [DISCUSS] Merging incremental feature work

2023-02-03 Thread Josh McKenzie
g to trunk until the work as > a whole is useable and meets all the existing bars for quality, review and > the like. > > >> On 3 Feb 2023, at 12:43, Josh McKenzie wrote: >> >> Anything we either a) have to do (JDK support) or b) have all agreed up >>

Re: [DISCUSS] Merging incremental feature work

2023-02-03 Thread Josh McKenzie
>> It helps to realize that the key difference between a big decision and a >> small one is whether you can fix your decision afterwards. *Any decision can >> be made small by just always making sure that if you were wrong (and you >> _will_ be wrong), you can always undo the d

Re: Implicitly enabling ALLOW FILTERING on virtual tables

2023-02-03 Thread Josh McKenzie
> they would start to set ALLOW FILTERING here and there in order to not think > twice about their data model so they can just call it a day. Setting this on a per-table basis or having users set this on specific queries that hit tables and forgetting they set it are 6 of one and half-a-dozen of

Re: [VOTE] CEP-21 Transactional Cluster Metadata

2023-02-06 Thread Josh McKenzie
+1 On Mon, Feb 6, 2023, at 2:53 PM, Dinesh Joshi wrote: > +1 > >> >> On Feb 6, 2023, at 8:16 AM, Sam Tunnicliffe wrote: >>  >> Hi everyone, >> >> I would like to start a vote on this CEP. >> >> Proposal: >> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-21%3A+Transactional+Cluster

Re: [VOTE] Release Apache Cassandra 4.0.8

2023-02-13 Thread Josh McKenzie
+1 On Fri, Feb 10, 2023, at 3:13 AM, Tommy Stendahl via dev wrote: > +1 (nb) > > -Original Message- > *From*: Berenguer Blasi > > *Reply-To*: dev@cassandra.apache.org > *To*: dev@cassandra.apache.org > *Subject*: Re: [VOTE] Relea

Re: Downgradability

2023-02-22 Thread Josh McKenzie
> why not implement backwards write compatibility? +1 to this from a philosophical perspective. Keeping prior releases completely in the dark about new release sstable formats is a clean approach, and we should already have the code around to ser/deser the prior version's data on the next versio

Cassandra project status, 2023-03-02

2023-03-02 Thread Josh McKenzie
Trying out a monthly-ish cadence as traffic over the holiday season was a bit sparse. Let's see how this goes. Congratulations to Patrick McFadin on being made a committer on the project! So many of us on the project have benefited from your assistance, input, guidance, and efforts over the yea

Re: [EXTERNAL] Re: [DISCUSS] Next release date

2023-03-04 Thread Josh McKenzie
(for convenience sake, I'm referring to both Major and Minor semver releases as "major" in this email) > The big feature from our perspective for 5.0 is ACCORD (CEP-15) and I would > advocate to delay until this has sufficient quality to be in production. This approach can be pretty unpredictab

Re: [EXTERNAL] Re: [DISCUSS] Next release date

2023-03-09 Thread Josh McKenzie
Added an "Epics" quick filter; could help visualize what our high priority features are for given releases: https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484&quickFilter=2649 Our cumulative flow diagram of 5.0 related tickets is pretty large. Probably not a great indicator for

Re: [DISCUSS] Enhanced Disk Error Handling

2023-03-09 Thread Josh McKenzie
> Personally, I'd like to see the fix for this issue come after CEP-21. It > could be feasible to implement a fix before then, that detects bit-errors on > the read path and refuses to respond to the coordinator, implicitly having > speculative execution handle the retry against another replica

Re: [EXTERNAL] Re: [DISCUSS] Next release date

2023-03-09 Thread Josh McKenzie
> We do have the metadata, but yes it requires some work… My wording was poor; we have the *potential* to have this metadata, but to my knowledge we don't have a muscle of consistently setting this, or any kind of heuristic to determine when something should block a release or not. At least on 4

Re: [DISCUSS] Enhanced Disk Error Handling

2023-03-09 Thread Josh McKenzie
CEP-21 makes this sequencing safe, and provides abstractions to better expose > this information to operators. > > -- > Abe > >> On Mar 9, 2023, at 10:55 AM, Josh McKenzie wrote: >> >>> Personally, I'd like to see the fix for this issue come afte

Re: [DISCUSS] New dependencies with Chronicle-Queue update

2023-03-13 Thread Josh McKenzie
> I think we should we use the most recent versions of all libraries where > possible?” To clarify, are we talking "most recent versions of all libraries *when we have to update them anyway for a dependency*"? Not *all libraries all libraries*... If the former, I agree. If the latter, here be dr

Re: Should we cut some new releases?

2023-03-14 Thread Josh McKenzie
+1 On Tue, Mar 14, 2023, at 7:50 AM, Aleksey Yeshchenko wrote: > +1 > >> On 14 Mar 2023, at 05:50, Berenguer Blasi wrote: >> >> +1 >> >> On 13/3/23 21:25, Jacek Lewandowski wrote: >>> +1 >>> >>> pon., 13 mar 2023, 20:36 użytkownik Miklosovic, Stefan >>> napisał: Yes, I was waiting for

Re: [DISCUSS] Drop support for sstable formats m* (in trunk)

2023-03-14 Thread Josh McKenzie
It's always seemed a little odd to me that we drop all the "read old format" code given how little maintenance that code takes over time. The ability to have a C* node read older format SSTables into perpetuity *seems* like a pretty compelling usability feature to me (for some of the reasons men

Re: [DISCUSS] Change the useage of nodetool tablehistograms

2023-03-16 Thread Josh McKenzie
We could also consider augmenting the tool with new named arguments with the functionality you described and leave the positional usage intact. On Thu, Mar 16, 2023, at 6:43 AM, Bowen Song via dev wrote: > The documented command options are: > >> nodetool tablehistograms [ | ] >> > > > That

Re: [DISCUSS] CEP-26: Unified Compaction Strategy

2023-03-17 Thread Josh McKenzie
Could we get a JIRA for this too so we can get some reviewers collaborating on this? Only see Lorina's ticket for documenting it in JIRA atm. On Fri, Mar 17, 2023, at 9:53 AM, Branimir Lambov wrote: > The prototype of UCS can now be found in this pull request: > https://github.com/apache/cassand

Re: [VOTE] Release Apache Cassandra 4.1.1

2023-03-17 Thread Josh McKenzie
+1 On Fri, Mar 17, 2023, at 12:18 PM, Aleksey Yeshchenko wrote: > +1 > >> On 17 Mar 2023, at 13:54, Mick Semb Wever wrote: >> >>> The vote will be open for 72 hours (longer if needed). Everyone who has >>> tested the build is invited to vote. Votes by PMC members are considered >>> binding. A

Re: [DISCUSS] Drop support for sstable formats m* (in trunk)

2023-03-17 Thread Josh McKenzie
> we (including me) have done a lot of stupid shit over the years on this > project. Half the time “this is how we’ve historically done X” to me is a > strong argument to start doing things differently. Oof. The truth (when applied to myself) hurts doesn't it? :) > I suggest we should have a wa

Cassandra project status, 2023-03-20

2023-03-20 Thread Josh McKenzie
I did say monthly-ish. That goes both earlier and later. We've had a lot of interesting topics come up on the dev list in the past few weeks as well as movement on Accord, Transactional Metadata, and SAI, so let's get to it. The Cassandra Forward event took place on March 14th with a lot of int

Re: [DISCUSS] Change the useage of nodetool tablehistograms

2023-03-22 Thread Josh McKenzie
ms will be print out.// >> this is *another one of the old way* of using tablehistogram. >> >> So we add some more options like "-i", "-ks", "-tbs" , we can combine these >> options and we can also use any of them individually, besides, we can also >

Re: Welcome our next PMC Chair Josh McKenzie

2023-03-23 Thread Josh McKenzie
> wrote: >>> Congratulations, Josh! >>> >>> On Thu, Mar 23, 2023, 4:23 AM Mick Semb Wever wrote: >>>> It is time to pass the baton on, and on behalf of the Apache Cassandra >>>> Project Management Committee (PMC) I would like to welcome and

Apache TAC: assistance for travel to Berlin Buzzwords

2023-03-24 Thread Josh McKenzie
Cassandra Community! The Travel Assistance Committee with the Apache Foundation is supporting travel to Berlin Buzzwords 2023 (https://2023.berlinbuzzwords.de, 18-20 June 2023) for up to 6 people. This conference has lined up pretty well with our project in the past and would probably be a grea

Re: [DISCUSS] cep-15-accord, cep-21-tcm, and trunk

2023-03-24 Thread Josh McKenzie
> making sure that joining and leaving nodes update some state via Paxos > instead of via gossip What kind of a time delivery risk does coupling CEP-15 with CEP-21 introduce (i.e. unk-unk on CEP-21 leading to delay cascades to CEP-15)? Seems like having a table we CAS state for on epochs wouldn'

Re: [EXTERNAL] Re: [DISCUSS] Next release date

2023-03-24 Thread Josh McKenzie
> I would like to propose a partial freeze of 5.0 in June My .02: +1 to: * partial freeze on an agreed upon date w/agreed upon other things that can optionally go in after * setting a hard limit on when we ship from that frozen branch regardless of whether the features land or not -1 to: * ever

Re: [EXTERNAL] [DISCUSS] Next release date

2023-03-24 Thread Josh McKenzie
ime” really buys >>>>> us? I wouldn’t trust any multi-node QA done pre commit. >>>>> What “stabilizing” do we expect to be doing during this time? How much >>>>> of it do we just have to do again after those things merge? I for one do >>>>

Re: [DISCUSS] cep-15-accord, cep-21-tcm, and trunk

2023-03-24 Thread Josh McKenzie
rizable epochs in lieu of TCM? > > FWIW, I'd still rather just integrate w/ TCM ASAP, avoiding integration risk > while accepting the possible delivery risk. > > On Fri, Mar 24, 2023 at 9:32 AM Josh McKenzie wrote: >> __ >>> making sure that joining and l

Re: [DISCUSS] cep-15-accord, cep-21-tcm, and trunk

2023-03-24 Thread Josh McKenzie
is “fine”… > current world is hard to use and brittle (users need to tell accord that the > cluster changed), but if accord is rebasing on txn metadata then this won’t > be that way long (currently blocked from doing that due to txn metadata not > passing all tests yet). >

Re: [DISCUSS] CEP-28: Reading and Writing Cassandra Data with Spark Bulk Analytics

2023-03-26 Thread Josh McKenzie
I want to second what Yifan's spoken to, specifically in terms of resource isolation and availability. While the sidecar hasn't seen a ton of traffic and contributions since the acceptance into the project and clearance of CEP-1, my intuition is that that's due to the entrenched maturity of alt

Re: [EXTERNAL] Re: Cassandra CI Status 2023-01-07

2023-03-27 Thread Josh McKenzie
I'll take build lead for the next 2 weeks. On Sat, Mar 25, 2023, at 4:50 PM, Mick Semb Wever wrote: >> Here comes Cassandra CI status for 2023-3-13 - 2023-23-179 : >> >> *** CASSANDRA-18338 >> - dtest.bootstrap_test.TestBootstrap.test_cle

Re: [EXTERNAL] [DISCUSS] Next release date

2023-03-30 Thread Josh McKenzie
> questions/overhead. It could make sense to me to branch when CEP-21 >>> merges and only let in CEP-15 after that. CEP-15 is mostly “net new stuff” >>> and not “changes to existing stuff” from my understanding? So no QA effort >>> wasted if it is done before it me

Re: [EXTERNAL] [DISCUSS] Next release date

2023-04-01 Thread Josh McKenzie
> in practice we wait and receive bug reports from downstream testing efforts. > Such testing isn't necessarily possible pre-commit, e.g. third-party and not > feasible to continuously run, nor appropriate to upstream/open-source. > > We want GA releases to be production ready for any cluster at

Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-04 Thread Josh McKenzie
I think there's competing dynamics here. 1) KEYSPACE isn't that great of a name; it's not a space in which keys are necessarily unique, and you can't address things just by key w/out their respective tables 2) DATABASE isn't that great of a name either due to the aforementioned ambiguity. Some

Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-06 Thread Josh McKenzie
> KEYSPACE is fine. If we want to introduce a standard nomenclature like > DATABASE that’s also fine. Inventing brand new ones is not fine, there’s no > benefit. I'm with Benedict in principle, with Aleksey in practice; I think KEYSPACE and SCHEMA are actually fine enough. If and when we get to

Re: [VOTE] CEP-26: Unified Compaction Strategy

2023-04-06 Thread Josh McKenzie
+1 On Thu, Apr 6, 2023, at 12:18 PM, Joseph Lynch wrote: > +1 > > This proposal looks really exciting! > > -Joey > > On Wed, Apr 5, 2023 at 2:13 AM Aleksey Yeshchenko wrote: > > > > +1 > > > > On 4 Apr 2023, at 16:56, Ekaterina Dimitrova wrote: > > > > +1 > > > > On Tue, 4 Apr 2023 at 11:44,

Re: [VOTE] Release Apache Cassandra 4.0.9 - SECOND ATTEMPT

2023-04-13 Thread Josh McKenzie
+1 On Thu, Apr 13, 2023, at 3:17 AM, Benjamin Lerer wrote: > +1 > > Le jeu. 13 avr. 2023 à 08:56, Tommy Stendahl via dev > a écrit : >> +1 (nb) >> >> -Original Message- >> *From*: Brandon Williams > > >> *Reply-To*: dev@cassandra.apac

Re: (CVE only) support for 3,11 beyond published EOL

2023-04-13 Thread Josh McKenzie
> We already have an understanding and precedence in place that CVEs on > the previous unmaintained branch are addressed and released. Correct me if I'm wrong German, but the question I got from your email was effectively "If we consider formalizing our commitment to fixing CVE's on older branch

Re: [DISCUSS] Next release date

2023-04-16 Thread Josh McKenzie
> 2. When CEP-15 lands we cut alpha1, > 2a. The deadline is first week of October, anything not yet in > cassandra-5.0 is not in 5.0, > 2b. We expect a minimum two months of testing and beta+rc releases > to get to GA. To clarify, is the intent here to say "The deadline for cutoff is 1st we

Re: [EXTERNAL] Re: Cassandra CI Status 2023-01-07

2023-04-17 Thread Josh McKenzie
wn to ~ 6 failures right now. On Mon, Mar 27, 2023, at 12:27 PM, Josh McKenzie wrote: > I'll take build lead for the next 2 weeks. > > On Sat, Mar 25, 2023, at 4:50 PM, Mick Semb Wever wrote: >>> Here comes Cassandra CI status for 2023-3-13 - 2023-23-179 : >>

Re: [DISCUSS] Next release date

2023-04-17 Thread Josh McKenzie
So to bring us back to the goals and alignment here: > With the following intentions: > - moving towards the goal of annual releases, with a cadence 12±3 months > apart, > - the branch to GA period being 2-3 months, > - avoiding any type of freeze on trunk, > - getting a release out by December's

Re: [DISCUSS] Next release date

2023-04-17 Thread Josh McKenzie
"freeze" in this regard. On Mon, Apr 17, 2023, at 3:06 PM, Josh McKenzie wrote: > So to bring us back to the goals and alignment here: > >> With the following intentions: >> - moving towards the goal of annual releases, with a cadence 12±3 months >> apart, >

Re: [DISCUSS] Next release date

2023-04-17 Thread Josh McKenzie
> it's (b) for me, and everything minus 21 and 15 is defining enough to warrant > the branching and a checkpoint where testing can start Ok, I don't follow. There's three different ways I can read what you're saying here: 1. "Everything we have targeting 5.x is substantial and we can branch when

Re: [DISCUSS] Next release date

2023-04-17 Thread Josh McKenzie
means we branch there and anything not already merged has to wait > > > On Mon, Apr 17, 2023 at 3:37 PM Josh McKenzie wrote: >> __ >>> it's (b) for me, and everything minus 21 and 15 is defining enough to >>> warrant the branching and a checkpoint where tes

Re: [DISCUSS] Next release date

2023-04-17 Thread Josh McKenzie
> If this is true, why do we even bother running any CI before the CEP-21 > merge? It will all be invalidated anyway, right? I'm referring to manual validation or soak testing in qa environments rather than automated. Just because a soft-frozen branch without those features works in QA doesn't m

Re: [DISCUSS] [PATCH] Enable Direct I/O For CommitLog Files

2023-04-18 Thread Josh McKenzie
I took the liberty of creating https://issues.apache.org/jira/browse/CASSANDRA-18464 linking to this email thread w/the contents of your email and applying the patch to that ticket. Probably want to have some lower level discussions there when we find you a reviewer. On Tue, Apr 18, 2023, at 2

Re: [DISCUSS] Next release date

2023-04-19 Thread Josh McKenzie
Let me try to break this down another way: I see a few competing concerns, each with QA related time requirements (asserting 8 weeks minimum, 16 weeks maximum we should plan for to stabilize a GA): 1. A freeze to a branch to stabilize for release (8-16 weeks of QA required after we branch) 2.

Cassandra project status, 2023-04-25

2023-04-25 Thread Josh McKenzie
We have a town hall coming up! The URL for the meetup can be found here: https://www.meetup.com/cassandra-global/events/292858262/. This will be held tomorrow at 12pm EST. Jon Haddad (https://www.linkedin.com/in/rustyrazorblade/) will be discussing performance tuning on Apache Cassandra, I'll b

Re: Adding vector search to SAI with heirarchical navigable small world graph index

2023-04-25 Thread Josh McKenzie
To be fair Dinesh kind of primed that: > Do you intend to make this part of CEP-7 or as an incremental update to SAI > once it is committed? ;) I think this body of work more than stands on its own. Great work Jonathan, Mike, and Zhao; having native support for more ML-oriented workloads in C*

Re: [DISCUSS] New data type for vector search

2023-04-27 Thread Josh McKenzie
From a machine learning perspective, vectors are a well-known concept that are effectively immutable fixed-length n-dimensional values that are then later used either as part of a model or in conjunction with a model after the fact. While we could have this be non-frozen and not call it a vec

Re: [DISCUSS] New data type for vector search

2023-05-01 Thread Josh McKenzie
> If we want to make an ML-specific data type, it should be in an ML plug-in. How can we encourage a healthier plug-in ecosystem? As far as I know it's been pretty anemic historically: cassandra: https://cassandra.apache.org/doc/latest/cassandra/plugins/index.html postgres: https://www.postgresql

Re: [POLL] Vector type for ML

2023-05-05 Thread Josh McKenzie
Idiomatically, to my mind, there's a question of "what space are we thinking about this datatype in"? - In the context of mathematics, nullability in a vector would be 0 - In the context of Cassandra, nullability tends to mean a tombstone (or nothing) - In the context of programming languages, i

Re: [VOTE] CEP-29 CQL NOT Operator

2023-05-09 Thread Josh McKenzie
+1 On Tue, May 9, 2023, at 2:42 PM, Patrick McFadin wrote: > +1 > > On Tue, May 9, 2023 at 10:58 AM Caleb Rackliffe > wrote: >> +1 >> >> On Tue, May 9, 2023 at 12:04 PM Piotr Kołaczkowski >> wrote: >>> Let's vote. >>> >>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-29%3A+CQL+N

[DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-16 Thread Josh McKenzie
Similar to what we've done with accord in https://issues.apache.org/jira/browse/CASSANDRA-18204, I'd like to discuss bringing cassandra-harry in-tree as a submodule. repo link: https://github.com/apache/cassandra-harry Given the value it's brought to the project's stabilization efforts and the

Re: [DISCUSS] Feature branch version hygiene

2023-05-18 Thread Josh McKenzie
CEP-N seems like a good compromise. NextMajorRelease bumps into our interchangeable use of "Major" and "Minor" from a semver perspective and could get confusing. Suppose we could do NextFeatureRelease, but at that point why not just have it linked to the CEP and have the epic set. On Thu, May 1

Re: [DISCUSS] Feature branch version hygiene

2023-05-18 Thread Josh McKenzie
lieu of that, every ticket targeting 5.0 could use fixVersion 5.0.x, > since it is pretty clear what this means. Some tickets that don’t hit 5.0.0 > can then be postponed to a later version, but it’s not like this is > burdensome. Anything marked feature/improvement and 5.0.x gets bumped

Re: [DISCUSS] Feature branch version hygiene

2023-05-18 Thread Josh McKenzie
we just need to get 5.0-alpha1 > labels added when those releases are cut. > > Then I propose we break the confusion in both directions by scrapping 5.0 > entirely and introducing 5.0-target. > > So tickets go to 5.0-target if they target 5.0, and to 5.0.0 once they are

Re: [DISCUSS] Feature branch version hygiene

2023-05-18 Thread Josh McKenzie
h D Jordan wrote: > So what do we do with feature branch merged tickets in this model? *They > stay on 5.0-target after close and move to 5.0.0 when the epic is merged and > closes*? > >> On May 18, 2023, at 9:33 AM, Josh McKenzie wrote: >> >>> My mental model, thou

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-23 Thread Josh McKenzie
a released JAR. We can then reference Harry as >> a library without maintaining public artifacts for it. Is that in line with >> what you're thinking? >> >> > I'd also like to see us get a Harry run integrated as part of our >> > pre-commit CI >> >

Re: Vector search demo, and query syntax

2023-05-24 Thread Josh McKenzie
+1 to the flow of: 1: ORDER BY? 2: Oh. Yeah. That *does* make sense. ;) (sending from fastmail in the hopes the image doesn't get stripped. Thanks ASF smtp server...) ~Josh On Wed, May 24, 2023, at 1:00 AM, Jeremiah D Jordan wrote: > At first I wasn’t sure about using ORDER BY, but the mor

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-24 Thread Josh McKenzie
r few hours, where we >> could have cut many manual module releases in that time. >> >> David and folks working on accord ? >> >> >> >> On Tue, 23 May 2023 at 20:09, Josh McKenzie wrote: >>> __ >>> I'll hold off on this un

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-24 Thread Josh McKenzie
>> I would not set this improvement as a prerequisite to pulling Harry into the >> main branch, but rather interpret it as a commitment from myself to take >> community input and make it more approachable by the day. >> >> On Wed, May 24, 2023, at 2:44 PM, Josh McKenzie wro

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-25 Thread Josh McKenzie
>>>> >>>>>> > We could go over some interesting examples such as testing 2i (SAI) >>>>>> >>>>>> +100 >>>>>> >>>>>> >>>>>> On Wed, May 24, 2023 at 1:40 PM Alex Petrov wrote: >

Re: [VOTE] CEP-30 ANN Vector Search

2023-05-25 Thread Josh McKenzie
+1 On Thu, May 25, 2023, at 8:33 PM, Jake Luciani wrote: > +1 > > On Thu, May 25, 2023 at 11:45 AM Jonathan Ellis wrote: >> Let's make this official. >> >> CEP: >> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-30%3A+Approximate+Nearest+Neighbor%28ANN%29+Vector+Search+via+Storage-At

Re: [DISCUSS] CEP-8 Drivers Donation - take 2

2023-05-30 Thread Josh McKenzie
> Is the vote for the CEP to be for all drivers, but we will introduce each > driver one by one? What determines when we are comfortable with one driver > subproject and can move on to accepting the next ? Curious to hear on this as well. There's 2 implications from the CEP as written: 1. The

Cassandra project status, 2023-05-30

2023-05-30 Thread Josh McKenzie
Been a bit over a month; let's check in and see how things are looking. We released the following: - 3.11.15 - 3.0.29 - 4.0.10 - 4.1.2 Thanks to all the release managers who worked on getting these out the door. [New Contributors Getting Started] First off, come hang out with us in the #cassand

Re: Is simplenative in cassandra-stress still relevant?

2023-05-31 Thread Josh McKenzie
> The main issue I see with maintaining the SimpleClient in cassandra-stress is > the burden it puts on a user to understand the options available when > connecting with *-mode*: How frequently do we expect users or devs to use the built-in cassandra-stress tool? Between tlp-stress and NoSQLBenc

Re: Is simplenative in cassandra-stress still relevant?

2023-05-31 Thread Josh McKenzie
hots here, if a community decides it has to go > so it will but I would be sad to see it. > > Regards > > > > From: Josh McKenzie > Sent: Wednesday, May 31, 2023 15:15 > To: dev > Subject: Re: Is simplenative in cassandra-stress still relevant?

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-31 Thread Josh McKenzie
reason as the JVM dtests. It's nice to write a feature or fix, find a >> similar JVM dtest, copy, paste, and edit, and have something useful. >> >> 3. General subdivision of Cassandra projects >> >> This topic has come up quite a few times recently - aroun

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-06-01 Thread Josh McKenzie
till calling >> out as it has been an issue. >> >> Josh, do you see any reports on what isn’t working? I think most people >> don’t touch 1% of what git can do… so it might be that 10% is broken but >> that no one in our domain actually touches that path?

Re: [DISCUSS] Limiting query results by size (CASSANDRA-11745)

2023-06-12 Thread Josh McKenzie
> I do not have in mind a scenario where it could be useful to specify a LIMIT > in bytes. The LIMIT clause is usually used when you know how many rows you > wish to display or use. Unless somebody has a useful scenario in mind I do > not think that there is a need for that feature. If you have

Re: [DISCUSS] Limiting query results by size (CASSANDRA-11745)

2023-06-12 Thread Josh McKenzie
Yeah, my bad. I have paging on the brain. Seriously. I can't think of a use-case in which a LIMIT based on # bytes makes sense from a user perspective. On Mon, Jun 12, 2023, at 1:35 PM, Jeff Jirsa wrote: > > > On Mon, Jun 12, 2023 at 9:50 AM Benjamin Lerer wrote: >>> If you have rows that var

Re: [DISCUSS] Limiting query results by size (CASSANDRA-11745)

2023-06-12 Thread Josh McKenzie
part of the user query, I think the server must always > have returned all data that fits into the LIMIT when all pages have been > returned. > > -Jeremiah > > On Jun 12, 2023 at 12:56:14 PM, Josh McKenzie wrote: >> >> Yeah, my bad. I have paging on the brain. S

Re: [DISCUSS] Remove deprecated keyspace_count_warn_threshold and table_count_warn_threshold

2023-06-13 Thread Josh McKenzie
> have subsequently been deprecated since 4.1-alpha in CASSANDRA-17195 when > they were replaced/migrated to guardrails as part of CEP-3 (Guardrails). Have we been dropping support entirely for old params or using the @Replaces annotation into perpetuity? I dislike the idea of operators having t

Re: [VOTE] CEP-8 Datastax Drivers Donation

2023-06-13 Thread Josh McKenzie
+1 On Tue, Jun 13, 2023, at 10:55 AM, Jeremiah Jordan wrote: > +1 nb > > On Jun 13, 2023 at 9:14:35 AM, Jeremy Hanna > wrote: >> >> Calling for a vote on CEP-8 [1]. >> >> To clarify the intent, as Benjamin said in the discussion thread [2], the >> goal of this vote is simply to ensure that t

Re: [DISCUSS] Remove deprecated keyspace_count_warn_threshold and table_count_warn_threshold

2023-06-13 Thread Josh McKenzie
' and 'table_count_warn_threshold' >>> > configuration settings on the trunk branch for the next major release. >>> >>> Deprecate in 4.1 is way too new for me to accept that, and it's low effort >>> to keep; breaking users is always a bad idea and
