Re: [DISCUSSION] New dependencies for SAI CEP-7

2022-12-12 Thread David Capwell
> com.carrotsearch.randomizedtesting.randomizedtesting-runner 2.1.2 - test > dependency Can you talk more about why? There are several ways to do random testing in-tree ATM, so wondering why we need another one > On Dec 8, 2022, at 6:51 AM, Mike Adamson wrote: > > Hi, > > I wanted to discu

Re: [DISCUSSION] New dependencies for SAI CEP-7

2022-12-13 Thread David Capwell
bove library was only added to provide us with a rich set of > random generators. I am happy to look at removing this library if its > inclusion is contentious. > > > On Mon, 12 Dec 2022 at 19:41, David Capwell <mailto:dcapw...@apple.com>> wrote: >> com.carrot

Re: [DISCUSSION] New dependencies for SAI CEP-7

2022-12-14 Thread David Capwell
> As such I would prefer to keep using the carrotsearch generators Works for me; I am cool with the added test dependency. > On Dec 14, 2022, at 7:13 AM, Mike Adamson wrote: > > I have had a look at whether we could use the QuickTheories in our randomized > testing and come to the following co

Re: Issue when creating a stream session

2022-12-16 Thread David Capwell
This sounds like a bug to me, but would be good to get feedback from others who have touched Streaming… Repair will fail if membership notifies that a participating node was removed, so I think it makes sense for Streaming to also follow this behavior. > On Dec 14, 2022, at 1:22 PM, Natnael Adere

Re: [DISCUSSION] Cassandra's code style and source code analysis

2022-12-22 Thread David Capwell
I'm good with 3 and 4. > On Dec 22, 2022, at 10:41 AM, Derek Chen-Becker wrote: > > I vote for #4. I've always used the convention of having stdlib stuff > first, external stuff second, and same-project imports last. I guess > increasing order of specificity? > > Happy Holidays! > > Derek > >

Re: Introducing mockito-inline library among test dependencies

2023-01-11 Thread David Capwell
+1. We already use mockito. Also that library is basically empty, it's just defining configs for extensions (see https://github.com/mockito/mockito/tree/main/subprojects/inline/src/main/resources/mockito-extensions

Re: Should we change 4.1 to G1 and offheap_objects ?

2023-01-12 Thread David Capwell
I am cool with updating NEWS in 4.1.1 to recommend the change and change it in 4.2/5.0 > On Jan 12, 2023, at 10:56 AM, Josh McKenzie wrote: > > Potential compromise: We change it in trunk, and we NEWS.txt in the minor > about that change in trunk, why, and recommend users consider qualifying t

Re: Intra-project dependencies

2023-01-18 Thread David Capwell
Been out, sorry for just catching up now… I feel this thread pigeonholed on the word Accord and ignored the fact we are dealing with this pain today with python/jvm dtest and trying to improve that would help the project…. We also have other related projects that we are developing in parallel t

Re: Intra-project dependencies

2023-01-19 Thread David Capwell
Thanks for the reply, my replies are inline to your inline replies =D > On Jan 19, 2023, at 2:39 PM, Mick Semb Wever wrote: > > > Thanks David for the detailed write up. Replies inline… > > > We tried in-tree for in-jvm dtest and found that this broke every other > commit… maintaining the A

Re: [DISCUSSION] Framework for Internal Collection Exposure and Monitoring API Alignment

2023-01-26 Thread David Capwell
I took a look and I see the result is an interface that looks like the vtable interface, that is then used by vtables and JMX? My first thought is why not just use the vtable logic? I also wonder if we should care about JMX? I know many wish to migrate (it's going to be a very long time)

Re: Merging CEP-15 to trunk

2023-01-27 Thread David Capwell
> I've learned that when I have defended the need (or right, if appealing to > the Governance texts...) for contributors to be able to review a feature > branch at the time it is merged to trunk - which for Accord is now - that a > common reaction to this is that doing a review of Accord now mig

Re: [DISCUSSION] Framework for Internal Collection Exposure and Monitoring API Alignment

2023-01-30 Thread David Capwell
> >> Not to open the Pandora box, but to me the most important thing here is to >> come into agreement about the future of JMX and what we will do or not as a >> community. Also, how much time people are able to invest. I guess this will >> influence any directions to be taken here.

Re: Intra-project dependencies

2023-01-30 Thread David Capwell
I took a stab at creating a patch that I think addresses most of the comments I saw in this thread, would love feedback in https://issues.apache.org/jira/browse/CASSANDRA-18204 Given that the leading solution is git submodules I went down

Re: Merging CEP-15 to trunk

2023-01-30 Thread David Capwell
> Does this mean there have also been nightly jenkins builds running? Is there > a history of such test results visible somewhere? If yes, I think that lends > a lot of credibility to the claim the process was as rigorous as it is for > trunk, and looking at the build history for a few minutes s

Re: Merging CEP-15 to trunk

2023-01-30 Thread David Capwell
sts should be run before merge. There are examples of Jenkins only tests that are not run, but again this is due to existing limitations with Jenkins. > On Jan 30, 2023, at 3:33 PM, Henrik Ingo wrote: > > On Tue, Jan 31, 2023 at 1:28 AM David Capwell <mailto:dcapw...@apple.com>

Re: Merging CEP-15 to trunk

2023-01-31 Thread David Capwell
e that I did $ git show f8243f41c9e96c4a0390558086ece078b0b6b15c commit f8243f41c9e96c4a0390558086ece078b0b6b15c Author: David Capwell Date: Mon Jan 9 13:20:58 2023 -0800 Ninja: Add AccordTestUtils.parse which was missing in the latest commit diff --git a/test/unit/org/apache/cassandra/service/accord

Re: Merging CEP-15 to trunk

2023-02-01 Thread David Capwell
> It's been mentioned "always shippable trunk according to circleci". That's > not true: we are always shippable according to *either* CI. There are folk > just using ci-cassandra for our pre-commit gateway. It is important that you > don't trash the other CI system, particularly when it come

Re: Welcome Patrick McFadin as Cassandra Committer

2023-02-02 Thread David Capwell
Congrats and welcome! =) > On Feb 2, 2023, at 10:53 AM, J. D. Jordan wrote: > > Congrats! > >> On Feb 2, 2023, at 12:47 PM, Christopher Bradford >> wrote: >> >>  >> Congrats Patrick! Well done. >> >> On Thu, Feb 2, 2023 at 10:44 AM Aaron Ploetz > > wrote: >>>

Re: Implicitly enabling ALLOW FILTERING on virtual tables

2023-02-03 Thread David Capwell
> I don't think the assumption that "virtual tables will always be small and > always fit in memory" is a safe one. Agree, there is a repair ticket to have the coordinating node do network queries to peers to resolve the table (rather than operator querying everything, allow the coordinator nod

Re: [VOTE] CEP-21 Transactional Cluster Metadata

2023-02-07 Thread David Capwell
+1 > On Feb 7, 2023, at 7:15 AM, Jeremiah D Jordan > wrote: > > +1 nb > >> On Feb 6, 2023, at 10:15 AM, Sam Tunnicliffe wrote: >> >> Hi everyone, >> >> I would like to start a vote on this CEP. >> >> Proposal: >> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-21%3A+Transactional

Re: [DISCUSS] Moving system property names to the CassandraRelevantProperties

2023-02-09 Thread David Capwell
> All properties meant to be used only for tests would have a prefix like > "cassandra.test.name.of.property" and production properties would be > "cassandra.xyx". Once this is done, we can filter them out in vtable so there > would not be any test-related properties in production. Test properti
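The prefix convention quoted above lends itself to a simple filter. A minimal Java sketch, assuming test-only properties start with `cassandra.test.`; the class and method names are hypothetical and are not taken from CassandraRelevantProperties or the settings vtable code:

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of Cassandra.
final class PropertyPrefixFilter
{
    // Assumed prefix for test-only properties, following the naming proposed in the thread.
    static final String TEST_PREFIX = "cassandra.test.";

    // Returns a sorted copy of the input without any test-only properties.
    static Map<String, String> productionOnly(Map<String, String> properties)
    {
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, String> entry : properties.entrySet())
        {
            if (!entry.getKey().startsWith(TEST_PREFIX))
                result.put(entry.getKey(), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args)
    {
        Map<String, String> properties = new TreeMap<>();
        properties.put("cassandra.test.name.of.property", "true");
        properties.put("cassandra.xyx", "42");
        // Only the non-test property should remain.
        System.out.println(productionOnly(properties));
    }
}
```

A prefix check keeps the rule mechanical: anything that follows the naming convention is hidden without maintaining a separate list of test properties.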

Re: [DISCUSS] Moving system property names to the CassandraRelevantProperties

2023-02-09 Thread David Capwell
g a good question! Sure, a new checkstyle rule was added > to address this case for production and test classes. > > On Thu, 9 Feb 2023 at 19:40, David Capwell wrote: >> >> All properties meant to be used only for tests would have a prefix like >> "cassa

Re: Intra-project dependencies

2023-02-16 Thread David Capwell
After a lot of effort I think this branch is in a good state, accord feels mostly like it's in-tree and all the complexity of git is hidden mostly. I would love more feedback as the patch is in a usable state > On Jan 30, 2023, at 3:16 PM, David Capwell wrote: > > I took a stab at c

Re: [DISCUSS] Allow UPDATE on settings virtual table to change running configuration

2023-02-22 Thread David Capwell
I guess back to the point of the thread, we need a way to know what configs are mutable for the settings virtual table, so need some way to denote that the config replica_filtering_protection.cached_rows_fail_threshold is mutable. Given the way that the yaml config works, we can’t rely on the p

Re: [DISCUSS] Allow UPDATE on settings virtual table to change running configuration

2023-03-01 Thread David Capwell
approaches - the annotations one seems most reasonable >> to me and I didn’t have the chance to consider any others. Volatile seems >> fragile and unclear as a differentiator. I agree >> >> On Tue, 28 Feb 2023 at 17:47, Maxim Muzafarov >> mailto:mmu...@apac

Re: [DISCUSS] Next release date

2023-03-01 Thread David Capwell
I am cool with defining target release date and working backwards from there. If we do want to go this route, I think we do need to answer why 4.1 cut -> release took so much time, and if people could start validation “before” we branch? If we know trunk is stable today then we could release t

Re: hsqldb test dependency in simulator code

2023-03-14 Thread David Capwell
Quickly looking I think we can switch to org.agrona.collections.Long2LongHashMap, the key isn’t the “correct” type (long when we want int) but isn’t too hard to switch. Few differences in the semantics need to be handled, but not much 1) get of non-defined key should throw NoSuchElementExcept

Re: Role of Hadoop code in Cassandra 5.0

2023-03-16 Thread David Capwell
Aren’t our deprecation rules that if we deprecate in 4.0.0 we can remove in 5.x, but 4.x needs to wait for 6.x? I am cool deprecating this and willing to pull into another repo if people (not me) are willing to maintain it (else just delete). > On Mar 10, 2023, at 1:13 AM, Jacek Lewandowski >

Re: [DISCUSS] cep-15-accord, cep-21-tcm, and trunk

2023-03-24 Thread David Capwell
> the question we want to answer is whether or not we build a throwaway patch > for linearizable epochs If this is in a release, we then need to maintain that feature, so would be against it. If this is for testing, then I would argue the current world is “fine”… current world is hard to use a

Re: [DISCUSS] cep-15-accord, cep-21-tcm, and trunk

2023-03-24 Thread David Capwell
gt;> against it. > Isn't the argument that cep-21 provides this so we could just remove the > temporary impl and point to the new facility for this generation? > > On Fri, Mar 24, 2023, at 3:22 PM, David Capwell wrote: >>> the question we want to answer is whethe

Re: [DISCUSS] CEP-29 CQL NOT Operator

2023-04-06 Thread David Capwell
Overall I welcome this feature, was trying to use this around 1-2 months back and found we didn’t support it, so glad to see it coming! From a testing point of view, I think we would want to have good fuzz testing covering complex types (frozen/non-frozen collections, tuples, udt, etc.), and rever

Re: Adding vector search to SAI with heirarchical navigable small world graph index

2023-04-24 Thread David Capwell
This work sounds interesting, I would recommend decoupling the types from the ANN support as the types require client changes and can go in now (would give a lot of breathing room to get this ready for 5.0), where as ANN depends on SAI which is still being worked on. > On Apr 22, 2023, at 1:02

Re: Adding vector search to SAI with heirarchical navigable small world graph index

2023-04-26 Thread David Capwell
> DENSE seems to just be an array? So very similar to a frozen list, but with a > fixed size? How I read the doc, DENSE = ARRAY, but knew that couldn’t be the case, so when I read the code it's a fixed-size array…. So the real syntax was “DENSE FLOAT32[42]” Not a fan of the type naming, and feel

Re: [DISCUSS] New data type for vector search

2023-04-26 Thread David Capwell
Thanks for starting this thread! > In the initial commits and thread, this was DENSE FLOAT32. Nobody really > loved that, so we considered a bunch of alternatives, including > > - `FLOAT[N]`: This minimal option resembles C and Java array syntax, which > would make it familiar for many users. H

Re: [DISCUSS] New data type for vector search

2023-04-26 Thread David Capwell
023, at 10:50 AM, David Capwell wrote: > > Thanks for starting this thread! > >> In the initial commits and thread, this was DENSE FLOAT32. Nobody really >> loved that, so we considered a bunch of alternatives, including >> >> - `FLOAT[N]`: This minimal opti

Re: [DISCUSS] New data type for vector search

2023-04-27 Thread David Capwell
> but as you point out it has the problem of allowing nulls. If nulls are not allowed for the elements, then either we need a) a new type, or b) add some way to say elements may not be null…. As much as I do like b, I am leaning towards new type for this use case. So, to flesh out the type req

Re: [DISCUSS] New data type for vector search

2023-05-01 Thread David Capwell
that composes with all Cassandra types. I can't see >>>> a reason to do this, nobody wants it, and we killed the most similar >>>> proposal in the past as wontfix. >>>> >>>> On Thu, Apr 27, 2023 at 7:49 PM Josh McKenzie wrote: >>>> Fro

Re: [DISCUSS] New data type for vector search

2023-05-01 Thread David Capwell
> I think it is totally reasonable that the ANN patch (and Jonathan) is not > asked to implement on top of, or towards, other array (or other) new data > types. This impacts serialization, if you do not think about this day 1 you then can’t add later on without having to worry about migration

Re: [DISCUSS] New data type for vector search

2023-05-01 Thread David Capwell
roblems with introducing such >> an alias to meet the ML crowd. >> >> Another way I think of this is >> `VECTOR FLOAT[n]` is the porcelain ML cql api, >> `NON-NULL FROZEN` and `FROZEN` and `FLOAT[n]` are the >> general-use plumbing cql apis. >> >>

Re: [DISCUSS] New data type for vector search

2023-05-01 Thread David Capwell
tion is we >>>> deny the possibility of using the VECTOR keyword and bring us back to >>>> something like `NON-NULL FROZEN`. This is odd to me because >>>> `VECTOR` here can be just an alias for `NON-NULL FROZEN` while meeting the >>>> patch&#

Re: [POLL] Vector type for ML

2023-05-02 Thread David Capwell
> B) Should we introduce a type that is general purpose, and supports all > Cassandra types, so that this may be used to support ML (and perhaps other) > workloads I vote B only as well... > On May 2, 2023, at 9:02 AM, Benedict wrote: > > This is not the poll I thought we would be conducting,

Re: [POLL] Vector type for ML

2023-05-02 Thread David Capwell
> How about it, David? Did you already make this? I checked out the patch, fixed serialize/deserialize, added the constraints, then added a composeForFloat(ByteBuffer), with this the impact to the POC patch was the following 1) move away from VectorType.instance.serializer().deserialize(bb) to

Re: [POLL] Vector type for ML

2023-05-03 Thread David Capwell
ID >>>>> transaction. >>>>> >>>>> Patrick >>>>> >>>>> On Tue, May 2, 2023 at 3:27 PM Jonathan Ellis >>>> <mailto:jbel...@gmail.com>> wrote: >>>>>> I had a call with David. We agreed

Re: [POLL] Vector type for ML

2023-05-04 Thread David Capwell
My views have changed over time on syntax and I feel type[dimension] may not be the best, so it has gone lower in my own personal ranking… this is my current preference 1) DENSE [dimension] | NON NULL [dimension] 2) VECTOR 3) type[dimension] My reasoning for this order * type[dimension] looks

Re: [POLL] Vector type for ML

2023-05-05 Thread David Capwell
stablished term with what users would expect. No >>> surprises. >>> - Shorter ramp-up time for users. Cassandra is being modernized. >>> >>> The implementation is flexible, but the interface should empower our users >>> to be awesome. >>> >&

Re: [POLL] Vector type for ML

2023-05-05 Thread David Capwell
lot of content around "How to use it and not get in >>>> trouble." (I have a lot of that content already) >>>> >>>> - We don't have to explain what it is. A lot of prior art out there >>>> already [1][2][3] >>>> - We're matching an established term with what u

Re: [POLL] Vector type for ML

2023-05-05 Thread David Capwell
Updated Syntax Jonathan Ellis David Capwell Josh McKenzie Caleb Rackliffe Patrick McFadin Brandon Williams Mike Adamson Benedict Mick Semb Wever Derek Chen-Becker VECTOR 1 2 2 1 ? 3 2 DENSE VECTOR 2 1 ? ? type[dimension] 3 3 3 1 3 2 DENSE_VECTOR 1 NON NULL [dimension] 1

Re: [POLL] Vector type for ML

2023-05-05 Thread David Capwell
Sorry, DENSE_VECTOR was pointing to the wrong row, updated score
Syntax                Score
VECTOR                16
DENSE VECTOR          11
type[dimension]       9
NON NULL [dimension]  6
VECTOR type[n]        5
DENSE_VECTOR          3
NON-NULL FROZEN       3
ARRAY                 0
> On May 5, 2023, at 10:01 AM, David Capwell wrote: > > Updated > > Sy

Re: [POLL] Vector type for ML

2023-05-05 Thread David Capwell
conversation to remove =)… maybe defer this to JIRA as long as all parties agree in the ticket? With all votes in, this is what I see Syntax Jonathan Ellis David Capwell Josh McKenzie Caleb Rackliffe Patrick McFadin Brandon Williams Mike Adamson Benedict Mick Semb Wever Derek Chen-Becker VECTOR 1

Re: [POLL] Vector type for ML

2023-05-05 Thread David Capwell
If we ever add sparse vectors, we can assume that DENSE is the default and > allow to use either DENSE, SPARSE or nothing. > > Perhaps the dimension could be separated from the type, such as in > VECTOR[dimension] or VECTOR(dimension). > > On Fri, 5 May 2023 at 19:05

Re: [POLL] Vector type for ML

2023-05-05 Thread David Capwell
Yep, fair point…. SPARSE VECTOR better maps to NON NULL MAP > On May 5, 2023, at 11:58 AM, David Capwell wrote: > >> If we ever add sparse vectors, we can assume that DENSE is the default and >> allow to use either DENSE, SPARSE or nothing. > > I have been feeling tha

Re: [POLL] Vector type for ML

2023-05-05 Thread David Capwell
https://issues.apache.org/jira/browse/CASSANDRA-18504 > On May 5, 2023, at 12:27 PM, David Capwell wrote: > > Yep, fair point…. SPARSE VECTOR better maps to NON NULL MAP > >> On May 5, 2023, at 11:58 AM, David Capwell wrote: >> >>> If we ever add sparse vec

Re: CEP-30: Approximate Nearest Neighbor(ANN) Vector Search via Storage-Attached Indexes

2023-05-09 Thread David Capwell
Approach section doesn’t go over how this will handle cross replica search, this would be good to flesh out… given results have a real ranking, the current 2i logic may yield incorrect results… so would think we need num_ranges / rf queries in the best case, with some new capability to sort the

Re: [DISCUSS] The future of CREATE INDEX

2023-05-09 Thread David Capwell
If we assume SAI is what we should use by default for the cluster, would it make sense to allow CREATE INDEX [IF NOT EXISTS] [name] ON () But use a new yaml config that switches from legacy to SAI? default_2i_impl: sai For 5.0 we can default to “legacy” (new features disabled by default), but

Re: [DISCUSS] The future of CREATE INDEX

2023-05-10 Thread David Capwell
> Having to revert to CREATE CUSTOM INDEX sounds pretty awful, so I'd prefer > allowing USING...WITH... for CREATE INDEX I have 0 issues with a new syntax to make this more clear > just deprecating CREATE CUSTOM INDEX (at least after 5.0), but that's more or > less what my original proposal was

Re: [VOTE] CEP-29 CQL NOT Operator

2023-05-10 Thread David Capwell
+1 > On May 10, 2023, at 9:36 AM, Francisco Guerrero wrote: > > +1 (nb) > > On 2023/05/10 14:10:06 Jeremiah D Jordan wrote: >> +1 nb >> >>> On May 8, 2023, at 3:52 AM, Piotr Kołaczkowski >>> wrote: >>> >>> Let's vote. >>> >>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-29%3A+

Re: [DISCUSS] The future of CREATE INDEX

2023-05-12 Thread David Capwell
> I really dislike the idea of the same CQL doing different things based upon a > per-node configuration. > I agree with Brandon that changing CQL behaviour like this based on node > config is really not ideal. I am cool adding such a config, and also cool keeping CREATE INDEX disabled by def

Re: [DISCUSS] The future of CREATE INDEX

2023-05-15 Thread David Capwell
> [POLL] Centralize existing syntax or create new syntax? 1.) CREATE INDEX ... USING WITH OPTIONS... > [POLL] Should there be a default? (YES/NO) Yes > [POLL] What to do with the default? 3.) YAML config to override default index (legacy 2i remains the default) 4.) YAML config/guardrail

Re: CEP-30: Approximate Nearest Neighbor(ANN) Vector Search via Storage-Attached Indexes

2023-05-17 Thread David Capwell
they try to use >>>> this in a way that is fundamentally incompatible w/ the way the database >>>> scales/works. (I've done my best to call this out in all discussions >>>> around SAI over time, and there may even end up being further guardrails >&g

Re: Vector search demo, and query syntax

2023-05-23 Thread David Capwell
I am ok with the syntax, but wondering if a function maybe better than a CQL change? SELECT id, start, end, text FROM {self.keyspace}.{self.table} ORDER BY ANN(embedding, ?) LIMIT ? Not really a common syntax, but could be useful down the line > On May 23, 2023, at 12:37 AM, Mick Semb Wever wr

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-05-24 Thread David Capwell
> The time spent on getting that running has been a fair few hours, where we > could have cut many manual module releases in that time. We spent a few hours getting submodules working, and we no longer need to release for every single commit… $ git log b9025e59395f47535e4ed1fec20b1186cdb07db8.

Re: Agrona vs fastutil and fastutil-concurrent-wrapper

2023-05-25 Thread David Capwell
Agrona isn’t going anywhere due to the library being more than basic collections. Now, with regard to single-threaded collections… honestly I dislike Agrona as I always fight to avoid boxing; carrot was far better in this regard…. Didn’t look at the fastutil versions to see if they are better

Re: [VOTE] CEP-30 ANN Vector Search

2023-05-25 Thread David Capwell
+1 > On May 25, 2023, at 1:53 PM, Ekaterina Dimitrova > wrote: > > +1 > > On Thu, 25 May 2023 at 16:46, Brandon Williams > wrote: >> +1 >> >> Kind Regards, >> Brandon >> >> On Thu, May 25, 2023 at 10:45 AM Jonathan Ellis > > wrote: >> > >>

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-06-01 Thread David Capwell
Most edge cases we have seen in Accord are working with feature branches from other authors where we use relative paths to make sure the git@ vs https:// doesn’t become a problem for CI (submodule points to https:// to work in CI, but if you do that during feature development it gets annoying to

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-06-01 Thread David Capwell
apache. So the main issue has been when 2 authors try to work together (such as during review of a PR) > On Jun 1, 2023, at 10:15 AM, David Capwell wrote: > > Most edge cases we have seen in Accord are working with feature branches from > other authors where we use relative p

Re: [DISCUSS] Bring cassandra-harry in tree as a submodule

2023-06-01 Thread David Capwell
to me being impatient and the default > logging on git being... completely silent. =/ > > Looks like subsequent runs aren't hanging on that and are hopping right > through, so perhaps this a "first run tax" for submodule + worktree. > > On Thu, Jun 1, 2023, at 2:

Re: [VOTE] CEP-8 Datastax Drivers Donation

2023-06-13 Thread David Capwell
+1 > On Jun 13, 2023, at 7:59 AM, Josh McKenzie wrote: > > +1 > > On Tue, Jun 13, 2023, at 10:55 AM, Jeremiah Jordan wrote: >> +1 nb >> >> On Jun 13, 2023 at 9:14:35 AM, Jeremy Hanna > > wrote: >>> >>> Calling for a vote on CEP-8 [1]. >>> >>> To clarify the

Re: [DISCUSS] Remove deprecated keyspace_count_warn_threshold and table_count_warn_threshold

2023-06-13 Thread David Capwell
> Have we been dropping support entirely for old params or using the @Replaces > annotation into perpetuity? My understanding is that the goal is to keep things around in perpetuity unless it actively causes us harm… and with @Replaces, there tends to be no harm to keep around… Looking at ht

[DISCUSS] Remove org.apache.cassandra.io.sstable.SSTableHeaderFix in trunk (5.0)?

2023-06-13 Thread David Capwell
org.apache.cassandra.io.sstable.SSTableHeaderFix was added due to bugs in 3.6 causing invalid types or incompatible types (due to toString changes) in the SSTableHeader… this logic runs on start and rewrites all Stats files that had a mismatch from the local schema; with 5.0 requiring upgrade

Re: [DISCUSS] Remove deprecated keyspace_count_warn_threshold and table_count_warn_threshold

2023-06-13 Thread David Capwell
g to use converters, we would need to know how many system > keyspaces/tables were on the version we are upgrading from. I don't know if > that information is available. Or perhaps we could assume that counting > system keyspaces/tables was a bug, and just translate changing the meaning t

Re: [DISCUSS] Remove deprecated keyspace_count_warn_threshold and table_count_warn_threshold

2023-06-14 Thread David Capwell
t;> On Tue, 13 Jun 2023 at 23:51, Josh McKenzie > <mailto:jmcken...@apache.org>> wrote: >> >>> Warning that too many tables (including system) may have negative behavior >>> I think is fine >> This reminds me of the current situation with our tests wh

Re: [DISCUSS] Remove org.apache.cassandra.io.sstable.SSTableHeaderFix in trunk (5.0)?

2023-06-15 Thread David Capwell
Not heard any feedback yet, so tomorrow plan to remove… the feature was local to 3.6+ so all users migrating from 3.0 to 4.0 never had this issue > On Jun 13, 2023, at 10:22 AM, David Capwell wrote: > > org.apache.cassandra.io.sstable.SSTableHeaderFix was added due to bugs in 3.6

Re: [DISCUSS] Remove deprecated keyspace_count_warn_threshold and table_count_warn_threshold

2023-06-16 Thread David Capwell
>>> subtract the current number of system keyspace/tables from the old value. >>> For example, 150 tables in the old threshold translate to 103 tables in the >>> new guardrail, considering that there are 47 system tables. >>> >>> Does this so
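The conversion described in the quoted text is a plain subtraction: 150 - 47 = 103. A small Java sketch of that arithmetic follows. The class and method names are hypothetical, and clamping at zero is an assumption added here for old values smaller than the system table count; the thread does not state how that case would be handled.

```java
// Hypothetical converter for illustration only; not the actual Cassandra config converter.
final class TableThresholdConverter
{
    // Translates an old threshold that counted system tables into one that counts only user tables.
    // Clamping at zero is an assumption, not something specified in the discussion.
    static int toUserTableThreshold(int oldThreshold, int systemTableCount)
    {
        return Math.max(0, oldThreshold - systemTableCount);
    }

    public static void main(String[] args)
    {
        // The example from the thread: 150 tables with 47 system tables.
        System.out.println(toUserTableThreshold(150, 47));
    }
}
```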

Re: [DISCUSS] Using ACCP or tc-native by default

2023-06-22 Thread David Capwell
+1 to ACCP > On Jun 22, 2023, at 3:05 PM, C. Scott Andreas wrote: > > +1 for ACCP and can attest to its results. ACCP also optimizes for a range of > hash functions and other cryptographic primitives beyond TLS acceleration for > Netty. > >> On Jun 22, 2023, at 2:07 PM, Jeff Jirsa wrote: >>

Re: [DISCUSS] When to run CheckStyle and other verificiations

2023-06-26 Thread David Capwell
> not running it automatically with the targets which devs usually run locally. The checks tend to have an opt-out, such as -Dno-checkstyle=true… so it's really easy to set up your local environment to opt out of what you do not care about… I feel we should force people to opt-out rather than opt-in…

Re: [DISCUSS] When to run CheckStyle and other verificiations

2023-06-27 Thread David Capwell
> nobody referred to running checks in a pre-push (or pre-commit) hook In accord I added an opt-out for each hook, and will require such here as well… as long as you can opt-out, it's fine by me… I know I will likely opt-out, but wouldn’t block such an effort > Your point that pre-push hook mig

Re: [Discuss] Repair inside C*

2023-07-25 Thread David Capwell
As someone who has done a lot of work trying to make repair stable, I approve of this message ^_^ More than glad to help mentor this work > On Jul 24, 2023, at 6:29 PM, Jaydeep Chovatia > wrote: > > To clarify the repair solution timing, the one we have listed in the article > is not the rec

Re: [Discuss] Repair inside C*

2023-07-26 Thread David Capwell
oc, code, etc., that has been >>> working for us for the last six years at an immense scale, and I will share >>> it soon on a private fork. >>> >>> Thanks, >>> Jaydeep >>> >>> On Tue, Jul 25, 2023 at 9:48 AM German Eichberger via d

[DISCUSS] Add Jepsen's Elle as a test dependency for Accord / Paxos

2023-09-13 Thread David Capwell
For validation of Paxos and Accord 2 different consistency verifiers were created: accord.verify.StrictSerializabilityVerifier (Accord), and org.apache.cassandra.simulator.paxos.LinearizabilityValidator (Paxos). To increase confidence in both protocols it would be good to use an external consi

Re: [DISCUSS] Vector type and empty value

2023-09-19 Thread David Capwell
> When we introduced TINYINT and SMALLINT (CASSANDRA-895) we started making > types non -emptiable. This approach makes more sense to me as having to deal > with empty value is error prone in my opinion. I agree it’s confusing, and in the patch I found that different code paths didn’t handle th

[DISCUSS] Backport CASSANDRA-18816 to 5.0? Add support for repair coordinator to retry messages that timeout

2023-09-19 Thread David Capwell
To try to get repair more stable, I added optional retry logic (patch is still in review) to a handful of critical repair verbs. This patch is disabled by default but allows you to opt-in to retries so ephemeral issues don’t cause a repair to fail after running for a long time (assuming they re
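To illustrate the general idea of opt-in retries on timeout, here is a minimal Java sketch. It is not the CASSANDRA-18816 patch: the class, the `maxAttempts` parameter, and the use of `java.util.concurrent.TimeoutException` are all assumptions made for illustration. With `maxAttempts` set to 1 the operation is tried once and never retried, which mirrors a feature that is disabled by default.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

// Hypothetical retry helper for illustration only; not Cassandra's repair messaging code.
final class TimeoutRetry
{
    // Calls the operation up to maxAttempts times, retrying only when it times out.
    static <T> T callWithRetry(Callable<T> operation, int maxAttempts) throws Exception
    {
        if (maxAttempts < 1)
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        TimeoutException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                return operation.call();
            }
            catch (TimeoutException e)
            {
                // Only timeouts are retried; any other failure propagates immediately.
                last = e;
            }
        }
        // Every attempt timed out, so surface the last timeout to the caller.
        throw last;
    }

    public static void main(String[] args) throws Exception
    {
        int[] calls = {0};
        // Simulated operation: times out twice, then succeeds.
        String result = callWithRetry(() -> {
            if (++calls[0] < 3)
                throw new TimeoutException("simulated timeout");
            return "ok";
        }, 3);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

A real implementation would likely add backoff between attempts and a bound on total elapsed time; both are omitted here to keep the sketch short.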

Re: [DISCUSS] Vector type and empty value

2023-09-19 Thread David Capwell
mpty. It’s too late for the existing types, but we should > hold to this going forward. Which is what I think the idea was in > https://issues.apache.org/jira/browse/CASSANDRA-8951 as well? That it was > sad the existing numerics were emptiable, but too late to change, and we > co

Re: [DISCUSS] Vector type and empty value

2023-09-20 Thread David Capwell
Days. It’s a distinct concern from columns being nullable or not. > > There are a couple types where this makes sense: strings and blobs. All else > should not allow this except for backward compatibility reasons. So, not for > new types. > >> On 20 Sep 2023, at 00:08, David Ca

Re: [DISCUSS] Vector type and empty value

2023-09-20 Thread David Capwell
nse: strings and blobs. All else >> should not allow this except for backward compatibility reasons. So, not for >> new types. >> >>>> On 20 Sep 2023, at 00:08, David Capwell wrote: >>>> >>>> When does empty mean null? >>> >>&

Re: [DISCUSS] Backport CASSANDRA-18816 to 5.0? Add support for repair coordinator to retry messages that timeout

2023-09-26 Thread David Capwell
>> I think it could be argued that not retrying messages is a bug, I am >>>> +1 on including this in 5.0. >>>> >>>> Kind Regards, >>>> Brandon >>>> >>>> On Tue, Sep 19, 2023 at 1:16 PM David Capwell >>> <mailto:dc

Re: multiple ParameterizedClass objects?

2023-10-03 Thread David Capwell
It would help me if you could give examples of what you want the yaml to look like and why it requires ParameterizedClass. I try to avoid that class as much as possible when doing configs and newer configs are finding different ways to solve the same problems... > On Oct 3, 2023, at 12:10 AM,

Re: [VOTE] Accept java-driver

2023-10-03 Thread David Capwell
+1 > On Oct 3, 2023, at 8:32 AM, Chris Lohfink wrote: > > +1 > > On Tue, Oct 3, 2023 at 10:30 AM Jeff Jirsa > wrote: >> +1 >> >> >> On Mon, Oct 2, 2023 at 9:53 PM Mick Semb Wever > > wrote: >>> The donation of the java-driver is ready for its

[DISCUSS] Gossip shutdown may corrupt peers making it so the cluster never converges, and a small protocol change to fix

2023-10-06 Thread David Capwell
Just filed https://issues.apache.org/jira/browse/CASSANDRA-18913 (Gossip NPE due to shutdown event corrupting empty statuses) which is where I saw this issue. When we do gossip shutdown we send a message GOSSIP_SHUTDOWN which then gets handled by this method org.apache.cassandra.gms.Gossiper#m

Re: [DISCUSS] Gossip shutdown may corrupt peers making it so the cluster never converges, and a small protocol change to fix

2023-10-06 Thread David Capwell
randon Williams wrote: > > On Fri, Oct 6, 2023 at 5:50 PM David Capwell wrote: >> Let's say you now need to host replace node1 > > Won't the replacement have a newer generation? > >> avoid peers mutating endpoint states they don’t own > > This sounds reasonable to

Re: [DISCUSS] Gossip shutdown may corrupt peers making it so the cluster never converges, and a small protocol change to fix

2023-10-09 Thread David Capwell
PM, David Capwell wrote: > >> Won't the replacement have a newer generation? > > The replacement is a different instance. It performs a shadow round with its > seeds and if they are impacted by this issue then they are missing tokens, so > we fail the host replacement

Re: CASSANDRA-18775 (Cassandra supported OSs)

2023-10-20 Thread David Capwell
+1 to drop the whole lib… > On Oct 20, 2023, at 7:55 AM, Jeremiah Jordan > wrote: > > Agreed. -1 on selectively removing any of the libs. But +1 for removing the > whole thing if it is no longer used. > > -Jeremiah > > On Oct 20, 2023 at 9:28:55 AM, Mick Semb Wever

Re: [DISCUSS] Backport CASSANDRA-18816 to 5.0? Add support for repair coordinator to retry messages that timeout

2023-10-24 Thread David Capwell
retry logic > On Sep 26, 2023, at 12:08 PM, David Capwell wrote: > > Thanks all for the feedback! The patch has 2 +1s on trunk and back ported to > 5.0, making sure it’s stable now; I plan to merge early this week. > >> On Sep 21, 2023, at 2:07 PM, Ekaterina Dimitrova >

Re: [DISCUSS] Backport CASSANDRA-18816 to 5.0? Add support for repair coordinator to retry messages that timeout

2023-10-25 Thread David Capwell
IR patch is up for review https://issues.apache.org/jira/browse/CASSANDRA-18962 > On Oct 24, 2023, at 3:15 PM, David Capwell wrote: > > I sat down to add IR messages to the mix… given how positive the feedback was > for other repair messages I assume people are still ok with this

Re: [DISCUSS] Harry in-tree

2023-11-27 Thread David Capwell
+1 to in-tree > On Nov 27, 2023, at 9:17 AM, Benjamin Lerer wrote: > > +1 > > Le lun. 27 nov. 2023 à 18:01, Brandon Williams > a écrit : >> I am +1 on including Harry in-tree. >> >> Kind Regards, >> Brandon >> >> On Fri, Nov 24, 2023 at 9:44 AM Alex Petrov >

Re: [DISCUSS] CASSANDRA-19113: Publishing dtest-shaded JARs on release

2023-11-28 Thread David Capwell
+1 from me > On Nov 28, 2023, at 12:55 PM, Doug Rohrer wrote: > > +1 (nb, but not a vote, so ¯\_(ツ)_/¯ ) - would be lovely to not have to deal > with this individually for each project in which we use the in-jvm dtest > framework. As Francisco noted, we’re using this in the sidecar and Analyti

Re: Welcome Mike Adamson as Cassandra committer

2023-12-08 Thread David Capwell
Congrats! > On Dec 8, 2023, at 11:00 AM, Lorina Poland wrote: > > Congratulations, Mike!

Re: [DISCUSS] CEP-39: Cost Based Optimizer

2023-12-12 Thread David Capwell
Overall LGTM. > On Dec 12, 2023, at 5:29 AM, Benjamin Lerer wrote: > > Hi everybody, > > I would like to open the discussion on the introduction of a cost based > optimizer to allow Cassandra to pick the best execution plan based on the > data distribution. Therefore, improving the overall q

Re: Moving Semver4j from test to main dependencies

2023-12-18 Thread David Capwell
+1 > On Dec 15, 2023, at 7:35 PM, Mick Semb Wever wrote: > > > >> I'd like to add Semver4j to the production dependencies. It is currently on >> the test classpath. The library is pretty lightweight, licensed with MIT and >> has no transitive dependencies. >> >> We need to represent the ker

Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

2023-12-18 Thread David Capwell
> A brief perusal shows jqwik as integrated with JUnit 5 taking a fairly > interesting annotation-based approach to property testing. Curious if you've > looked into or used that at all David (Capwell)? (link for the lazy: > https://jqwik.net/docs/current/user-guide.html#

Re: [DISCUSS] CEP-39: Cost Based Optimizer

2023-12-19 Thread David Capwell
imary indexes. >>>> In general, there are plenty of use cases that prefer determinism. So I >>>> agree that there should at least be a CBO implementation that makes the >>>> same decisions as the status quo, deterministically. >>>> >>>> I d
