The vote has passed.
From: Tommy Stendahl via dev
Sent: Thursday, May 4, 2023 10:27
To: dev@cassandra.apache.org
Subject: Re: [VOTE] Release Apache Cassandra 3.11.15
+1
> On 4 May 2023, at 17:46, Doug Rohrer wrote:
>
> Hello all,
>
> I’d like to put CEP-28 to a vote.
>
> Proposal:
>
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-28%3A+Reading+and+Writing+Cassandra+Data+with+Spark+Bulk+Analytics
>
> Jira:
> https://issues.apache.org/jira/brow
+1
> On 4 May 2023, at 17:46, Doug Rohrer wrote:
>
> Hello all,
>
> I’d like to put CEP-28 to a vote.
>
> Proposal:
>
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-28%3A+Reading+and+Writing+Cassandra+Data+with+Spark+Bulk+Analytics
>
> Jira:
> https://issues.apache.org/jira/brow
The Cassandra team is pleased to announce the release of Apache Cassandra
version 3.11.15.
Apache Cassandra is a fully distributed database. It is the right choice when
you need scalability and high availability without compromising performance.
http://cassandra.apache.org/
Downloads of sourc
The test build of Cassandra 3.0.29 is available.
sha1: 087cffce636b63c12e328994d52bdf8f4ccc9750
Git: https://github.com/apache/cassandra/tree/3.0.29-tentative
Maven Artifacts:
https://repository.apache.org/content/repositories/orgapachecassandra-1288/org/apache/cassandra/cassandra-all/3.0.29/
Th
>
> Then we can have the indexing apparatus only accept *frozen* for
> the HNSW case.
>
I'm inclined to agree with Benedict that the index will need to be
specifically selected by option rather than inferred based on type. As such
there is no real reason for the *frozen* requirement on the type. The
+1
We at zeotap did something similar with Scylla DB and Janusgraph for Spark
graph OLAP use cases. This is truly transformative, C* HTAP
On Fri, May 5, 2023 at 4:15 PM Sam Tunnicliffe wrote:
> +1
>
> On 4 May 2023, at 17:46, Doug Rohrer wrote:
>
> Hello all,
>
> I’d like to put CEP-28 to a vo
I think we are still discussing implementation here when I'm talking about
developer experience. I want developers to adopt this quickly, easily and
be successful. Vector search is already a thing. People use it every day. A
successful outcome, in my view, is developers picking up this feature
with
Idiomatically, to my mind, there's a question of "what space are we thinking
about this datatype in"?
- In the context of mathematics, nullability in a vector would be 0
- In the context of Cassandra, nullability tends to mean a tombstone (or
nothing)
- In the context of programming languages, i
I hope we are willing to consider developers that use our system because if
I had to teach people to use "NON-NULL FROZEN" I'm pretty sure the
response would be:
Did you tell me to go write a distributed map-reduce job in Erlang? I
believe I did, Bob.
On Fri, May 5, 2023 at 8:05 AM Josh McKenzie
> The hnsw index can be built just as easily from a non-frozen array.
I have 0 issues removing that limitation =)
> I am in favour of enforcing non-null on the elements of an array by default.
This is why I feel DENSE or NON NULL are the best prefix, as those both imply
elements may not be null
+10 for not inflicting unwieldy keywords on ML users.
Re Josh's summary, mostly agreed, my only objection to adding the DENSE
keyword is that I don't see a foreseeable future where we also support
sparse vectors, so it would end up being unnecessary extra verbosity. So
my preference would be
1.
...where, just to be clear, VECTOR means a frozen fixed
size array w/ no null values?
On Fri, May 5, 2023 at 11:23 AM Jonathan Ellis wrote:
> +10 for not inflicting unwieldy keywords on ML users.
>
> Re Josh's summary, mostly agreed, my only objection to adding the DENSE
> keyword is that I don'
Went through and created a spreadsheet of current votes… For Patrick and Mike,
I don’t see a clear vote, so I put a ? where I “think” your preference is… for
Mick, I only put one vote as the list looked like a summary, but you mentioned
the first was your preference
Syntax
Jonathan Ellis
David
Speaking as someone who likes Erlang, maybe that's why I also like NONNULL
FROZEN>. It's unambiguous what Cassandra is going to do with that
type. DENSE VECTOR means I need to go read docs (and then probably
double-check in the source) to be sure what exactly is going on.
Cheers,
Derek
On Fri, 5 May 2023 at 18:43, David Capwell wrote:
> Went through and created a spreadsheet of current votes… For Patrick and
> Mike, I don’t see a clear vote, so I put a ? where I “think” your
> preference is… for Mick, I only put one vote as the list looked like a
> summary, but you mentioned th
Updated
Syntax: Jonathan Ellis, David Capwell, Josh McKenzie, Caleb Rackliffe, Patrick McFadin, Brandon Williams, Mike Adamson, Benedict, Mick Semb Wever, Derek Chen-Becker
VECTOR: 1, 2, 2, 1, ?, 3, 2
DENSE VECTOR: 2, 1, ?, ?
type[dimension]: 3, 3, 3, 1, 3, 2
DENSE_VECTOR: 1
NON NULL [dimension]: 1
Sorry, DENSE_VECTOR was pointing to the wrong row, updated score
Syntax                  Score
VECTOR                  16
DENSE VECTOR            11
type[dimension]         9
NON NULL [dimension]    6
VECTOR type[n]          5
DENSE_VECTOR            3
NON-NULL FROZEN         3
ARRAY                   0
> On May 5, 2023, at 10:01 AM, David Capwell wrote:
>
> Updated
>
> Syntax
> Jonathan Ellis
My vote is:
1. DENSE VECTOR
2. VECTOR
3. ARRAY
On Fri, May 5, 2023 at 9:43 AM David Capwell wrote:
> Went through and created a spreadsheet of current votes… For Patrick and
> Mike, I don’t see a clear vote, so I put a ? where I “think” your
> preference is… for Mick, I only put one vote as the
Derek, despite your preference, I would hang out with you at a party.
On Fri, May 5, 2023 at 9:44 AM Derek Chen-Becker
wrote:
> Speaking as someone who likes Erlang, maybe that's why I also like NONNULL
> FROZEN>. It's unambiguous what Cassandra is going to do with that
> type. DENSE VECTOR mean
LOL, I'm holding you to that at the summit :) In all seriousness, I'm glad
to see a robust debate around it. I guess for completeness, my order of
preference is
1 - NONNULL FROZEN>
2 - NONNULL TYPE (which part of this implies frozen? The NONNULL or the
cardinality?)
3 - DENSE_VECTOR
I guess my ma
>
> ...where, just to be clear, VECTOR means a frozen fixed
> size array w/ no null values?
>
Assuming this is the case, my vote is:
1. VECTOR
2. DENSE VECTOR
I don't really have a 3rd vote because I think that *type[dimension]* is
too ambiguous.
On Fri, 5 May 2023 at 18:32, Derek Chen-Becker
>> ...where, just to be clear, VECTOR means a frozen fixed
>> size array w/ no null values?
> Assuming this is the case
The current agreed requirements are:
1) non-null elements
2) fixed length
3) frozen
You pointed out 3 isn’t actually required, but that would be a different
conversation to
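As a rough illustration of the three requirements listed above, here is a minimal client-side sketch in Python. The function name `make_vector` and its parameters are hypothetical and not part of Cassandra or any driver; a tuple is used only as a loose stand-in for "frozen", since it cannot be modified in place.

```python
from typing import Optional, Sequence, Tuple


def make_vector(values: Sequence[Optional[float]], dimension: int) -> Tuple[float, ...]:
    """Hypothetical helper: check a value against the three stated requirements."""
    # Requirement 2: fixed length
    if len(values) != dimension:
        raise ValueError(f"expected {dimension} elements, got {len(values)}")
    # Requirement 1: non-null elements
    if any(v is None for v in values):
        raise ValueError("vector elements must not be null")
    # Requirement 3 (assumed analogy): an immutable tuple stands in for "frozen"
    return tuple(float(v) for v in values)
```

Under these assumptions, `make_vector([1, 2, 3], 3)` is accepted, while a list containing `None` or a list of the wrong length raises `ValueError`.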
My vote is:
1. VECTOR
2. DENSE VECTOR
3. type[dimension]
If we ever add sparse vectors, we can assume that DENSE is the default and
allow the use of either DENSE, SPARSE or nothing.
Perhaps the dimension could be separated from the type, such as in
VECTOR[dimension] or VECTOR(dimension).
On Fri, 5
> If we ever add sparse vectors, we can assume that DENSE is the default and
> allow the use of either DENSE, SPARSE or nothing.
I have been feeling that sparse is just a fixed size list with nulls… so
array… if you insert {0: 42, 3: 17} then you get an array of
[42, null, null, 17]? One negative d
Sparse vector in ML has the semantics that elements not explicitly set are
zero. I believe most (all?) sparse vector implementations use a map under
the hood; the point is to save a lot of space when you have 10K zeros and
100 that are nonzero.
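To make the ML semantics described above concrete, here is a small Python sketch of a sparse vector stored as a map, with unset elements read back as zero rather than null. The helper names `to_dense` and `to_sparse` are illustrative assumptions, not any library's API.

```python
from typing import Dict, List


def to_dense(sparse: Dict[int, float], dimension: int) -> List[float]:
    """Expand an index -> value map; positions not set explicitly are zero."""
    dense = [0.0] * dimension
    for index, value in sparse.items():
        if not 0 <= index < dimension:
            raise IndexError(f"index {index} is outside dimension {dimension}")
        dense[index] = value
    return dense


def to_sparse(dense: List[float]) -> Dict[int, float]:
    """Keep only the nonzero entries, which is where the space saving comes from."""
    return {i: v for i, v in enumerate(dense) if v != 0}


print(to_dense({0: 42.0, 3: 17.0}, 4))  # [42.0, 0.0, 0.0, 17.0]
```

With the {0: 42, 3: 17} input mentioned earlier in the thread, the gaps come back as 0.0 instead of null, and a 10K-element vector with 100 nonzero values is stored as a 100-entry map.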
On Fri, May 5, 2023 at 2:00 PM David Capwell wrote:
Yep, fair point…. SPARSE VECTOR better maps to NON NULL MAP
> On May 5, 2023, at 11:58 AM, David Capwell wrote:
>
>> If we ever add sparse vectors, we can assume that DENSE is the default and
>> allow the use of either DENSE, SPARSE or nothing.
>
> I have been feeling that sparse is just a fixed s
https://issues.apache.org/jira/browse/CASSANDRA-18504
> On May 5, 2023, at 12:27 PM, David Capwell wrote:
>
> Yep, fair point…. SPARSE VECTOR better maps to NON NULL MAP
>
>> On May 5, 2023, at 11:58 AM, David Capwell wrote:
>>
>>> If we ever add sparse vectors, we can assume that DENSE is t
Love it. Thank you folks for coming to a decision on this. This is very
helpful for moving forward on planning for the current Python frameworks:
- Langchain.CassandraVectorStore
- Langchain.CassandraVectorRetriever
- Langchain.CassandraVectorStoreAgent
- LlamaIndex.CassandraVectorLoad