https://issues.apache.org/jira/browse/CASSANDRA-18504
> On May 5, 2023, at 12:27 PM, David Capwell <dcapw...@apple.com> wrote:
>
> Yep, fair point… SPARSE VECTOR better maps to NON NULL MAP<int32, type>
>
>> On May 5, 2023, at 11:58 AM, David Capwell <dcapw...@apple.com> wrote:
>>
>>> If we ever add sparse vectors, we can assume that DENSE is the default and
>>> allow the use of either DENSE, SPARSE or nothing.
>>
>> I have been feeling that sparse is just a fixed size list with nulls… so
>> array<type, dimension>… if you insert {0: 42, 3: 17} then you get an array of
>> [42, null, null, 17]? One downside of doing this is that any operator/function
>> that needs to reify large vectors (let's say 10k elements) uses a ton of
>> memory due to us making it an array… so a new type could be used to lower
>> this cost…
>>
>> With DENSE VECTOR we have the syntax in place that we “could” add SPARSE
>> later… With VECTOR we will have complications adding a sparse vector after
>> the fact due to this implying DENSE…
>>
>> Updated ranking
>>
>> Syntax                         Score
>> VECTOR<type, dimension>        21
>> DENSE VECTOR<type, dimension>  12
>> type[dimension]                10
>> NON NULL <type>[dimension]     8
>> VECTOR type[n]                 5
>> DENSE_VECTOR<type, dimension>  4
>> NON-NULL FROZEN<type[n]>       3
>> ARRAY<type, n>                 1
>>
>> Syntax                         Round 1  Round 2
>> VECTOR<type, dimension>        4        4
>> DENSE VECTOR<type, dimension>  2        3
>> NON NULL <type>[dimension]     2        1
>> VECTOR type[n]                 1
>> type[dimension]                1
>> DENSE_VECTOR<type, dimension>  1
>> NON-NULL FROZEN<type[n]>       1
>> ARRAY<type, n>                 0
>>
>> VECTOR<type, dimension> is still in the lead…
>>
>>> On May 5, 2023, at 11:40 AM, Andrés de la Peña <adelap...@apache.org> wrote:
>>>
>>> My vote is:
>>>
>>> 1. VECTOR<type, dimension>
>>> 2. DENSE VECTOR<type, dimension>
>>> 3. type[dimension]
>>>
>>> If we ever add sparse vectors, we can assume that DENSE is the default and
>>> allow the use of either DENSE, SPARSE or nothing.
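The memory concern in the quoted messages above (a sparse value reified as a fixed-size array with nulls) can be illustrated with a minimal sketch. It assumes a sparse vector is written as a map from index to value, as in the {0: 42, 3: 17} example; `reify_dense` is a hypothetical helper for illustration, not a Cassandra API, and Python's None stands in for a CQL null.

```python
from typing import Dict, List, Optional


def reify_dense(sparse: Dict[int, float], dimension: int) -> List[Optional[float]]:
    """Expand an index->value map into a fixed-size list; unset slots become None."""
    for index in sparse:
        if not 0 <= index < dimension:
            raise ValueError(f"index {index} is outside dimension {dimension}")
    return [sparse.get(i) for i in range(dimension)]


# The example from the thread: {0: 42, 3: 17} with dimension 4.
small = reify_dense({0: 42, 3: 17}, 4)
print(small)  # [42, None, None, 17]

# With a large dimension the dense form holds one slot per position,
# while the map form holds only the entries that were actually set.
big = reify_dense({0: 42, 3: 17}, 10_000)
print(len(big), sum(value is not None for value in big))  # 10000 2
```

The map form keeps two entries regardless of the dimension, which is the intuition behind mapping SPARSE VECTOR to a NON NULL MAP<int32, type> rather than to an array.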
>>>
>>> Perhaps the dimension could be separated from the type, such as in
>>> VECTOR<type>[dimension] or VECTOR<type>(dimension).
>>>
>>> On Fri, 5 May 2023 at 19:05, David Capwell <dcapw...@apple.com> wrote:
>>>>>> ...where, just to be clear, VECTOR<type, dimension> means a frozen fixed
>>>>>> size array w/ no null values?
>>>>> Assuming this is the case
>>>>
>>>> The current agreed requirements are:
>>>>
>>>> 1) non-null elements
>>>> 2) fixed length
>>>> 3) frozen
>>>>
>>>> You pointed out 3 isn’t actually required, but that would be a different
>>>> conversation to remove =)… maybe defer this to JIRA as long as all parties
>>>> agree in the ticket?
>>>>
>>>> With all votes in, this is what I see
>>>>
>>>> Columns: JE = Jonathan Ellis, DC = David Capwell, JM = Josh McKenzie,
>>>> CR = Caleb Rackliffe, PM = Patrick McFadin, BW = Brandon Williams,
>>>> MA = Mike Adamson, B = Benedict, MSW = Mick Semb Wever,
>>>> DCB = Derek Chen-Becker; "-" = no vote
>>>>
>>>> Syntax                         JE  DC  JM  CR  PM  BW  MA  B   MSW DCB
>>>> VECTOR<type, dimension>        1   2   2   -   2   1   1   3   2   -
>>>> DENSE VECTOR<type, dimension>  2   1   -   -   1   -   2   -   -   -
>>>> type[dimension]                3   3   3   1   -   3   -   2   -   -
>>>> DENSE_VECTOR<type, dimension>  -   -   1   -   -   -   -   -   -   3
>>>> NON NULL <type>[dimension]     -   1   -   -   -   -   -   1   -   2
>>>> VECTOR type[n]                 -   -   -   -   -   2   -   -   1   -
>>>> ARRAY<type, n>                 -   -   -   -   3   -   -   -   -   -
>>>> NON-NULL FROZEN<type[n]>       -   -   -   -   -   -   -   -   -   1
>>>>
>>>> Rank  Weight
>>>> 1     3
>>>> 2     2
>>>> 3     1
>>>> ?     3
>>>>
>>>> Syntax                         Score
>>>> VECTOR<type, dimension>        18
>>>> DENSE VECTOR<type, dimension>  10
>>>> type[dimension]                9
>>>> NON NULL <type>[dimension]     8
>>>> VECTOR type[n]                 5
>>>> DENSE_VECTOR<type, dimension>  4
>>>> NON-NULL FROZEN<type[n]>       3
>>>> ARRAY<type, n>                 1
>>>>
>>>> Syntax                         Round 1  Round 2
>>>> VECTOR<type, dimension>        3        4
>>>> DENSE VECTOR<type, dimension>  2        2
>>>> NON NULL <type>[dimension]     2        1
>>>> VECTOR type[n]                 1
>>>> type[dimension]                1
>>>> DENSE_VECTOR<type, dimension>  1
>>>> NON-NULL FROZEN<type[n]>       1
>>>> ARRAY<type, n>                 0
>>>>
>>>> Under 2 different voting systems vector<type, dimension> is in the lead
>>>> and by a good amount… I have updated the patch locally to reflect this
>>>> change as well.
>>>>
>>>>> On May 5, 2023, at 10:41 AM, Mike Adamson <madam...@datastax.com> wrote:
>>>>>
>>>>>> ...where, just to be clear, VECTOR<type, dimension> means a frozen fixed
>>>>>> size array w/ no null values?
>>>>> Assuming this is the case, my vote is:
>>>>>
>>>>> 1. VECTOR<type, dimension>
>>>>> 2. DENSE VECTOR<type, dimension>
>>>>>
>>>>> I don't really have a 3rd vote because I think that type[dimension] is
>>>>> too ambiguous.
>>>>>
>>>>> On Fri, 5 May 2023 at 18:32, Derek Chen-Becker <de...@chen-becker.org> wrote:
>>>>>> LOL, I'm holding you to that at the summit :) In all seriousness, I'm
>>>>>> glad to see a robust debate around it. I guess for completeness, my
>>>>>> order of preference is
>>>>>>
>>>>>> 1 - NONNULL FROZEN<TYPE<N>>
>>>>>> 2 - NONNULL TYPE<N> (which part of this implies frozen? The NONNULL or
>>>>>> the cardinality?)
>>>>>> 3 - DENSE_VECTOR<type, N>
>>>>>>
>>>>>> I guess my main concern with just "VECTOR" is that it's such an
>>>>>> overloaded term.
>>>>>> Maybe in ML it means something specific, but for anyone
>>>>>> coming from C++, Rust, Java, etc, a Vector is both mutable and can carry
>>>>>> null (or equivalent, e.g. None, in Rust). If the argument hadn't also
>>>>>> been made that we should be working toward something that's not
>>>>>> ML-specific maybe I would be less concerned.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Derek
>>>>>>
>>>>>> On Fri, May 5, 2023 at 11:14 AM Patrick McFadin <pmcfa...@gmail.com> wrote:
>>>>>>> Derek, despite your preference, I would hang out with you at a party.
>>>>>>>
>>>>>>> On Fri, May 5, 2023 at 9:44 AM Derek Chen-Becker <de...@chen-becker.org> wrote:
>>>>>>>> Speaking as someone who likes Erlang, maybe that's why I also like
>>>>>>>> NONNULL FROZEN<TYPE<[n]>>. It's unambiguous what Cassandra is going to
>>>>>>>> do with that type. DENSE VECTOR means I need to go read docs (and then
>>>>>>>> probably double-check in the source) to be sure what exactly is
>>>>>>>> going on.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>> Derek
>>>>>>>>
>>>>>>>> On Fri, May 5, 2023 at 9:54 AM Patrick McFadin <pmcfa...@gmail.com> wrote:
>>>>>>>>> I hope we are willing to consider developers that use our system
>>>>>>>>> because if I had to teach people to use "NON-NULL FROZEN<TYPE[n]>"
>>>>>>>>> I'm pretty sure the response would be:
>>>>>>>>>
>>>>>>>>> Did you tell me to go write a distributed map-reduce job in Erlang? I
>>>>>>>>> believe I did, Bob.
>>>>>>>>>
>>>>>>>>> On Fri, May 5, 2023 at 8:05 AM Josh McKenzie <jmcken...@apache.org> wrote:
>>>>>>>>>> Idiomatically, to my mind, there's a question of "what space are we
>>>>>>>>>> thinking about this datatype in"?
>>>>>>>>>>
>>>>>>>>>> - In the context of mathematics, nullability in a vector would be 0
>>>>>>>>>> - In the context of Cassandra, nullability tends to mean a tombstone
>>>>>>>>>>   (or nothing)
>>>>>>>>>> - In the context of programming languages, it's all over the place
>>>>>>>>>>
>>>>>>>>>> Given many models are exploring quantizing to int8 and other data
>>>>>>>>>> types, there's definitely the "support other data types easily in
>>>>>>>>>> the future" piece that, to me, we need to keep in mind.
>>>>>>>>>>
>>>>>>>>>> So with the above and the "meet the user where they are and don't
>>>>>>>>>> make them understand more of Cassandra than absolutely critical to
>>>>>>>>>> use it", I lean:
>>>>>>>>>>
>>>>>>>>>> 1. DENSE_VECTOR<type, dimension>
>>>>>>>>>> 2. VECTOR<type, dimension>
>>>>>>>>>> 3. type[dimension]
>>>>>>>>>>
>>>>>>>>>> This leaves the path open for us to expand on it in the future with
>>>>>>>>>> sparse support and allows us to introduce some semantics that
>>>>>>>>>> indicate idioms around nullability for the users coming from a
>>>>>>>>>> different space.
>>>>>>>>>>
>>>>>>>>>> "NON-NULL FROZEN<TYPE[n]>" is strictly correct, however it requires
>>>>>>>>>> understanding idioms of how Cassandra thinks about data (nulls mean
>>>>>>>>>> different things to us, we have differences between frozen and
>>>>>>>>>> non-frozen due to constraints in our storage engine and
>>>>>>>>>> materialization of data, etc) that get in the way of users doing
>>>>>>>>>> things in the pattern they're familiar with without learning more
>>>>>>>>>> about the DB than they're probably looking to learn. Historically
>>>>>>>>>> this has been a challenge for us in adoption; the classic "Why can't
>>>>>>>>>> I just write and delete and write as much as I want? Why are deletes
>>>>>>>>>> filling up my disk?" problem comes to mind.
>>>>>>>>>>
>>>>>>>>>> I'd also be happy with us supporting:
>>>>>>>>>> * NON-NULL FROZEN<TYPE[n]>
>>>>>>>>>> * DENSE_VECTOR<type, dimension> as syntactic sugar for the above
>>>>>>>>>>
>>>>>>>>>> If getting into the "built-in syntactic sugar mapping for
>>>>>>>>>> communities and specific use-cases" is something we're willing to
>>>>>>>>>> consider.
>>>>>>>>>>
>>>>>>>>>> On Fri, May 5, 2023, at 7:26 AM, Patrick McFadin wrote:
>>>>>>>>>>> I think we are still discussing implementation here when I'm
>>>>>>>>>>> talking about developer experience. I want developers to adopt this
>>>>>>>>>>> quickly, easily and be successful. Vector search is already a
>>>>>>>>>>> thing. People use it every day. A successful outcome, in my view,
>>>>>>>>>>> is developers picking up this feature without reading a manual.
>>>>>>>>>>> (Because they don't anyway and get in trouble) I did some more
>>>>>>>>>>> extensive research about what other DBs are using for syntax. The
>>>>>>>>>>> consensus is some variety of 'VECTOR', 'DENSE' and 'SPARSE'
>>>>>>>>>>>
>>>>>>>>>>> Pinecone[1]: dense_vector, sparse_vector
>>>>>>>>>>> Elastic[2]: dense_vector
>>>>>>>>>>> Milvus[3]: float_vector, binary_vector
>>>>>>>>>>> pgvector[4]: vector
>>>>>>>>>>> Weaviate[5]: Different approach.
>>>>>>>>>>> All typed arrays can be indexed
>>>>>>>>>>>
>>>>>>>>>>> Based on that I'm advocating a similar syntax:
>>>>>>>>>>>
>>>>>>>>>>> - DENSE VECTOR
>>>>>>>>>>> or
>>>>>>>>>>> - VECTOR
>>>>>>>>>>>
>>>>>>>>>>> [1] https://docs.pinecone.io/docs/hybrid-search
>>>>>>>>>>> [2] https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html
>>>>>>>>>>> [3] https://milvus.io/docs/create_collection.md
>>>>>>>>>>> [4] https://github.com/pgvector/pgvector
>>>>>>>>>>> [5] https://weaviate.io/developers/weaviate/config-refs/datatypes
>>>>>>>>>>>
>>>>>>>>>>> On Fri, May 5, 2023 at 6:07 AM Mike Adamson <madam...@datastax.com> wrote:
>>>>>>>>>>>> Then we can have the indexing apparatus only accept
>>>>>>>>>>>> frozen<float[n]> for the HNSW case.
>>>>>>>>>>> I'm inclined to agree with Benedict that the index will need to be
>>>>>>>>>>> specifically selected by option rather than inferred based on type.
>>>>>>>>>>> As such there is no real reason for the frozen requirement on the
>>>>>>>>>>> type. The HNSW index can be built just as easily from a non-frozen
>>>>>>>>>>> array.
>>>>>>>>>>>
>>>>>>>>>>> I am in favour of enforcing non-null on the elements of an array by
>>>>>>>>>>> default. I would prefer that allowing nulls in the array be a
>>>>>>>>>>> later addition if and when a use case arises for it.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, 5 May 2023 at 03:02, Caleb Rackliffe <calebrackli...@gmail.com> wrote:
>>>>>>>>>>> Even in the ML case, sparse can just mean zeros rather than nulls,
>>>>>>>>>>> and they should compress similarly anyway.
>>>>>>>>>>>
>>>>>>>>>>> If we really want null values, I'd rather leave that in collections
>>>>>>>>>>> space.
>>>>>>>>>>>
>>>>>>>>>>> On Thu, May 4, 2023 at 8:59 PM Caleb Rackliffe <calebrackli...@gmail.com> wrote:
>>>>>>>>>>> I actually still prefer type[dimension], because I think I
>>>>>>>>>>> intuitively read this as a primitive (meaning no null elements)
>>>>>>>>>>> array. Then we can have the indexing apparatus only accept
>>>>>>>>>>> frozen<float[n]> for the HNSW case.
>>>>>>>>>>>
>>>>>>>>>>> If that isn't intuitive to anyone else, I don't really have a
>>>>>>>>>>> strong opinion...but...conflating "frozen" and "dense" seems like a
>>>>>>>>>>> bad idea. One should indicate single vs. multi-cell, and the other
>>>>>>>>>>> the presence or absence of nulls/zeros/whatever.
>>>>>>>>>>>
>>>>>>>>>>> On Thu, May 4, 2023 at 12:51 PM Patrick McFadin <pmcfa...@gmail.com> wrote:
>>>>>>>>>>> I agree with David's reasoning and the use of DENSE (and maybe
>>>>>>>>>>> eventually SPARSE). This is terminology well established in the
>>>>>>>>>>> data world, and it would lead to much easier adoption from users.
>>>>>>>>>>> VECTOR is close, but I can see having to create a lot of content
>>>>>>>>>>> around "How to use it and not get in trouble." (I have a lot of
>>>>>>>>>>> that content already)
>>>>>>>>>>>
>>>>>>>>>>> - We don't have to explain what it is.
>>>>>>>>>>>   A lot of prior art out there already [1][2][3]
>>>>>>>>>>> - We're matching an established term with what users would expect.
>>>>>>>>>>>   No surprises.
>>>>>>>>>>> - Shorter ramp-up time for users. Cassandra is being modernized.
>>>>>>>>>>>
>>>>>>>>>>> The implementation is flexible, but the interface should empower
>>>>>>>>>>> our users to be awesome.
>>>>>>>>>>>
>>>>>>>>>>> Patrick
>>>>>>>>>>>
>>>>>>>>>>> 1 - https://stats.stackexchange.com/questions/266996/what-do-the-terms-dense-and-sparse-mean-in-the-context-of-neural-networks
>>>>>>>>>>> 2 - https://induraj2020.medium.com/what-are-sparse-features-and-dense-features-8d1746a77035
>>>>>>>>>>> 3 - https://revware.net/sparse-vs-dense-data-the-power-of-points-and-clouds/
>>>>>>>>>>>
>>>>>>>>>>> On Thu, May 4, 2023 at 10:25 AM David Capwell <dcapw...@apple.com> wrote:
>>>>>>>>>>> My views have changed over time on syntax and I feel
>>>>>>>>>>> type[dimension] may not be the best, so it has gone lower in my own
>>>>>>>>>>> personal ranking… this is my current preference
>>>>>>>>>>>
>>>>>>>>>>> 1) DENSE <type>[dimension] | NON NULL <type>[dimension]
>>>>>>>>>>> 2) VECTOR<type, dimension>
>>>>>>>>>>> 3) type[dimension]
>>>>>>>>>>>
>>>>>>>>>>> My reasoning for this order
>>>>>>>>>>>
>>>>>>>>>>> * type[dimension] looks like syntax sugar for array<type,
>>>>>>>>>>> dimension>, so users may assume list/array semantics, but we limit
>>>>>>>>>>> to non-null elements in a frozen array
>>>>>>>>>>> * I feel VECTOR as a prefix is out of place, but VECTOR as a
>>>>>>>>>>> direct type makes more sense… this also leads to a possible future
>>>>>>>>>>> of VECTOR<type> which is the non-fixed length version of this type.
>>>>>>>>>>> What makes VECTOR different from list/array? It has non-null
>>>>>>>>>>> elements and is frozen. I don’t feel that VECTOR really tells users
>>>>>>>>>>> to expect non-null or frozen semantics, as there exist different
>>>>>>>>>>> VECTOR types for those reasons (sparse vs dense)…
>>>>>>>>>>> * DENSE may be confusing for people coming from languages where
>>>>>>>>>>> this just means “sequential layout”, which is what our frozen
>>>>>>>>>>> array/list already are… but since the target user is coming from an
>>>>>>>>>>> ML background, this shouldn’t cause much confusion. DENSE just
>>>>>>>>>>> means FROZEN in Cassandra, with NON NULL elements (SPARSE allows
>>>>>>>>>>> for NULL and isn’t frozen)… So DENSE just acts as syntax sugar for
>>>>>>>>>>> frozen<non null type[dimension]>
>>>>>>>>>>>
>>>>>>>>>>>> On May 4, 2023, at 4:13 AM, Brandon Williams <dri...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> 1. VECTOR<FLOAT,n>
>>>>>>>>>>>> 2. VECTOR FLOAT[n]
>>>>>>>>>>>> 3. FLOAT[N] (Non null by default)
>>>>>>>>>>>>
>>>>>>>>>>>> Redundant or not, I think having the VECTOR keyword helps signify
>>>>>>>>>>>> what the app is generally about and helps get buy-in from ML
>>>>>>>>>>>> stakeholders.
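The semantics discussed in the messages above (a fixed number of elements, none of them null, stored as one frozen value) can be sketched as a simple check. This is an illustration only: `validate_dense_vector` is a hypothetical helper, not how Cassandra validates values, and the frozen aspect is only approximated here by returning an immutable tuple.

```python
from typing import Optional, Sequence, Tuple


def validate_dense_vector(values: Sequence[Optional[float]], dimension: int) -> Tuple[float, ...]:
    """Check fixed length and non-null elements, then return an immutable copy."""
    if len(values) != dimension:
        raise ValueError(f"expected {dimension} elements, got {len(values)}")
    for position, value in enumerate(values):
        if value is None:
            raise ValueError(f"element {position} is null")
    return tuple(values)


print(validate_dense_vector([0.1, 0.2, 0.3], 3))  # (0.1, 0.2, 0.3)

for candidate in ([0.1, None, 0.3], [0.1, 0.2]):
    try:
        validate_dense_vector(candidate, 3)
    except ValueError as error:
        print("rejected:", error)
```

Under these assumptions the same check applies whichever spelling wins the vote; the syntax only changes how the element type and the dimension are written.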
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, May 4, 2023 at 3:45 AM Benedict <bened...@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hurrah for initial agreement.
>>>>>>>>>>>>>
>>>>>>>>>>>>> For syntax, I think one option was just FLOAT[N]. In VECTOR
>>>>>>>>>>>>> FLOAT[N], VECTOR is redundant - FLOAT[N] is fully descriptive by
>>>>>>>>>>>>> itself. I don’t think VECTOR should be used to simply imply
>>>>>>>>>>>>> non-null, as this would be very unintuitive. More logical would
>>>>>>>>>>>>> be NONNULL, if this is the only condition being applied.
>>>>>>>>>>>>> Alternatively for arrays we could default to NONNULL and later
>>>>>>>>>>>>> introduce NULLABLE if we want to permit nulls.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If the word vector is to be used it makes more sense to make it
>>>>>>>>>>>>> look like a list, so VECTOR<FLOAT, N> as here the word VECTOR is
>>>>>>>>>>>>> clearly not redundant.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So, I vote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1) (NON NULL) FLOAT[N]
>>>>>>>>>>>>> 2) FLOAT[N] (Non null by default)
>>>>>>>>>>>>> 3) VECTOR<FLOAT, N>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 4 May 2023, at 08:52, Mick Semb Wever <m...@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Did we agree on a CQL syntax?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I don’t believe there has been a poll on CQL syntax… my
>>>>>>>>>>>>>> understanding reading all the threads is that there are ~4-5
>>>>>>>>>>>>>> options and none are -1ed, so I believe we are waiting for
>>>>>>>>>>>>>> majority rule on this?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Re-reading that thread, IIUC the valid choices remaining are…
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1. VECTOR FLOAT[n]
>>>>>>>>>>>>> 2. FLOAT VECTOR[n]
>>>>>>>>>>>>> 3. VECTOR<FLOAT,n>
>>>>>>>>>>>>> 4. VECTOR[n]<FLOAT>
>>>>>>>>>>>>> 5. ARRAY<FLOAT, n>
>>>>>>>>>>>>> 6. NON-NULL FROZEN<FLOAT[n]>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yes I'm putting my preference (1) first ;) because (banging on)
>>>>>>>>>>>>> if the future of CQL will have FLOAT[n] and FROZEN<FLOAT[n]>,
>>>>>>>>>>>>> where the VECTOR keyword is: for general cql users; just meaning
>>>>>>>>>>>>> "non-null and frozen", these gel best together.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Options (5) and (6) are for those that feel we can and should
>>>>>>>>>>>>> provide this type without introducing the vector keyword.
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Mike Adamson
>>>>>>>>>>> Engineering
>>>>>>>>>>> +1 650 389 6000 | datastax.com
>>>>>>>>
>>>>>>>> --
>>>>>>>> +---------------------------------------------------------------+
>>>>>>>> | Derek Chen-Becker                                             |
>>>>>>>> | GPG Key available at https://keybase.io/dchenbecker and       |
>>>>>>>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>>>>>>>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7 7F42 AFC5 AFEE 96E4 6ACC   |
>>>>>>>> +---------------------------------------------------------------+
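The score tables in this thread can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming the weights from the Rank/Weight table (rank 1 = 3, rank 2 = 2, rank 3 = 1) and the ranks each syntax received as transcribed from the vote matrix. The "?" row is not used, and the names `WEIGHTS`, `votes`, `weighted_scores` and `first_choice_counts` are illustrative only. The first-choice counts happen to match the "Round 1" column; how "Round 2" was derived is not stated in the thread, so it is not modelled here.

```python
# Rank -> weight, as given in the thread's Rank/Weight table ("?" row not used).
WEIGHTS = {1: 3, 2: 2, 3: 1}

# Ranks each syntax received, transcribed from the vote matrix (voter names omitted).
votes = {
    "VECTOR<type, dimension>": [1, 2, 2, 2, 1, 1, 3, 2],
    "DENSE VECTOR<type, dimension>": [2, 1, 1, 2],
    "type[dimension]": [3, 3, 3, 1, 3, 2],
    "NON NULL <type>[dimension]": [1, 1, 2],
    "VECTOR type[n]": [2, 1],
    "DENSE_VECTOR<type, dimension>": [1, 3],
    "NON-NULL FROZEN<type[n]>": [1],
    "ARRAY<type, n>": [3],
}


def weighted_scores(votes_by_syntax):
    """Sum the weight of every rank a syntax received."""
    return {s: sum(WEIGHTS[r] for r in ranks) for s, ranks in votes_by_syntax.items()}


def first_choice_counts(votes_by_syntax):
    """Count how many voters ranked each syntax first."""
    return {s: ranks.count(1) for s, ranks in votes_by_syntax.items()}


scores = weighted_scores(votes)
print(scores["VECTOR<type, dimension>"])  # 18

# Add the later vote: VECTOR 1, DENSE VECTOR 2, type[dimension] 3.
updated = {s: list(ranks) for s, ranks in votes.items()}
updated["VECTOR<type, dimension>"].append(1)
updated["DENSE VECTOR<type, dimension>"].append(2)
updated["type[dimension]"].append(3)
print(weighted_scores(updated)["VECTOR<type, dimension>"])  # 21
```

Under both tallies VECTOR<type, dimension> stays ahead, which agrees with the "Score" and "Updated ranking" tables.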