By my superficial reading I get the impression that the main distinction is
that vectors don't need to support random access into a single
element/float. I haven't looked at what Jonathan is doing, but I assume,
and it seems Jonathan assumes or knows that this makes implementation both
easier and a
>
> So is the goal here to provide something specific and idiomatic for the ML
> community or is the goal to make a primitive that's C*-centric that then
> another layer can write to? I personally argue for the former; I don't see
> this specific data type going away any time soon.
+1 on this con
I and others have claimed that an array concept will work, since it is isomorphic with a vector. I have seen the following counterclaims:
1. Vectors don’t need to support index lookups
2. Vectors don’t need to support ordered indexes
3. Vectors don’t need to support other types besides float
None of th
Benedict, I don't quite see why that matters? The argument is merely that
this kind of vector, for this use case, a) is different from arrays, and b)
arrays apparently don't serve the use case well enough (or at all).
Now, if from the above it follows a discussion that a vector type cannot be
a fi
pgvector is a plug-in. If you were proposing a plug-in you could ignore these considerations.
On 28 Apr 2023, at 16:58, Jonathan Ellis wrote:
> I'm proposing a vector data type for ML use cases. It's not the same thing as an array or a list and it's not supposed to be.
> While it's true that it would b
I'm proposing a vector data type for ML use cases. It's not the same thing
as an array or a list and it's not supposed to be.
While it's true that it would be possible to build a vector type on top of
an array type, it's not necessary to do it that way, and given the lack of
interest in an array
But you’re proposing introducing a general purpose type - this isn’t an ML plug-in, it’s modifying the core language in a manner that makes targeting your workload easier. Which is fine, but that means you have to consider its impact on the general language, not just your target use case.
On 28 Apr
That's exactly right.
In particular it makes no sense at all from an ML perspective to have
vector types of anything other than numerics. And as I mentioned in the
POC thread (but I did not mention here), float is overwhelmingly the most
frequently used vector type, to the point that Pinecone (by
The test build of Cassandra 3.11.15 is available.
sha1: 6cdcf5e56a77cf40c251125d68856a614eccbc53
Git:
https://gitbox.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/3.11.15-tentative
Maven Artifacts:
https://repository.apache.org/content/repositories/orgapachecassandra-1287/org/apach
> It's easy for an inverted index to find matches efficiently, but not so easy
> for it to find non-matches.
Yes, I agree, it is not easy for an *index* to do that.
But I think at least in SAI we could do that by using the index to
find the matches, and, because they are always returned in the ro
This feature may be targeting ML users, but it isn’t part of some “ML plug-in”; it’s a general purpose type available to all users that happens to permit the use of ANN. So it needs to make sense in a general context, not just to ML users. I also doubt users will struggle with understanding an array o