goankur commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2461155781
Quick Update
1. Changes to use native dot-product for both `int7` scalar-quantized and
raw vectors have been integrated in `core/src/java21`
2. `Lucene99ScalarQuantizedVect
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1828574298
##
lucene/core/src/java21/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java:
##
@@ -291,25 +296,125 @@ private float squareDistanceBody(float[] a,
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1828579554
##
lucene/core/src/java21/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java:
##
@@ -291,25 +296,125 @@ private float squareDistanceBody(float[] a,
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1828577276
##
lucene/core/src/java21/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java:
##
@@ -291,25 +296,125 @@ private float squareDistanceBody(float[] a,
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1828576127
##
lucene/core/src/java21/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java:
##
@@ -291,25 +296,125 @@ private float squareDistanceBody(float[] a,
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1828364953
##
lucene/core/src/java21/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java:
##
@@ -291,25 +296,125 @@ private float squareDistanceBody(float[] a,
msokolov commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1825833969
##
lucene/core/src/java21/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java:
##
@@ -291,25 +296,125 @@ private float squareDistanceBody(float[] a,
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1819967252
##
lucene/native/src/c/dotProduct.c:
##
@@ -0,0 +1,209 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1819887022
##
lucene/core/src/java21/org/apache/lucene/internal/vectorization/Lucene99MemorySegmentScalarQuantizedVectorScorer.java:
##
@@ -0,0 +1,407 @@
+/*
+ * Licensed to the
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1819699896
##
lucene/core/src/java21/org/apache/lucene/internal/vectorization/Lucene99MemorySegmentByteVectorScorer.java:
##
@@ -103,6 +125,27 @@ public float score(int node) thr
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1819696009
##
lucene/core/src/java21/org/apache/lucene/internal/vectorization/Lucene99MemorySegmentScalarQuantizedVectorScorer.java:
##
@@ -0,0 +1,407 @@
+/*
Review Comment:
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1819694503
##
lucene/core/src/java21/org/apache/lucene/internal/vectorization/Lucene99MemorySegmentByteVectorScorer.java:
##
@@ -34,6 +37,8 @@ abstract sealed class Lucene99Memor
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1817385236
##
lucene/benchmark-jmh/src/java/org/apache/lucene/benchmark/jmh/VectorUtilBenchmark.java:
##
@@ -84,6 +91,76 @@ public void init() {
floatsA[i] = random.nextFl
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1817415010
##
lucene/benchmark-jmh/src/java/org/apache/lucene/benchmark/jmh/VectorUtilBenchmark.java:
##
@@ -84,6 +91,76 @@ public void init() {
floatsA[i] = random.nextFl
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1817245059
##
lucene/core/src/java/org/apache/lucene/codecs/lucene99/OffHeapQuantizedByteVectorValues.java:
##
@@ -146,6 +146,7 @@ public float getScoreCorrectionConstant(int tar
yugushihuang commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2435780436
We have measured performance using
[knnPerfTest.py](https://github.com/mikemccand/luceneutil/blob/main/src/python/knnPerfTest.py)
in luceneutil with this PR
[commit](https://githu
mikemccand commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1814796588
##
lucene/native/src/c/dotProduct.c:
##
@@ -0,0 +1,209 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreeme
mikemccand commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2435099386
@goankur -- thank you for pulling out the actual native code into a new
`native` Lucene module. I'm not sure we need a new module -- could we use
`misc` or `sandbox` maybe?
I
mikemccand commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1814797680
##
lucene/core/src/java21/org/apache/lucene/internal/vectorization/Lucene99MemorySegmentScalarQuantizedVectorScorer.java:
##
@@ -0,0 +1,407 @@
+/*
+ * Licensed to t
msokolov commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1813272520
##
lucene/benchmark-jmh/src/java/org/apache/lucene/benchmark/jmh/VectorUtilBenchmark.java:
##
@@ -84,6 +91,76 @@ public void init() {
floatsA[i] = random.nextF
uschindler commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2350900783
> > Anyways: At the moment we do not want to have native code in Lucene Core.
>
> +1, we don't put native code in Lucene's `core`.
>
> But @uschindler is there maybe a way for
uschindler commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1759686028
##
gradle/java/javac.gradle:
##
@@ -24,7 +24,11 @@ allprojects { project ->
// Use 'release' flag instead of 'source' and 'target'
tasks.withType(JavaCo
ChrisHegarty commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2349442717
> > Anyways: At the moment we do not want to have native code in Lucene Core.
> ..
> Having the likes of OpenSearch, Elasticsearch, and Solr implement their
own (high performance)
mikemccand commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2343637892
> Anyways: At the moment we do not want to have native code in Lucene Core.
+1, we don't put native code in Lucene's `core`.
But @uschindler is there maybe a way forward not u
uschindler commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2333805272
Hi, the whole setup of the calls to native code is not correct. In Lucene we
don't use or need "--enable-preview", because we have a special way to compile
the code ("you added it mar
mikemccand commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2332099352
I'm trying to understand the status of this PR... so far it's a standalone
JMH benchy that shows that using [FFM](https://openjdk.org/jeps/454) to invoke
our own native C implementati
github-actions[bot] commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2323085616
This PR has not had activity in the past 2 weeks, labeling it as stale. If
the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you
for your contributi
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1720493763
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_
goankur commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2294502514
> also, would be good to compare apples-to-apples here. currently from what
i see, benchmark compares `dot8s(MemorySegment..)` vs
`BinaryDotProduct(byte[])`. To me this mixes up concerns
rmuir commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2289745259
sorry @goankur i'm super-behind, vacations and end of summer. i definitely
want to try out the VNNI path but just haven't found the cycles yet.
--
This is an automated message from the A
goankur commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2266345858
@rmuir - I am going to be out for the next week so please feel free to play
with it and further refine the code.
- Once we get some performance numbers from adding `x86 intrinsi
goankur commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2266340565
> java --enable-native-access=ALL-UNNAMED \
>--enable-preview \
>-Djava.library.path="./lucene/core/build/libs/dotProduct/shared" \
>-jar
lucene/benchmark
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1701126777
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_
rmuir commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2265154048
That is the fault of your graviton setup though, you can see it right in
your previous `platforms` logic where it is invoking `cCompiler.executable
'gcc10-cc'`.
Fix GCC to be in pat
goankur commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2264420154
> But I think it makes the build more straightforward: it builds native as
you expect, if you want to use different compiler set `CC` env vars etc
differently.
I still get compila
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1701104686
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1701088228
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1701099939
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1701071456
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1701070880
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1701070122
##
lucene/core/src/c/dotProduct.h:
##
@@ -0,0 +1,4 @@
+
+int32_t vdot8s_sve(int8_t* vec1[], int8_t* vec2, int32_t limit);
+int32_t vdot8s_neon(int8_t* vec1[], int8_t*
goankur commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2262067254
Hey Robert — Apologies for the delay and thanks for iterating with me on
this one. I will incorporate your feedback and update this PR by Aug 1st, 2024.
rmuir commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2261898654
also, would be good to compare apples-to-apples here. currently from what i
see, benchmark compares `dot8s(MemorySegment..)` vs `BinaryDotProduct(byte[])`.
To me this mixes up concerns abo
rmuir commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2261892635
TODO: need to examine the avx256 difference between auto-vectorized C and the
Java Vector API for the integers here. This isn't nearly as bad as the ARM case
(where we understand many of the underl
rmuir commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2261866107
Attached is a patch to get x86 support working.
It makes some changes to the build: specifically the java code statically
picks the best MethodHandle (SVE, Neon, Generic), and its ab
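The patch described above does the variant selection on the Java side. As a rough analogue only, the same "pick the best of SVE, Neon, Generic" idea can be sketched at compile time in C using the ACLE feature macros `__ARM_FEATURE_SVE` and `__ARM_NEON`. All function names here (`dot8s_generic`, `dot8s`, `dot8s_variant`) are hypothetical and not taken from the PR, and every branch falls back to the scalar loop so the sketch compiles on any platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable fallback: plain scalar loop over signed 8-bit values. */
static int32_t dot8s_generic(const int8_t *a, const int8_t *b, size_t len) {
    assert(a != NULL && b != NULL);
    int32_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += (int32_t)a[i] * (int32_t)b[i];
    }
    return sum;
}

/* Record which variant the compiler's target features would allow.
 * __ARM_FEATURE_SVE and __ARM_NEON are ACLE predefined macros. */
#if defined(__ARM_FEATURE_SVE)
#define DOT8S_VARIANT "sve"
#elif defined(__ARM_NEON)
#define DOT8S_VARIANT "neon"
#else
#define DOT8S_VARIANT "generic"
#endif

typedef int32_t (*dot8s_fn)(const int8_t *, const int8_t *, size_t);

/* Assumption: a real build would point this at an SVE or Neon
 * implementation in the matching branch above. This sketch always
 * uses the generic loop so that it stays portable. */
static const dot8s_fn DOT8S_IMPL = dot8s_generic;

/* Name of the variant selected at compile time. */
const char *dot8s_variant(void) { return DOT8S_VARIANT; }

/* Single entry point that callers (for example an FFM downcall) bind to. */
int32_t dot8s(const int8_t *a, const int8_t *b, size_t len) {
    return DOT8S_IMPL(a, b, len);
}
```

Exposing one entry point keeps the caller's binding independent of which variant was built; the trade-off is that the choice is fixed at compile time rather than detected on the running CPU.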
rmuir commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1699270069
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_SV
rmuir commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1699262883
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_SV
rmuir commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2261746928
go @goankur, awesome progress here. It is clear we gotta do something :) I
left comments just to try to help. Do you mind avoiding rebase for updates? I
am going to take a stab at the x86
rmuir commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1699259519
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_SV
rmuir commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1697474113
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_SV
rmuir commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1697469288
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_SV
rmuir commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1697462600
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_SV
rmuir commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1697455240
##
lucene/core/src/c/dotProduct.h:
##
@@ -0,0 +1,4 @@
+
+int32_t vdot8s_sve(int8_t* vec1[], int8_t* vec2, int32_t limit);
+int32_t vdot8s_neon(int8_t* vec1[], int8_t* ve
rmuir commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1697452900
##
lucene/core/src/c/dotProduct.c:
##
@@ -0,0 +1,143 @@
+// dotProduct.c
+
+#include
+#include
+
+#ifdef __ARM_ACLE
+#include
+#endif
+
+#if (defined(__ARM_FEATURE_SV
goankur commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2240878772
> I definitely want to play around more with @goankur 's PR here and see
what performance looks like across machines, but will be out of town for a bit.
>
> There is a script to ru
rmuir commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1685214866
##
lucene/core/build.gradle:
##
@@ -14,12 +14,59 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
+plug
goankur commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2240860811
> > > With the updated compile flags, the performance of auto-vectorized
code is slightly better than explicitly vectorized code (see results).
Interesting thing to note is that both C-b
rmuir commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1685207476
##
lucene/core/build.gradle:
##
@@ -14,12 +14,59 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
+plug
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1685196099
##
lucene/core/build.gradle:
##
@@ -14,12 +14,59 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
+pl
msokolov commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2239023786
> An alternative option to putting this in core, is to put it in say misc,
allowing users creating KnnVectorsFormat to hook into it through the
Lucene99FlatVectorsFormat and FlatVectors
ChrisHegarty commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2238753829
Just to expand a little on a previous comment I made above.
> Could Lucene ever have this directly in one of its modules?
An alternative option to putting this in `core
rmuir commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2236829406
I definitely want to play around more with @goankur 's PR here and see what
performance looks like across machines, but will be out of town for a bit.
There is a script to run the be
rmuir commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2236809947
And I see, from playing around with compiler versions, the advantage of the
intrinsics approach, although I worry how many variants we'd maintain. It would
give stability across releasing lucen
rmuir commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2236739249
Here is my proposal visually: https://godbolt.org/z/a9T8YrroY
As you can see, by passing `-march=cascadelake` it generates VNNI
instructions.
IMO, no need for any intrinsics anyw
rmuir commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2236647191
> I avoided it at the time given the toolchain that we were using, but it's
a good option which I'll reevaluate.
It should work well with any modern gcc (@goankur uses gcc 10 here).
ChrisHegarty commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2236540457
Could Lucene ever have this directly in one of its modules? We currently
use the `FlatVectorsScorer` to plugin the "native code optimized" alternative,
when scoring Scalar Quantize
ChrisHegarty commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2236530280
> > With the updated compile flags, the performance of auto-vectorized code
is slightly better than explicitly vectorized code (see results). Interesting
thing to note is that both
rmuir commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1682041365
##
lucene/core/build.gradle:
##
@@ -14,12 +14,59 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
+plug
rmuir commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2235209531
> With the updated compile flags, the performance of auto-vectorized code is
slightly better than explicitly vectorized code (see results). Interesting
thing to note is that both C-based i
goankur commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2235177271
> Do we even need to use intrinsics? The function is so simple that the compiler
seems to do the right thing, e.g. use the `SDOT` dot product instruction, given
the correct flags:
>
>
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1682009005
##
lucene/core/build.gradle:
##
@@ -14,10 +14,43 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
+pl
rmuir commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1680841649
##
lucene/core/build.gradle:
##
@@ -14,10 +14,43 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
+plug
goankur commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1680353139
##
lucene/core/build.gradle:
##
@@ -14,10 +14,43 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
+pl
rmuir commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1678799350
##
lucene/core/build.gradle:
##
@@ -14,10 +14,43 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
+plug
rmuir commented on code in PR #13572:
URL: https://github.com/apache/lucene/pull/13572#discussion_r1678790088
##
lucene/core/build.gradle:
##
@@ -14,10 +14,43 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
+plug
rmuir commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2230061569
I haven't benchmarked, just seems `SDOT` is the one to optimize for, and GCC
can both recognize the code shape and autovectorize to it without hassle.
my cheap 2021 phone has `asimd
rmuir commented on PR #13572:
URL: https://github.com/apache/lucene/pull/13572#issuecomment-2230045004
Do we even need to use intrinsics? The function is so simple that the compiler
seems to do the right thing, e.g. use the `SDOT` dot product instruction, given
the correct flags:
https://
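To make the code shape under discussion concrete, here is a minimal sketch of a scalar signed 8-bit dot product with no intrinsics, which is the kind of loop a compiler may be able to auto-vectorize. The name `dot8s_scalar` is hypothetical and is not the PR's function. The compiler flags referred to in the comment are cut off in the snippet above and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar dot product of two signed 8-bit vectors.
 * Each product is widened to 32 bits before accumulation, so a single
 * product cannot overflow; the running sum can still overflow int32_t
 * for extremely long vectors. */
int32_t dot8s_scalar(const int8_t *a, const int8_t *b, size_t len) {
    assert(a != NULL && b != NULL);
    int32_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += (int32_t)a[i] * (int32_t)b[i];
    }
    return sum;
}
```

For `a = {1, -2, 3, 4}` and `b = {5, 6, -7, 8}` the products are 5, -12, -21 and 32, so the function returns 4. Whether a given compiler actually emits `SDOT` for this loop depends on its version and target flags, which is the question the thread goes on to examine.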
goankur opened a new pull request, #13572:
URL: https://github.com/apache/lucene/pull/13572
Credit:
https://www.elastic.co/search-labs/blog/vector-similarity-computations-ludicrous-speed
Implement vectorized dot product in native C code using Neon intrinsics
### Descri