ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1558674081
Latest benchmark results.
intel rocketlake (512-bit vectors):
```
Benchmark(size) Mode Cnt Score Error
Units
BinaryCosine
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201701447
##
lucene/core/src/java20/org/apache/lucene/util/VectorUtilPanamaProvider.java:
##
@@ -0,0 +1,157 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) unde
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201706152
##
lucene/core/src/java20/org/apache/lucene/util/VectorUtilPanamaProvider.java:
##
@@ -0,0 +1,455 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) unde
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1558725059
Apologies, I'm don't have much time early this week, but hope to be more
active later in the week. I know that there are still outstanding questions
around testing, luceneutil, etc,
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201852487
##
gradle/testing/defaults-tests.gradle:
##
@@ -119,11 +119,16 @@ allprojects {
if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
jvm
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201860585
##
gradle/testing/defaults-tests.gradle:
##
@@ -119,11 +119,16 @@ allprojects {
if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
jvm
rmuir commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201912353
##
gradle/testing/defaults-tests.gradle:
##
@@ -119,11 +119,16 @@ allprojects {
if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
jvmArgs '-
rmuir commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201914722
##
lucene/core/src/java20/org/apache/lucene/util/VectorUtilPanamaProvider.java:
##
@@ -0,0 +1,455 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one o
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201916237
##
gradle/testing/defaults-tests.gradle:
##
@@ -119,11 +119,16 @@ allprojects {
if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
jvm
alessandrobenedetti commented on PR #12314:
URL: https://github.com/apache/lucene/pull/12314#issuecomment-1558921019
Thanks @benwtrent and no rush, I am not planning to commit to this work
anytime soon.
I'll need various validations from other committers on the various areas of
the code
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201931143
##
gradle/testing/defaults-tests.gradle:
##
@@ -119,11 +119,16 @@ allprojects {
if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
jvmAr
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201931143
##
gradle/testing/defaults-tests.gradle:
##
@@ -119,11 +119,16 @@ allprojects {
if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
jvmAr
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201931143
##
gradle/testing/defaults-tests.gradle:
##
@@ -119,11 +119,16 @@ allprojects {
if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
jvmAr
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201940187
##
lucene/core/src/java20/org/apache/lucene/util/VectorUtilPanamaProvider.java:
##
@@ -0,0 +1,455 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) unde
uschindler commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201940561
##
gradle/testing/defaults-tests.gradle:
##
@@ -119,11 +119,16 @@ allprojects {
if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
jvmAr
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201948213
##
gradle/testing/defaults-tests.gradle:
##
@@ -119,11 +119,16 @@ allprojects {
if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
jvm
ChrisHegarty commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201916237
##
gradle/testing/defaults-tests.gradle:
##
@@ -119,11 +119,16 @@ allprojects {
if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
jvm
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1558965655
I renamed the constants a bit (e.g. SPECIES at top was inconsistent with the
rest)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1558966617
I will now write a test that compares, iff the panama provider is used.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHu
rmuir commented on code in PR #12311:
URL: https://github.com/apache/lucene/pull/12311#discussion_r1201991364
##
gradle/testing/defaults-tests.gradle:
##
@@ -119,11 +119,16 @@ allprojects {
if (rootProject.runtimeJavaVersion < JavaVersion.VERSION_16) {
jvmArgs '-
javanna merged PR #12233:
URL: https://github.com/apache/lucene/pull/12233
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@lucene.apa
javanna commented on PR #12233:
URL: https://github.com/apache/lucene/pull/12233#issuecomment-1558987882
thanks @AndreyBozhko !
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific commen
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559022634
> Apologies, I'm don't have much time early this week, but hope to be more
active later in the week. I know that there are still outstanding questions
around testing, luceneutil, etc, but
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559035945
couple of random TODOs i had to look into, we can think about:
* disable panama provider unless user has at least 128-bit vectors? Nobody
has benched the case where SPECIES_PREFERRED is
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559067381
I added a test comparing both providers (executes if the VectorUtil's
provider is different from Lucene's.
--
This is an automated message from the Apache Git Service.
To respond to
mikemccand commented on issue #12321:
URL: https://github.com/apache/lucene/issues/12321#issuecomment-1559085083
Oooh I love that idea!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specifi
mikemccand commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559159143
I ran this on the `luceneutil` nightly benchmark box (`beast3`):
This is an AMD Ryzen Threadripper 3990X 64-Core Processor, with this long
list of CPU flags AND bugs:
```
mikemccand commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559223951
I also ran the benchmark on a Skylake Linux box I have:
```
processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 94
model name
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559226152
> I ran this on the `luceneutil` nightly benchmark box (`beast3`):
Anybody with a real Luceneutil vector-benchmark of the whole luceneutil
testsuite?
--
This is an automa
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559308415
> Anybody with a real Luceneutil vector-benchmark of the whole luceneutil
testsuite?
I think the benefits won't look very impressive because it only uses 100
dims. This seems to be
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559489000
> disable panama provider unless user has at least 128-bit vectors? Nobody
has benched the case where SPECIES_PREFERRED is 64-bits: that's the case where
vectorization isnt enabled
msokolov commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559562832
> you can see this stuff in benchmarks above where e.g. 128 dims is faster
than 100, etc.
should we zero-pad before computing the dot-products? It wouldn't affect the
result and
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559566170
> Agreed. I don't think we care enough to try this - what's the point.
Pushed
[74a9782](https://github.com/apache/lucene/pull/12311/commits/74a9782240dc3b0eb99f19c324be304117371e46)
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559571108
> > you can see this stuff in benchmarks above where e.g. 128 dims is faster
than 100, etc.
>
> should we zero-pad before computing the dot-products? It wouldn't affect
the res
ChrisHegarty commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559578873
@uschindler
> Cool thanks. I was about to do the same, my idea was a bit different: Add
a normal virtual method "isSupported()" to the interface and implement it
returning t
msokolov commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559582073
I'm trying to run luceneutil benchmarks but there are confounding factors -
we spend a *lot* of time parsing the vector dictionary to create query-vectors.
I think we need to fix this b
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559669255
> @uschindler
>
> > Cool thanks. I was about to do the same, my idea was a bit different:
Add a normal virtual method "isSupported()" to the interface and implement it
returnin
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559765890
> should we zero-pad before computing the dot-products? It wouldn't affect
the result and sounds like it would be faster
I don't think so. the code wouldnt even know what size to pad
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559831627
Sorry for again changing. I was not happy with the bitsize code in the
lookup function. I used a very simple approach: The constructor of the provider
throws UnsupportedOperationExcep
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559834404
I tested this with changing if statement to 1280 :-) Message looks fine.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitH
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559863229
> > should we zero-pad before computing the dot-products? It wouldn't affect
the result and sounds like it would be faster
>
> I don't think so. the code wouldnt even know what
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559867708
All bitsizes in the lengthy flame war on mailinglist was mostly powers of
two (except maybe 1536). But we could maybe split into different chunk sizes?
So to multiply size 768 vectors
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559872383
Sorry ignore my last comment. The power of 2 was wrong.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
UR
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1559929746
> Sorry for again changing. I was not happy with the bitsize code in the
lookup function. I used a very simple approach: The constructor of the provider
throws UnsupportedOperationExceptio
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-156194
Assert is fine. Now that the check is in ctor we know for sure it will be at
least 128.
--
This is an automated message from the Apache Git Service.
To respond to the message, pleas
RS146BIJAY commented on issue #12228:
URL: https://github.com/apache/lucene/issues/12228#issuecomment-1560009080
@rmuir As of now is there is any way User can delete these unreferenced
files on their end using Lucene without closing IndexWriter?
--
This is an automated message from the A
javanna commented on PR #12160:
URL: https://github.com/apache/lucene/pull/12160#issuecomment-1560097486
Question on this change: index searcher parallelizes search over leaf
slices, while the concurrent rewrite for knn vector query will request one
thread per segment. Is this on purpose or
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560143502
my issues with maxvectorsize/preferred was a jshell luser issue. it really
works, so its easy for me to simulate 64/128 bit vectors on my 256-bit machine.
i am still fine with disabl
zhaih commented on PR #12160:
URL: https://github.com/apache/lucene/pull/12160#issuecomment-1560145097
I think the concurrent rewrite should act the same way(one thread per
slice) ideally. Do you want to open a PR on that? Or open an issue and I
can probably work on it recently
O
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560162059
if you want to see what i mean, try running `java -XX:UseAVX=0 -jar
target/vectorbench.jar -p size=1024`:
```
Benchmark(size) Mode Cnt ScoreErr
uschindler commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560166430
You could use the RAMUsageEstimator code to look into VM flags.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and us
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560167503
my concern is that if i spin up a QEMU that doesn't pass thru AVX flags in
CPUID, that the same thing will happen without any java VM flags.
--
This is an automated message from the Apac
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560173828
and yeah, given the perf traps here, lets keep the 128-bit check. I am
afraid of the unknown (e.g. something other than x86/arm) and i think its
better to err on the side of caution and on
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560207056
This isn't a theoretical issue, I really think its a openjdk problem?
To reproduce it, i simply emulate a Nehalem cpu (the one right before AVX):
```
processor: 3
vendor_
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560213214
As a potential workaround I tried Uwe's suggestion to look at
RAMUsageEstimator code and ran it inside this VM, it works and seems to detect
the situation?
```
jshell> final Object v
rmuir commented on PR #12311:
URL: https://github.com/apache/lucene/pull/12311#issuecomment-1560217999
And my concern is not "oh but nehalem or others without AVX are ancient
ancient hardware". It is that if you run qemu with all its defaults and don't
specify `-cpu` flag, you get a VM with
56 matches
Mail list logo