viswanathk commented on PR #14105:
URL: https://github.com/apache/lucene/pull/14105#issuecomment-2574269727
My bad. Made the changes.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific
jpountz commented on code in PR #14097:
URL: https://github.com/apache/lucene/pull/14097#discussion_r1904681561
##
lucene/misc/src/java/org/apache/lucene/misc/index/BpVectorReorderer.java:
##
@@ -0,0 +1,788 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or
benwtrent commented on PR #14097:
URL: https://github.com/apache/lucene/pull/14097#issuecomment-2573997682
> I didn't see a big impact on recall beyond what is typical from noise --
even with the same graph settings we see variance in recall due to randomness
in the graph creation when ther
benchaplin commented on PR #13984:
URL: https://github.com/apache/lucene/pull/13984#issuecomment-2573974895
@mikemccand I tried for a bit to recreate the scenario you're describing but
wasn't able to. I added the defensive suggestion anyway - I'll keep trying to
reproduce. Let me know if yo
msokolov commented on PR #14097:
URL: https://github.com/apache/lucene/pull/14097#issuecomment-2573974658
> The numbers here are really nice. I just want to understand why they were
better, especially as recall changes, which seems to indicate that the graph
building itself is being changed
benchaplin commented on code in PR #13984:
URL: https://github.com/apache/lucene/pull/13984#discussion_r1904663757
##
lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:
##
@@ -2746,6 +2785,191 @@ public static Status.VectorValuesStatus testVectors(
return status;
msokolov commented on PR #13984:
URL: https://github.com/apache/lucene/pull/13984#issuecomment-2573971976
Ooh, I was wondering about this `PerFieldKnnVectorsFormat` case - thanks for
testing @mikemccand
benwtrent commented on PR #14085:
URL: https://github.com/apache/lucene/pull/14085#issuecomment-2573950904
Thank you for taking a stab at this @benchaplin ! I wonder if we can adjust
the algorithm to more intelligently switch between the algorithms. something
like:
- Fan out one lay
benwtrent commented on PR #14097:
URL: https://github.com/apache/lucene/pull/14097#issuecomment-2573940308
If we really think `vint` is the cause, I wonder if we should switch
encoding to the `readGroupVInts` stuff?
https://github.com/apache/lucene/issues/12871
My thought around
msokolov commented on PR #14105:
URL: https://github.com/apache/lucene/pull/14105#issuecomment-2573865042
thanks! I merged the CHANGES ... but ... maybe we also need to backport the
CHANGES change :yum:
-- could add it to the backport PR?
msokolov merged PR #14104:
URL: https://github.com/apache/lucene/pull/14104
mikemccand commented on PR #14097:
URL: https://github.com/apache/lucene/pull/14097#issuecomment-2573851904
I like the more efficient delta encoding theory.
Decoding `vInt` is a hotspot for HNSW graph traversal ... so if we can use 2
bytes instead of 3, or 1 byte instead of 2, thanks
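To make the byte-count argument above concrete, here is a minimal sketch of how delta encoding interacts with a 7-bits-per-byte variable-length integer. It assumes neighbor ordinals are stored sorted and delta-encoded, as the comment suggests; the class `VIntSize`, its methods, and the sample ordinals are hypothetical and written for illustration only, not taken from Lucene's code.

```java
// Hypothetical illustration only; not Lucene's implementation.
final class VIntSize {

  // Bytes needed to store `value` with 7 payload bits per byte
  // (the high bit of each byte acts as a continuation flag).
  static int vIntLength(int value) {
    int bytes = 1;
    while ((value & ~0x7F) != 0) {
      value >>>= 7;
      bytes++;
    }
    return bytes;
  }

  // Total bytes for a sorted list of ordinals stored as deltas:
  // the first value as-is, then each difference from its predecessor.
  static int deltaEncodedLength(int[] sortedOrdinals) {
    int total = 0;
    int previous = 0;
    for (int ord : sortedOrdinals) {
      total += vIntLength(ord - previous);
      previous = ord;
    }
    return total;
  }

  public static void main(String[] args) {
    // Made-up neighbor lists: one spread across the ordinal space,
    // one where neighbors have nearby ordinals.
    int[] scattered = {1_000, 250_000, 900_000, 2_500_000};
    int[] clustered = {1_000, 1_010, 1_050, 1_120};
    System.out.println("scattered: " + deltaEncodedLength(scattered) + " bytes"); // 11 bytes
    System.out.println("clustered: " + deltaEncodedLength(clustered) + " bytes"); // 5 bytes
  }
}
```

In this sketch, small deltas fit in one byte each, so a list whose neighbors sit close together in ordinal space takes fewer bytes to read and fewer loop iterations to decode. Whether that accounts for the measured search speedup is the open question in the thread.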
mikemccand commented on code in PR #14103:
URL: https://github.com/apache/lucene/pull/14103#discussion_r1904581374
##
lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java:
##
@@ -428,19 +441,110 @@ public void readBytes(byte[] dst, int offset, int len)
throw
msokolov commented on PR #14097:
URL: https://github.com/apache/lucene/pull/14097#issuecomment-2573831315
I will note that tests with smaller indexes don't show such dramatic
improvements - more support for the theory that graph decoding is what is
helped, because there are no real compress
msokolov commented on PR #14097:
URL: https://github.com/apache/lucene/pull/14097#issuecomment-2573825773
Regarding merging luceneutil tooling - I will open a PR, but suggest we hold
off merging until this change hits Lucene
msokolov commented on PR #14097:
URL: https://github.com/apache/lucene/pull/14097#issuecomment-2573825107
I am not sure, but surmising that search performance is improved because of
some combination of (1) graph ordinal decoding being faster (since we encode
using VInts and these are now sm
benwtrent commented on PR #14097:
URL: https://github.com/apache/lucene/pull/14097#issuecomment-2573818646
These are exciting numbers! It's interesting how improved search latency is
dropping the index build time.
Do we know why the searching times are so much better? Is it simply beca
mikemccand commented on PR #14097:
URL: https://github.com/apache/lucene/pull/14097#issuecomment-2573791823
@msokolov do you have changes to luceneutil's `knnPerfTest.py` to enable
this? Let's merge those upstream (to luceneutil) too ... I'm working on
getting nightly benchy to run `knnPer
mikemccand commented on PR #13984:
URL: https://github.com/apache/lucene/pull/13984#issuecomment-2573785203
Thank you for persisting on this important change @benchaplin!
I applied this PR to my local Lucene clone and ran `CheckIndex` on the
vector index created by [last night's night
viswanathk opened a new pull request, #14104:
URL: https://github.com/apache/lucene/pull/14104
Modifying the CHANGES.txt entry for #14022
viswanathk commented on PR #14022:
URL: https://github.com/apache/lucene/pull/14022#issuecomment-2573712838
Yeah, let me make them real quick.
gsmiller commented on code in PR #14070:
URL: https://github.com/apache/lucene/pull/14070#discussion_r1904502387
##
lucene/core/src/java/org/apache/lucene/search/DisiPriorityQueueN.java:
##
@@ -0,0 +1,230 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or m
gsmiller commented on code in PR #14039:
URL: https://github.com/apache/lucene/pull/14039#discussion_r1904467456
##
lucene/core/src/java/org/apache/lucene/search/Weight.java:
##
@@ -289,75 +262,108 @@ static int scoreRange(
}
}
- int doc = iterator.docID()
gsmiller commented on code in PR #13974:
URL: https://github.com/apache/lucene/pull/13974#discussion_r1904400249
##
lucene/sandbox/src/java/org/apache/lucene/sandbox/search/SortedSetMultiRangeQuery.java:
##
@@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (AS
tveasey commented on PR #14078:
URL: https://github.com/apache/lucene/pull/14078#issuecomment-2573444675
Just sticking purely to the issues raised regarding this PR and the blog Ben
linked explaining the methodology...
> Although the RaBitQ approach is conceptually rather different to
ChrisHegarty commented on PR #14078:
URL: https://github.com/apache/lucene/pull/14078#issuecomment-2573298347
In my capacity as the Lucene PMC Chair (and with explicit acknowledgment of
my current employment with Elastic, as of the date of this writing), I want to
emphasize that proper attr
benwtrent commented on code in PR #14084:
URL: https://github.com/apache/lucene/pull/14084#discussion_r1904255306
##
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java:
##
@@ -67,7 +70,21 @@ public static void search(
HnswGraphSearcher graphSearcher =
benwtrent commented on code in PR #14084:
URL: https://github.com/apache/lucene/pull/14084#discussion_r1904254316
##
lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphSearcher.java:
##
@@ -67,7 +70,21 @@ public static void search(
HnswGraphSearcher graphSearcher =
benwtrent commented on code in PR #14084:
URL: https://github.com/apache/lucene/pull/14084#discussion_r1904253643
##
lucene/core/src/java/org/apache/lucene/search/knn/SeededKnnCollectorManager.java:
##
@@ -0,0 +1,174 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) un
cpoerschke merged PR #14098:
URL: https://github.com/apache/lucene/pull/14098
benwtrent commented on code in PR #13984:
URL: https://github.com/apache/lucene/pull/13984#discussion_r1904112483
##
lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:
##
@@ -2746,6 +2785,191 @@ public static Status.VectorValuesStatus testVectors(
return status;
msokolov commented on PR #14022:
URL: https://github.com/apache/lucene/pull/14022#issuecomment-2573041092
@viswanathk I just merged and then belatedly realized we should also have a
CHANGES.txt entry for this - I guess it belongs under Optimizations heading --
do you want to add? And then w
msokolov merged PR #14022:
URL: https://github.com/apache/lucene/pull/14022
msokolov commented on PR #14022:
URL: https://github.com/apache/lucene/pull/14022#issuecomment-2573037086
sorry for the delay - holidays intervened!
gaoj0017 commented on PR #14078:
URL: https://github.com/apache/lucene/pull/14078#issuecomment-2573030521
Hi @msokolov , the discussion here is not only about the blog posts but also
related to the pull request here. In this pull request (and its related blogs),
it claims a new method witho
msokolov commented on code in PR #13984:
URL: https://github.com/apache/lucene/pull/13984#discussion_r1904101920
##
lucene/core/src/java/org/apache/lucene/index/CheckIndex.java:
##
@@ -2746,6 +2785,191 @@ public static Status.VectorValuesStatus testVectors(
return status;
msokolov commented on PR #14097:
URL: https://github.com/apache/lucene/pull/14097#issuecomment-2572995493
Right, I was kind of hoping @jpountz would review, but perhaps he's out for
vacation. Most of this has already been seen and he approved the earlier PR.
The main new thing here that mig
original-brownbear merged PR #13935:
URL: https://github.com/apache/lucene/pull/13935