[PR] Make DenseConjunctionBulkScorer align scoring windows with #docIDRunEnd(). [lucene]
jpountz opened a new pull request, #14400: URL: https://github.com/apache/lucene/pull/14400 This improves how `DenseConjunctionBulkScorer` computes scoring windows by aligning the end of the window with the `#docIDRunEnd()` of its clauses, as long as doing so yields a window that is at least half the expected size. This helps reduce the number of clauses to evaluate per window in some cases. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
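The alignment rule described above can be sketched roughly as follows (hypothetical `WindowAlignment` helper with illustrative names, simplified from the actual `DenseConjunctionBulkScorer` logic):

```java
// Sketch of window alignment with clause run ends. Assumption: runEnds holds
// each clause's #docIDRunEnd() for clauses positioned at `min`.
class WindowAlignment {
  static final int WINDOW_SIZE = 4096;

  // Pick a window end in [min, max): start from the default window size, then
  // truncate to a clause's run end, but only when that clause fully matches at
  // least half the window, so windows never become too small.
  static int windowEnd(int min, int max, int[] runEnds) {
    int end = (int) Math.min(max, (long) min + WINDOW_SIZE);
    for (int runEnd : runEnds) {
      if (runEnd - min >= WINDOW_SIZE / 2) {
        end = Math.min(end, runEnd);
      }
    }
    return end;
  }
}
```

A clause whose run ends at 3,000 docs truncates the window there (one less clause to evaluate over the window), while a clause whose run ends after only 1,000 docs is ignored for alignment purposes.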
Re: [PR] [Draft] Support Multi-Vector HNSW Search via Flat Vector Storage [lucene]
alessandrobenedetti commented on PR #14173: URL: https://github.com/apache/lucene/pull/14173#issuecomment-2751006045 > > do you confirm that, according to your knowledge, any relevant and active work toward multi-valued vectors in Lucene is effectively aggregated here? > > @alessandrobenedetti I think so. This is the latest stab at it. > > > Main concern is still related to ordinals to become long as far as I can see :) > > Indeed, I just don't see how Lucene can actually support multi-value vectors without switching to long ordinals for the vectors. Otherwise, we enforce some limitation on the number of vectors per segment, or some limitation on the number of vectors per doc (e.g. every doc can only have 256/65535 vectors). > > Making HNSW indexing & merging ~2x (given other constants, it might not be exactly 2x, maybe a little less) more expensive for heap usage is a pretty steep cost. Especially for something I am not sure how many folks will actually use. I agree, I don't think it makes sense to deteriorate single-valued performance at all (I didn't investigate that, but I trust your judgement on the int->long ordinal impact; let me know if you want me to double-check). Another option I was pondering is adding a new field type dedicated to multi-valued vectors. Sure, there will be tons of classes to "duplicate" and make multi-valued compliant, but I believe we'll be able to re-use most of the code, so a huge number of classes but (hopefully) not a massive amount of new code. Before even exploring this, I want to verify that a native multi-valued approach actually brings advantages over the current parent-join approach (mostly being faster at retrieving the top-K 'parent' documents); if not, it won't make much sense to do this huge amount of work.
Re: [I] Use @snippet javadoc tag for snippets [lucene]
dweiss commented on issue #14257: URL: https://github.com/apache/lucene/issues/14257#issuecomment-2751059970 Nice! So... google java format has this option, at least in the cmd line version: If nothing else works, we could just make a multipass and format javadocs using a different tool than the rest of the code... Or fork gjf and implement proper javadoc formatting, which should be a fun project to work on.
Re: [PR] Add support for two-phase iterators to DenseConjunctionBulkScorer. [lucene]
jpountz merged PR #14359: URL: https://github.com/apache/lucene/pull/14359
Re: [PR] [Draft] Support Multi-Vector HNSW Search via Flat Vector Storage [lucene]
vigyasharma commented on PR #14173: URL: https://github.com/apache/lucene/pull/14173#issuecomment-2751872315 > Another option I was pondering is adding a new field type dedicated to multi-valued vectors. I tried this in my first stab at this issue (https://github.com/apache/lucene/pull/13525). IIRC, one concern with a separate field was that it prevents users from converting their previously single-valued fields to multi-valued vectors later if they need to. And since single-valued is a base case of multi-valued, why would anyone even use the single-valued fields? The idea in this PR was to treat single-valued as an optimization over multi-valued vectors that can be turned on/off by a flag in stored metadata. FWIW, the PR (#13525) has pieces to use the separate field, and shows the extent of duplication across classes (it's not very much). I had only added support for ColBERT-style dependent multi-vectors, but that can be extended with the independent vector pieces in this PR. .. > Before even exploring this, I want to better check the current parent join approach i.e. native multi-valued, needs to bring advantages (mostly being faster in retrieving top-K 'parent' documents), Agreed. The next step for this PR is to benchmark parent-join runs and see if there is an improvement, especially in cases where we need query-time scoring on top of all the vector values.
Re: [PR] Create vectorized versions of ScalarQuantizer.quantize and recalculateCorrectiveOffset [lucene]
thecoop commented on code in PR #14304: URL: https://github.com/apache/lucene/pull/14304#discussion_r2012314179

## lucene/core/src/java/org/apache/lucene/internal/vectorization/DefaultVectorUtilSupport.java:

```diff
@@ -234,4 +234,79 @@ public static long int4BitDotProductImpl(byte[] q, byte[] d) {
     }
     return ret;
   }
+
+  @Override
+  public float minMaxScalarQuantize(
+      float[] vector, byte[] dest, float scale, float alpha, float minQuantile, float maxQuantile) {
+    return new ScalarQuantizer(alpha, scale, minQuantile, maxQuantile).quantize(vector, dest, 0);
+  }
+
+  @Override
+  public float recalculateScalarQuantizationOffset(
+      byte[] vector,
+      float oldAlpha,
+      float oldMinQuantile,
+      float scale,
+      float alpha,
+      float minQuantile,
+      float maxQuantile) {
+    return new ScalarQuantizer(alpha, scale, minQuantile, maxQuantile)
+        .recalculateOffset(vector, 0, oldAlpha, oldMinQuantile);
+  }
+
+  static class ScalarQuantizer {
```

Review Comment: It's referenced by `PanamaVectorUtilSupport` to do the tail
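For context, the core of min-max scalar quantization can be sketched like this (hypothetical `MinMaxQuantizer` with simplified semantics; Lucene's actual `ScalarQuantizer` also tracks an `alpha` parameter and a corrective offset, which are omitted here):

```java
// Sketch: clamp each float to [minQuantile, maxQuantile], then map the
// clamped value linearly onto the signed-byte range [0, 127].
class MinMaxQuantizer {
  static void quantize(float[] vector, byte[] dest, float minQuantile, float maxQuantile) {
    float scale = 127f / (maxQuantile - minQuantile);
    for (int i = 0; i < vector.length; i++) {
      // Values outside the quantile range are saturated rather than wrapped.
      float clamped = Math.max(minQuantile, Math.min(maxQuantile, vector[i]));
      dest[i] = (byte) Math.round((clamped - minQuantile) * scale);
    }
  }
}
```

The vectorized version in the PR applies the same per-element transform with Panama vector lanes, with the scalar class above handling the tail elements that don't fill a full lane.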
Re: [PR] Speed up histogram collection in a similar way as disjunction counts. [lucene]
gsmiller commented on PR #14273: URL: https://github.com/apache/lucene/pull/14273#issuecomment-2751652120 +1 to this optimization. Love the idea!
Re: [PR] Enable collectors to take advantage of pre-aggregated data. [lucene]
jpountz commented on PR #14401: URL: https://github.com/apache/lucene/pull/14401#issuecomment-2751630823 @epotyom You may be interested in this; it allows computing aggregates in sub-linear time with respect to the number of matching docs.
Re: [PR] Make DenseConjunctionBulkScorer align scoring windows with #docIDRunEnd(). [lucene]
gf2121 commented on code in PR #14400: URL: https://github.com/apache/lucene/pull/14400#discussion_r2012455948

## lucene/core/src/java/org/apache/lucene/search/DenseConjunctionBulkScorer.java:

```diff
@@ -171,37 +171,36 @@ private int scoreWindow(
       }
     }

-    if (acceptDocs == null) {
-      int minDocIDRunEnd = max;
-      for (DisiWrapper w : iterators) {
-        if (w.docID() > min) {
-          minDocIDRunEnd = min;
-          break;
-        } else {
-          minDocIDRunEnd = Math.min(minDocIDRunEnd, w.docIDRunEnd());
-        }
-      }
-
-      if (minDocIDRunEnd - min >= WINDOW_SIZE / 2) {
-        // We have a large range of doc IDs that all match.
-        rangeDocIdStream.from = min;
-        rangeDocIdStream.to = minDocIDRunEnd;
-        collector.collect(rangeDocIdStream);
-        return minDocIDRunEnd;
-      }
-    }
-
-    int bitsetWindowMax = (int) Math.min(max, (long) min + WINDOW_SIZE);
-
+    // Partition clauses of the conjunction into:
+    // - clauses that don't fully match the first half of the window and get evaluated via
+    //   #loadIntoBitSet or leaf-frog,
+    // - other clauses that are used to compute the greatest possible window size that they fully
+    //   match.
+    // This logic helps align scoring windows with the natural #docIDRunEnd() boundaries of the
+    // data, which helps evaluate fewer clauses per window - without allowing windows to become too
+    // small thanks to the WINDOW_SIZE/2 threshold.
+    int minDocIDRunEnd = max;
     for (DisiWrapper w : iterators) {
-      if (w.docID() > min || w.docIDRunEnd() < bitsetWindowMax) {
+      int docIdRunEnd = w.docIDRunEnd();
+      if (w.docID() > min || (docIdRunEnd - min) < WINDOW_SIZE / 2) {
         windowApproximations.add(w.approximation());
         if (w.twoPhase() != null) {
           windowTwoPhases.add(w.twoPhase());
         }
+      } else {
+        minDocIDRunEnd = Math.min(minDocIDRunEnd, docIdRunEnd);
       }
     }
+
+    if (acceptDocs == null && windowApproximations.isEmpty()) {
```

Review Comment: Out of curiosity and not related to this PR: would it be worth handling `acceptDocs != null` here as well, so that we won't need to call `intoBitset`?
## lucene/core/src/java/org/apache/lucene/search/DenseConjunctionBulkScorer.java:

```diff
@@ -171,37 +171,36 @@ private int scoreWindow(
       }
     }

-    if (acceptDocs == null) {
-      int minDocIDRunEnd = max;
-      for (DisiWrapper w : iterators) {
-        if (w.docID() > min) {
-          minDocIDRunEnd = min;
-          break;
-        } else {
-          minDocIDRunEnd = Math.min(minDocIDRunEnd, w.docIDRunEnd());
-        }
-      }
-
-      if (minDocIDRunEnd - min >= WINDOW_SIZE / 2) {
-        // We have a large range of doc IDs that all match.
-        rangeDocIdStream.from = min;
-        rangeDocIdStream.to = minDocIDRunEnd;
-        collector.collect(rangeDocIdStream);
-        return minDocIDRunEnd;
-      }
-    }
-
-    int bitsetWindowMax = (int) Math.min(max, (long) min + WINDOW_SIZE);
-
+    // Partition clauses of the conjunction into:
+    // - clauses that don't fully match the first half of the window and get evaluated via
+    //   #loadIntoBitSet or leaf-frog,
+    // - other clauses that are used to compute the greatest possible window size that they fully
+    //   match.
+    // This logic helps align scoring windows with the natural #docIDRunEnd() boundaries of the
+    // data, which helps evaluate fewer clauses per window - without allowing windows to become too
+    // small thanks to the WINDOW_SIZE/2 threshold.
+    int minDocIDRunEnd = max;
     for (DisiWrapper w : iterators) {
-      if (w.docID() > min || w.docIDRunEnd() < bitsetWindowMax) {
+      int docIdRunEnd = w.docIDRunEnd();
+      if (w.docID() > min || (docIdRunEnd - min) < WINDOW_SIZE / 2) {
```

Review Comment: > I believe that it only makes a difference when max-min < WINDOW_SIZE, where more clauses would now get evaluated

Was this the line making the difference, and could it be addressed by something like the following code? I'm OK either way :)
```java
int minRunEnd = max;
final int minRunEndThreshold = Math.min(min + WINDOW_SIZE / 2, max);
for (DisiWrapper w : iterators) {
  int docIdRunEnd = w.docIDRunEnd();
  if (w.docID() > min || docIdRunEnd < minRunEndThreshold) {
```
Re: [PR] Disable sort optimization when tracking all docs [lucene]
bugmakerr commented on PR #14395: URL: https://github.com/apache/lucene/pull/14395#issuecomment-2750983066 > The change looks correct to me. With recent changes to allow clauses that match all docs to remove themselves from a conjunction, it should be possible to achieve something similar by implementing `#docIDRunEnd()` on competitive iterators. I need to think a bit more about the pros and cons of these two approaches. @jpountz If I understand correctly, I think both optimizations can be implemented at the same time; there is no conflict between the two. If we know we can't skip any docs before collection, then there's no need to maintain competitiveIterator-related data, which helps implementations that don't benefit from `docIDRunEnd`. Meanwhile, `docIDRunEnd` can implement the skip logic at runtime. Of course, the current implementation only informs the comparator once, but if we could inform each segment separately, we could also disable/enable sort on the fly based on the current total hits and the max doc of the current segment.
Re: [PR] Make DenseConjunctionBulkScorer align scoring windows with #docIDRunEnd(). [lucene]
jpountz commented on code in PR #14400: URL: https://github.com/apache/lucene/pull/14400#discussion_r2012300061

## lucene/core/src/java/org/apache/lucene/search/DenseConjunctionBulkScorer.java:

```diff
@@ -171,27 +171,30 @@ private int scoreWindow(
       }
     }

-    if (acceptDocs == null) {
-      int minDocIDRunEnd = max;
-      for (DisiWrapper w : iterators) {
-        if (w.docID() > min) {
-          minDocIDRunEnd = min;
-          break;
-        } else {
-          minDocIDRunEnd = Math.min(minDocIDRunEnd, w.docIDRunEnd());
+    int minDocIDRunEnd = max;
+    int bitsetWindowMax = (int) Math.min(max, (long) min + WINDOW_SIZE);
+    for (DisiWrapper w : iterators) {
+      if (w.docID() > min) {
+        minDocIDRunEnd = min;
+      } else {
+        int docIDRunEnd = w.docIDRunEnd();
+        minDocIDRunEnd = Math.min(minDocIDRunEnd, docIDRunEnd);
+        // If we can find one clause that matches over more than half the window then we truncate
+        // the window to the run end of this clause as the benefits of evaluating one less clause
+        // likely dominate the overhead of using a smaller window.
+        if (docIDRunEnd - min >= WINDOW_SIZE / 2) {
+          bitsetWindowMax = Math.min(bitsetWindowMax, docIDRunEnd);
         }
       }
-
-      if (minDocIDRunEnd - min >= WINDOW_SIZE / 2) {
-        // We have a large range of doc IDs that all match.
-        rangeDocIdStream.from = min;
-        rangeDocIdStream.to = minDocIDRunEnd;
-        collector.collect(rangeDocIdStream);
-        return minDocIDRunEnd;
-      }
     }

-    int bitsetWindowMax = (int) Math.min(max, (long) min + WINDOW_SIZE);
+    if (acceptDocs == null && minDocIDRunEnd >= bitsetWindowMax) {
```

Review Comment: Yes, if all clauses fully match more than the next WINDOW_SIZE docs.
Re: [PR] Make DenseConjunctionBulkScorer align scoring windows with #docIDRunEnd(). [lucene]
jpountz commented on PR #14400: URL: https://github.com/apache/lucene/pull/14400#issuecomment-2751702590 Thank you. I believe that it only makes a difference when `max-min < WINDOW_SIZE`, where more clauses would now get evaluated, but simplicity is more important so I applied your suggestion.
Re: [I] Use @snippet javadoc tag for snippets [lucene]
rmuir commented on issue #14257: URL: https://github.com/apache/lucene/issues/14257#issuecomment-2751070467 I think in the markdown case, the bug I saw was that it didn't treat `///` as javadoc but as an ordinary inline comment. But I can experiment with the option still.
Re: [I] Use @snippet javadoc tag for snippets [lucene]
dweiss commented on issue #14257: URL: https://github.com/apache/lucene/issues/14257#issuecomment-2751317728 https://github.com/google/google-java-format/issues/1193 > Disabling Javadoc formatting doesn't prevent either issue. So it seems it's broken entirely. Argh.
[PR] Enable collectors to take advantage of pre-aggregated data. [lucene]
jpountz opened a new pull request, #14401: URL: https://github.com/apache/lucene/pull/14401 This introduces `LeafCollector#collectRange`, which is typically useful to take advantage of the pre-aggregated data exposed in `DocValuesSkipper`. At the moment, `DocValuesSkipper` only exposes per-block min and max values, but we could easily extend it to record sums and value counts as well. This `collectRange` method would be called if there are no deletions in the segment by: - queries that rewrite to a `MatchAllDocsQuery` (with min=0 and max=maxDoc), - `PointRangeQuery` on segments that fully match the range (typical for time-based data), - doc-value range queries and conjunctions of doc-value range queries on fields that enable sparse indexing and correlate with the index sort.
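As a rough illustration of why a range-collection hook enables sub-linear aggregation: a hit-counting collector can handle a fully-matching range in O(1) instead of visiting each doc. The sketch below uses an illustrative `CountingCollector`; it is not the actual `LeafCollector#collectRange` signature from the PR:

```java
// Hypothetical collector: ranges known to fully match are counted in O(1),
// while individual hits still go through the per-doc path.
class CountingCollector {
  private long count;

  // Range is half-open: [minDocId, maxDocId). Counting a fully-matching
  // range costs a single subtraction regardless of how many docs it spans.
  void collectRange(int minDocId, int maxDocId) {
    count += (long) maxDocId - minDocId;
  }

  void collect(int docId) {
    count++;
  }

  long getCount() {
    return count;
  }
}
```

The same shape extends to sums or value counts once the skipper records per-block aggregates: the collector adds the block's pre-aggregated value instead of iterating its docs.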
Re: [PR] Pack file pointers when merging BKD trees [lucene]
iverase merged PR #14393: URL: https://github.com/apache/lucene/pull/14393
Re: [I] Reduce memory usage when merging bkd trees [lucene]
iverase closed issue #14382: Reduce memory usage when merging bkd trees URL: https://github.com/apache/lucene/issues/14382
Re: [I] Reduce memory usage when merging bkd trees [lucene]
iverase commented on issue #14382: URL: https://github.com/apache/lucene/issues/14382#issuecomment-2750449881 We are using more dense data structures now, in particular for the OneDimensionBKDWriter.
Re: [I] Support modifying segmentInfos.counter in IndexWriter [lucene]
vigyasharma commented on issue #14362: URL: https://github.com/apache/lucene/issues/14362#issuecomment-2752804917 Thanks @guojialiang92 . Is the plan here to support creating an IndexWriter with a supplied value of `counter`, say `N`, so that all its commit generations are `>=N`, i.e. `segments_N, segments_N+1, ...` and so on? To confirm my understanding: when a primary dies, one cannot really guarantee that all replicas were fully caught up. If the winning replica (new primary) was lagging behind, and you simply continue to use its `SegmentInfos#counter`, it might end up overwriting some segment files in other replicas. So you use raft to select the right counter value and start segments from that value. This is specifically a problem for segment replication. Regular document replication works fine because each document is reindexed anyway and segment files are not copied over. Is that more or less correct? Anyway, I don't see any problems with this support and it does have a valid use-case. If you want to raise a PR, I can help review it.
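For reference, Lucene renders index file generations in base 36, so seeding the counter at `N` keeps new file names strictly ahead of anything a lagging replica already holds. This sketch uses an illustrative `CommitNames` helper; the real naming logic lives in `SegmentInfos` and `IndexFileNames`:

```java
// Sketch of generation-based commit file naming: the generation is rendered
// in base 36 (Character.MAX_RADIX), matching Lucene's segments_N convention.
class CommitNames {
  static String segmentsFileName(long generation) {
    return "segments_" + Long.toString(generation, Character.MAX_RADIX);
  }
}
```

So generation 10 yields `segments_a` rather than `segments_10`, which is why seeding the counter numerically keeps later file names from colliding with earlier ones.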
Re: [PR] Speed up histogram collection in a similar way as disjunction counts. [lucene]
jpountz commented on PR #14273: URL: https://github.com/apache/lucene/pull/14273#issuecomment-2752610585 Quick update: we now have more queries that collect hits using `collect(DocIdStream)`, which makes this optimization more appealing.
Re: [I] Can we use Panama Vector API for quantizing vectors? [lucene]
benwtrent closed issue #13922: Can we use Panama Vector API for quantizing vectors? URL: https://github.com/apache/lucene/issues/13922
Re: [PR] Use Arrays.compareUnsigned in IDVersionSegmentTermsEnum and OrdsSegmentTermsEnum. [lucene]
github-actions[bot] commented on PR #13782: URL: https://github.com/apache/lucene/pull/13782#issuecomment-2752825545 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contribution!
Re: [PR] Break the loop when segment is fully deleted by prior delTerms or delQueries [lucene]
github-actions[bot] commented on PR #13398: URL: https://github.com/apache/lucene/pull/13398#issuecomment-2752825874 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contribution!
Re: [PR] Make DenseConjunctionBulkScorer align scoring windows with #docIDRunEnd(). [lucene]
jpountz commented on code in PR #14400: URL: https://github.com/apache/lucene/pull/14400#discussion_r2012975372

## lucene/core/src/java/org/apache/lucene/search/DenseConjunctionBulkScorer.java:

```diff
@@ -171,37 +171,36 @@ private int scoreWindow(
       }
     }

-    if (acceptDocs == null) {
-      int minDocIDRunEnd = max;
-      for (DisiWrapper w : iterators) {
-        if (w.docID() > min) {
-          minDocIDRunEnd = min;
-          break;
-        } else {
-          minDocIDRunEnd = Math.min(minDocIDRunEnd, w.docIDRunEnd());
-        }
-      }
-
-      if (minDocIDRunEnd - min >= WINDOW_SIZE / 2) {
-        // We have a large range of doc IDs that all match.
-        rangeDocIdStream.from = min;
-        rangeDocIdStream.to = minDocIDRunEnd;
-        collector.collect(rangeDocIdStream);
-        return minDocIDRunEnd;
-      }
-    }
-
-    int bitsetWindowMax = (int) Math.min(max, (long) min + WINDOW_SIZE);
-
+    // Partition clauses of the conjunction into:
+    // - clauses that don't fully match the first half of the window and get evaluated via
+    //   #loadIntoBitSet or leaf-frog,
+    // - other clauses that are used to compute the greatest possible window size that they fully
+    //   match.
+    // This logic helps align scoring windows with the natural #docIDRunEnd() boundaries of the
+    // data, which helps evaluate fewer clauses per window - without allowing windows to become too
+    // small thanks to the WINDOW_SIZE/2 threshold.
+    int minDocIDRunEnd = max;
     for (DisiWrapper w : iterators) {
-      if (w.docID() > min || w.docIDRunEnd() < bitsetWindowMax) {
+      int docIdRunEnd = w.docIDRunEnd();
+      if (w.docID() > min || (docIdRunEnd - min) < WINDOW_SIZE / 2) {
         windowApproximations.add(w.approximation());
         if (w.twoPhase() != null) {
           windowTwoPhases.add(w.twoPhase());
         }
+      } else {
+        minDocIDRunEnd = Math.min(minDocIDRunEnd, docIdRunEnd);
       }
     }
+
+    if (acceptDocs == null && windowApproximations.isEmpty()) {
```

Review Comment: If accept docs are not null, we shouldn't call `intoBitSet` on any clause. However, we'll stick to a window of size 4,096 and convert the accept docs `Bits` into a bit set using `Bits#applyMask`.
We may be able to do better if deletions are extremely sparse, but I couldn't think of an obvious way of handling it and I'm not sure how much this case is worth optimizing.
Re: [PR] Add a Faiss codec for KNN searches [lucene]
github-actions[bot] commented on PR #14178: URL: https://github.com/apache/lucene/pull/14178#issuecomment-2752825072 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contribution!
Re: [PR] knn search - add tests to perform exact search when filtering does not return enough results [lucene]
benwtrent merged PR #14274: URL: https://github.com/apache/lucene/pull/14274
[I] New testMinMaxScalarQuantize tests failing repeatably [lucene]
benwtrent opened a new issue, #14402: URL: https://github.com/apache/lucene/issues/14402

### Description

```
TestVectorUtilSupport > testMinMaxScalarQuantize {p0=4096} FAILED
    java.lang.AssertionError:
    Expected: a numeric value within <0.004096> of <762.170654296875>
         but: <762.1751708984375> differed by <4.20601562502E-4> more than delta <0.004096>
        at __randomizedtesting.SeedInfo.seed([C8353109B2DC21F4:77130CBE78E61421]:0)
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
        at org.junit.Assert.assertThat(Assert.java:964)
        at org.junit.Assert.assertThat(Assert.java:930)
        at org.apache.lucene.internal.vectorization.TestVectorUtilSupport.assertFloatReturningProviders(TestVectorUtilSupport.java:213)
        at org.apache.lucene.internal.vectorization.TestVectorUtilSupport.testMinMaxScalarQuantize(TestVectorUtilSupport.java:206)
        at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
        at java.base/java.lang.reflect.Method.invoke(Method.java:580)
```

Seems like a simple epsilon correction.

### Gradle command to reproduce

```
./gradlew :lucene:core:test --tests "org.apache.lucene.internal.vectorization.TestVectorUtilSupport.testMinMaxScalarQuantize {p0=4096}" -Ptests.jvms=5 "-Ptests.jvmargs=-XX:TieredStopAtLevel=1 -XX:+UseParallelGC -XX:ActiveProcessorCount=1" -Ptests.seed=C8353109B2DC21F4 -Ptests.useSecurityManager=true -Ptests.gui=false -Ptests.file.encoding=UTF-8 -Ptests.vectorsize=512 -Ptests.forceintegervectors=true
```
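The "epsilon correction" here has to absorb the fact that floating-point addition is not associative: a vectorized implementation accumulates across SIMD lanes and combines partial sums at the end, so it can legitimately differ from the scalar loop by an amount that grows with the vector length (4096 in the failing parameterization). A minimal, self-contained Java sketch of that effect (hypothetical class name, unrelated to Lucene's actual test code):

```java
// Demonstrates that reordering float additions changes the result:
// a lane-wise (SIMD-style) accumulation differs from a plain scalar loop,
// so tests comparing the two need a delta that scales with vector length.
public class LaneSumDemo {
  // Straightforward sequential accumulation.
  static float scalarSum(float[] v) {
    float s = 0f;
    for (float x : v) s += x;
    return s;
  }

  // Mimics an 8-lane SIMD reduction: 8 partial accumulators, combined at the end.
  static float laneSum(float[] v) {
    float[] lanes = new float[8];
    for (int i = 0; i < v.length; i++) lanes[i & 7] += v[i];
    float s = 0f;
    for (float l : lanes) s += l;
    return s;
  }

  public static void main(String[] args) {
    java.util.Random r = new java.util.Random(1234);
    float[] v = new float[4096];
    for (int i = 0; i < v.length; i++) v[i] = r.nextFloat();
    // The two sums are mathematically equal but numerically different.
    System.out.println(Math.abs(scalarSum(v) - laneSum(v)));
  }
}
```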
Re: [I] New testMinMaxScalarQuantize tests failing repeatably [lucene]
benwtrent commented on issue #14402: URL: https://github.com/apache/lucene/issues/14402#issuecomment-2752192712 @thecoop ping ;)
Re: [PR] Make PointValues.intersect iterative instead of recursive [lucene]
jpountz commented on PR #14391: URL: https://github.com/apache/lucene/pull/14391#issuecomment-2752592897 Nightly benchmarks report a tiny slowdown for IntNRQ and CountFilteredIntNRQ (https://benchmarks.mikemccandless.com/2025.03.24.18.05.19.html); nevertheless, I agree with your point that it's better to make this logic iterative rather than recursive.
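For readers wondering what "iterative instead of recursive" buys: a recursive tree traversal consumes one Java call-stack frame per tree level, while the iterative form keeps its own worklist on the heap. A generic, self-contained sketch over a toy binary tree (not Lucene's actual BKD `PointValues.intersect` code):

```java
import java.util.ArrayDeque;

public class IterativeTraversal {
  record Node(int value, Node left, Node right) {}

  // Recursive version: call-stack depth equals tree depth,
  // which risks StackOverflowError on very deep trees.
  static int sumRecursive(Node n) {
    if (n == null) return 0;
    return n.value() + sumRecursive(n.left()) + sumRecursive(n.right());
  }

  // Iterative version with an explicit stack: same set of nodes visited,
  // but the worklist lives on the heap instead of the call stack.
  static int sumIterative(Node root) {
    int sum = 0;
    ArrayDeque<Node> stack = new ArrayDeque<>();
    if (root != null) stack.push(root);
    while (!stack.isEmpty()) {
      Node n = stack.pop();
      sum += n.value();
      if (n.left() != null) stack.push(n.left());
      if (n.right() != null) stack.push(n.right());
    }
    return sum;
  }
}
```

The visit order differs slightly between the two, but both touch every node exactly once; the iterative form trades recursion depth for a heap-allocated deque, which is the same transformation the PR applies to the BKD traversal.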
Re: [I] Use @snippet javadoc tag for snippets [lucene]
rmuir commented on issue #14257: URL: https://github.com/apache/lucene/issues/14257#issuecomment-2751009070 @dweiss thank you so much for that starter commit for evaluation. I will try it tonight, fire up Eclipse, and see what our options are. I finally finished the parser (https://github.com/rmuir/tree-sitter-javadoc) and am now looking at options to (ab)use it to convert our docs to markdown automatically, when I discovered that google-java-format will mess up markdown comments: e.g. a too-long `///` line will wrap onto another line starting with `//` and break everything. So currently there is no way to escape from the formatter, because neither `@snippet` nor markdown can work with it, and thus no way to make a patch that passes the build.
Re: [PR] Make DenseConjunctionBulkScorer align scoring windows with #docIDRunEnd(). [lucene]
gf2121 commented on code in PR #14400: URL: https://github.com/apache/lucene/pull/14400#discussion_r2012151639

## lucene/core/src/java/org/apache/lucene/search/DenseConjunctionBulkScorer.java:

@@ -171,27 +171,30 @@ private int scoreWindow(
      }
    }

-    if (acceptDocs == null) {
-      int minDocIDRunEnd = max;
-      for (DisiWrapper w : iterators) {
-        if (w.docID() > min) {
-          minDocIDRunEnd = min;
-          break;
-        } else {
-          minDocIDRunEnd = Math.min(minDocIDRunEnd, w.docIDRunEnd());
+    int minDocIDRunEnd = max;
+    int bitsetWindowMax = (int) Math.min(max, (long) min + WINDOW_SIZE);
+    for (DisiWrapper w : iterators) {
+      if (w.docID() > min) {
+        minDocIDRunEnd = min;
+      } else {
+        int docIDRunEnd = w.docIDRunEnd();
+        minDocIDRunEnd = Math.min(minDocIDRunEnd, docIDRunEnd);
+        // If we can find one clause that matches over more than half the window then we truncate
+        // the window to the run end of this clause as the benefits of evaluating one less clause
+        // likely dominate the overhead of using a smaller window.
+        if (docIDRunEnd - min >= WINDOW_SIZE / 2) {
+          bitsetWindowMax = Math.min(bitsetWindowMax, docIDRunEnd);
        }
      }
-
-      if (minDocIDRunEnd - min >= WINDOW_SIZE / 2) {
-        // We have a large range of doc IDs that all match.
-        rangeDocIdStream.from = min;
-        rangeDocIdStream.to = minDocIDRunEnd;
-        collector.collect(rangeDocIdStream);
-        return minDocIDRunEnd;
-      }
    }

-    int bitsetWindowMax = (int) Math.min(max, (long) min + WINDOW_SIZE);
+    if (acceptDocs == null && minDocIDRunEnd >= bitsetWindowMax) {

Review Comment: Could `minDocIDRunEnd` ever be bigger than `bitsetWindowMax`?
Re: [PR] Create vectorized versions of ScalarQuantizer.quantize and recalculateCorrectiveOffset [lucene]
benwtrent commented on code in PR #14304: URL: https://github.com/apache/lucene/pull/14304#discussion_r201706

## lucene/core/src/java/org/apache/lucene/util/VectorUtil.java:

@@ -334,4 +334,45 @@ public static int findNextGEQ(int[] buffer, int target, int from, int to) {
    assert IntStream.range(0, to - 1).noneMatch(i -> buffer[i] > buffer[i + 1]);
    return IMPL.findNextGEQ(buffer, target, from, to);
  }
+
+  /**
+   * Quantizes {@code vector}, putting the result into {@code dest}.
+   *
+   * @param vector the vector to quantize
+   * @param dest the destination vector, can be null
+   * @param scale the scaling factor
+   * @param alpha the alpha value
+   * @param minQuantile the lower quantile of the distribution
+   * @param maxQuantile the upper quantile of the distribution
+   * @return the corrective offset that needs to be applied to the score
+   */
+  public static float quantize(

Review Comment: lets unify the name here with the implementation

## lucene/core/src/java/org/apache/lucene/util/VectorUtil.java (same hunk, continued):

+  public static float quantize(
+      float[] vector, byte[] dest, float scale, float alpha, float minQuantile, float maxQuantile) {
+    assert vector.length == dest.length;

Review Comment: Let's throw an actual error here, illegal argument?
## lucene/core/src/java/org/apache/lucene/internal/vectorization/DefaultVectorUtilSupport.java:

@@ -234,4 +234,79 @@ public static long int4BitDotProductImpl(byte[] q, byte[] d) {
    }
    return ret;
  }
+
+  @Override
+  public float minMaxScalarQuantize(
+      float[] vector, byte[] dest, float scale, float alpha, float minQuantile, float maxQuantile) {
+    return new ScalarQuantizer(alpha, scale, minQuantile, maxQuantile).quantize(vector, dest, 0);
+  }
+
+  @Override
+  public float recalculateScalarQuantizationOffset(
+      byte[] vector,
+      float oldAlpha,
+      float oldMinQuantile,
+      float scale,
+      float alpha,
+      float minQuantile,
+      float maxQuantile) {
+    return new ScalarQuantizer(alpha, scale, minQuantile, maxQuantile)
+        .recalculateOffset(vector, 0, oldAlpha, oldMinQuantile);
+  }
+
+  static class ScalarQuantizer {

Review Comment: private?

## lucene/core/src/java/org/apache/lucene/util/VectorUtil.java:

+  /**
+   * Quantizes {@code vector}, putting the result into {@code dest}.

Review Comment: Scalar quantizes.

## lucene/core/src/java21/org/apache/lucene/internal/vectorization/PanamaVectorUtilSupport.java:

@@ -907,4 +907,97 @@ public static long int4BitDotProduct128(byte[] q, byte[] d) {
    }
    return subRet0 + (subRet1 << 1) + (subRet2 << 2) + (subRet3 << 3);
  }
+
+  @Override
+  public float minMaxScalarQuantize(
+      float[] vector, byte[] dest, float scale, float alpha, float minQuantile, float maxQuantile) {
+    float correction = 0;

Review Comment: lets add an assert here on vector.length & dest.length. Earlier up stream, we should throw an actual production error.
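For context on what these methods compute: min-max scalar quantization clamps each float component to `[minQuantile, maxQuantile]` and maps that range linearly onto a small integer range. Below is a simplified, self-contained Java sketch with hypothetical names, following the review suggestion to throw `IllegalArgumentException` on a length mismatch rather than relying on an `assert`; it deliberately omits Lucene's alpha-based corrective-offset computation:

```java
// Hypothetical simplified min-max scalar quantizer; not Lucene's actual
// ScalarQuantizer (which also returns a corrective offset for scoring).
public class MinMaxQuantizeSketch {
  // Quantizes vector into dest: each component is clamped to
  // [minQuantile, maxQuantile] and mapped linearly onto [0, 127].
  static void quantize(float[] vector, byte[] dest, float minQuantile, float maxQuantile) {
    // Production error rather than an assert, per the review comment above.
    if (vector.length != dest.length) {
      throw new IllegalArgumentException(
          "vector and dest must have the same length: " + vector.length + " != " + dest.length);
    }
    float scale = 127f / (maxQuantile - minQuantile);
    for (int i = 0; i < vector.length; i++) {
      float clamped = Math.max(minQuantile, Math.min(maxQuantile, vector[i]));
      dest[i] = (byte) Math.round((clamped - minQuantile) * scale);
    }
  }
}
```

An assert only fires when assertions are enabled (`-ea`), which is typically the case in tests but not in production; an `IllegalArgumentException` catches the mismatch unconditionally, which is the point of the review comment.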
Re: [I] Support modifying segmentInfos.counter in IndexWriter [lucene]
guojialiang92 commented on issue #14362: URL: https://github.com/apache/lucene/issues/14362#issuecomment-2753129754 Thanks @vigyasharma. Your understanding is correct (**This is specifically a problem for segment replication**). From an implementation point of view, similar to the current `IndexWriter#advanceSegmentInfosVersion`, I want to provide an `IndexWriter#advanceSegmentInfosCounter`. It may be more flexible to use; do you think it could work this way? I am willing to submit a PR for your review. Looking forward to your reply.
Re: [PR] Speed up advancing within a sparse block in IndexedDISI. [lucene]
vsop-479 commented on PR #14371: URL: https://github.com/apache/lucene/pull/14371#issuecomment-2753155562

> a bench in jmh will be great.

I measured it with `AdvanceSparseDISIBenchmark`:

```
Benchmark                                             Mode  Cnt    Score   Error  Units
AdvanceSparseDISIBenchmark.advance                   thrpt   15  669.502 ± 4.531  ops/ms
AdvanceSparseDISIBenchmark.advanceBinarySearch       thrpt   15  358.620 ± 1.102  ops/ms
AdvanceSparseDISIBenchmark.advanceExact              thrpt   15  752.444 ± 1.810  ops/ms
AdvanceSparseDISIBenchmark.advanceExactBinarySearch  thrpt   15  547.818 ± 2.278  ops/ms
```

Even when I set the target docs' interval to 10, there is still a big performance degradation. Maybe I used too many `disi.slice.seek` calls in this binary search version.

> you may find we are using VectorMask to speed up this, that was what i had in mind - get a MemorySegment slice if it is not null, and play it with VectorMask.

I will try `VectorMask` when I get a chance.
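For reference, the two strategies being benchmarked boil down to how `advance(target)` locates the first stored doc ID that is `>= target` inside a sorted sparse block. A simplified, self-contained Java sketch (an `int[]` stands in for the on-disk block; the real code reads from an `IndexInput` slice, which is where the `disi.slice.seek` cost comes from):

```java
import java.util.Arrays;

// Sketch of advancing within a sparse block: doc IDs are stored sorted, and
// advance(target) must find the index of the first stored doc >= target.
public class SparseAdvance {
  // Linear scan: touches every entry sequentially from the current position.
  static int advanceLinear(int[] docs, int from, int target) {
    int i = from;
    while (i < docs.length && docs[i] < target) i++;
    return i; // index of first doc >= target, or docs.length if none
  }

  // Binary search: O(log n) comparisons, but each probe is a random access.
  static int advanceBinary(int[] docs, int from, int target) {
    int idx = Arrays.binarySearch(docs, from, docs.length, target);
    // binarySearch returns -(insertionPoint) - 1 when target is absent.
    return idx >= 0 ? idx : -idx - 1;
  }
}
```

Binary search wins asymptotically, but each probe is a random access (a `seek` when reading from a slice), while the linear scan reads sequentially; on small blocks or short advances the scan can be cheaper, which is consistent with the throughput numbers above.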
Re: [PR] Optimize slice calculation in IndexSearcher a little [lucene]
github-actions[bot] commented on PR #13860: URL: https://github.com/apache/lucene/pull/13860#issuecomment-2752825465 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contribution!
Re: [PR] Use FixedLengthBytesRefArray in OneDimensionBKDWriter to hold split values [lucene]
iverase merged PR #14383: URL: https://github.com/apache/lucene/pull/14383