Re: [PR] Backport to 9x: Initialize facet counting data structures lazily #12408 [lucene]

2024-05-01 Thread via GitHub


stefanvodita commented on PR #13300:
URL: https://github.com/apache/lucene/pull/13300#issuecomment-2088157884

   I went through this change once more to make sure it's safe to push. I would 
still like a second opinion in case I'm missing something; otherwise, I'll 
merge this in a few days.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Port over gradle setting generator from Solr [lucene]

2024-05-01 Thread via GitHub


dweiss commented on PR #12131:
URL: https://github.com/apache/lucene/pull/12131#issuecomment-2088223120

   Ah, I see your point. The only crude workaround I see for now is to set 
those props via an environment variable - these should be picked up by all 
included builds. For example, when I run:
   ```
   $ GRADLE_OPTS="-Dorg.gradle.workers.max=0" ./gradlew help
   ```
   this breaks Lucene's build, even though gradle.properties is there. So if 
you can set it in your build IDE or shell, you can override the defaults across 
all composite builds. 
   
   I really think it's kind of dumb that the number of workers can't be 
configured in settings.gradle, for example.





Re: [PR] Backport to 9x: Initialize facet counting data structures lazily #12408 [lucene]

2024-05-01 Thread via GitHub


stefanvodita commented on PR #13300:
URL: https://github.com/apache/lucene/pull/13300#issuecomment-2088360624

   Thank you for the review Mike! I did have to change a couple things:
   1. Added a new constructor to `TaxonomyFacets`, which takes a 
`FacetsCollector`, since I couldn't change the existing one.
   2. Updated `TaxonomyFacetCounts` too, which no longer exists on `main`.





Re: [PR] Improve int4 compressed comparisons performance [lucene]

2024-05-01 Thread via GitHub


benwtrent commented on PR #13321:
URL: https://github.com/apache/lucene/pull/13321#issuecomment-2088378164

   @uschindler I addressed your comments.





Re: [PR] Improve int4 compressed comparisons performance [lucene]

2024-05-01 Thread via GitHub


uschindler commented on PR #13321:
URL: https://github.com/apache/lucene/pull/13321#issuecomment-2088443326

   To me changes look fine.
   
   For discussion: In my opinion the conditional code falling back to defaults 
should possibly be moved to the VectorUtil class. I think the booleans could 
then be removed, leaving only two variants called directly from VectorUtil.
   
   Uwe





Re: [PR] Improve int4 compressed comparisons performance [lucene]

2024-05-01 Thread via GitHub


benwtrent merged PR #13321:
URL: https://github.com/apache/lucene/pull/13321





Re: [PR] Add new VectorScorer interface to vector value iterators [lucene]

2024-05-01 Thread via GitHub


benwtrent commented on PR #13181:
URL: https://github.com/apache/lucene/pull/13181#issuecomment-2088865803

   This is my first attempt at a new interface for exact search.
   
   We are missing performance improvements through quantization and have a 
fairly complex API for exact search, requiring users to iterate the vectors 
themselves and choose a correct vector similarity.
   
   This new abstraction allows a simple iterator-and-scorer API for dense vector 
fields.
   
   However, it's sort of inverted from the typical API (scorers create 
iterators, but here iterators provide scorers).
   
   @msokolov what do you think?
   
   The other option I can easily see is adding another top level thing to the 
leaf reader that provides an iterative vector scorer.
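
   The "iterators provide scorers" shape described above can be sketched 
roughly as follows. This is only an illustration in plain Java: the 
`VectorScorer` and `VectorIterator` types here are hypothetical stand-ins 
written for this sketch, not the interfaces from the PR, and dot product is 
an assumed similarity.

```java
public class IteratorProvidesScorerSketch {

  /** Hypothetical scorer: scores the vector its owning iterator is positioned on. */
  public interface VectorScorer {
    float score();
  }

  /** Hypothetical in-memory iterator over dense vectors, one vector per doc id. */
  public static final class VectorIterator {
    public static final int NO_MORE_DOCS = Integer.MAX_VALUE;
    private final float[][] vectors;
    private int doc = -1;

    public VectorIterator(float[][] vectors) {
      this.vectors = vectors;
    }

    /** Advances to the next doc id, or NO_MORE_DOCS once exhausted. */
    public int nextDoc() {
      if (doc != NO_MORE_DOCS) {
        doc = (doc + 1 < vectors.length) ? doc + 1 : NO_MORE_DOCS;
      }
      return doc;
    }

    /** The inverted part: the iterator hands out a scorer bound to its own position. */
    public VectorScorer scorer(float[] query) {
      return () -> {
        float[] v = vectors[doc];
        float dot = 0f;
        for (int i = 0; i < v.length; i++) {
          dot += v[i] * query[i];
        }
        return dot;
      };
    }
  }

  /** Exact search: iterate every vector and keep the best score. */
  public static float bestScore(float[][] vectors, float[] query) {
    VectorIterator it = new VectorIterator(vectors);
    VectorScorer scorer = it.scorer(query);
    float best = Float.NEGATIVE_INFINITY;
    while (it.nextDoc() != VectorIterator.NO_MORE_DOCS) {
      best = Math.max(best, scorer.score());
    }
    return best;
  }

  public static void main(String[] args) {
    float[][] vectors = {{1f, 0f}, {0f, 1f}, {2f, 2f}};
    System.out.println(bestScore(vectors, new float[] {1f, 1f}));
  }
}
```

   The caller never picks a similarity function or touches raw vectors; it 
only advances the iterator and asks the scorer for the current score.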





[PR] Fix TestHnswBitVectorsFormat.testIndexAndSearchBitVectors flakiness [lucene]

2024-05-01 Thread via GitHub


benwtrent opened a new pull request, #1:
URL: https://github.com/apache/lucene/pull/1

   The flaky test failure is due to score tie-breaking by doc id. To avoid 
requiring document id order to be consistent (which would reduce randomness in 
the test), this commit ensures all bit vectors have a unique score with the 
query vector.
   
   closes: https://github.com/apache/lucene/issues/13326
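
   One way to get a unique score per vector, sketched in plain Java. This is 
not the code from the commit; it assumes the score is a strictly monotonic 
function of Hamming distance to the query, and the class and method names 
are made up for the illustration. Vector i is built by flipping exactly i 
bits of the query, so no two vectors can tie.

```java
import java.util.HashSet;
import java.util.Set;

public class UniqueScoreBitVectors {

  /** Hamming distance between two equal-length packed bit vectors. */
  public static int hammingDistance(byte[] a, byte[] b) {
    int distance = 0;
    for (int i = 0; i < a.length; i++) {
      distance += Integer.bitCount((a[i] ^ b[i]) & 0xFF);
    }
    return distance;
  }

  /** Returns count vectors; vector i differs from the query in exactly i bits. */
  public static byte[][] vectorsWithUniqueDistances(byte[] query, int count) {
    if (count > query.length * 8 + 1) {
      throw new IllegalArgumentException("not enough bits for " + count + " distinct distances");
    }
    byte[][] vectors = new byte[count][];
    for (int i = 0; i < count; i++) {
      byte[] v = query.clone();
      // Flip the first i bits of the copy.
      for (int bit = 0; bit < i; bit++) {
        v[bit / 8] ^= (byte) (1 << (bit % 8));
      }
      vectors[i] = v;
    }
    return vectors;
  }

  public static void main(String[] args) {
    byte[] query = {(byte) 0b1010_1010, 0b0101_0101};
    Set<Integer> seen = new HashSet<>();
    for (byte[] v : vectorsWithUniqueDistances(query, 10)) {
      if (!seen.add(hammingDistance(query, v))) {
        throw new AssertionError("duplicate distance");
      }
    }
    System.out.println("distinct distances: " + seen.size());
  }
}
```

   A test can still shuffle the order in which these vectors are indexed, 
since the expected ranking no longer depends on doc id.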





Re: [I] Duplication computation for TieredMergePolicy's numDeletesToMerge [LUCENE-10041] [lucene]

2024-05-01 Thread via GitHub


dnhatn closed issue #11079: Duplication computation for TieredMergePolicy's 
numDeletesToMerge [LUCENE-10041]
URL: https://github.com/apache/lucene/issues/11079





Re: [I] Duplication computation for TieredMergePolicy's numDeletesToMerge [LUCENE-10041] [lucene]

2024-05-01 Thread via GitHub


dnhatn commented on issue #11079:
URL: https://github.com/apache/lucene/issues/11079#issuecomment-2089073368

   Fixed in #12339





Re: [PR] [WIP] LUCENE-10002: Deprecate FacetsCollector#search helper methods as they internally use IndexSearcher#search(Query, Collector) API [lucene]

2024-05-01 Thread via GitHub


zacharymorn closed pull request #12890: [WIP] LUCENE-10002: Deprecate 
FacetsCollector#search helper methods as they internally use 
IndexSearcher#search(Query, Collector) API
URL: https://github.com/apache/lucene/pull/12890





[PR] GITHUB-12892: Deprecate FacetsCollector#search helper methods as they internally use IndexSearcher#search(Query, Collector) APIs [lucene]

2024-05-01 Thread via GitHub


zacharymorn opened a new pull request, #13334:
URL: https://github.com/apache/lucene/pull/13334

   This is a follow-up PR for https://github.com/apache/lucene/pull/12890, 
where we have agreement that FacetsCollector#search helper methods can be 
deprecated without replacement.
   
   The build succeeds with `./gradlew clean; ./gradlew tidy; ./gradlew check 
-Pvalidation.git.failOnModified=false`.





Re: [PR] Convert more classes to record classes [lucene]

2024-05-01 Thread via GitHub


shubhamvishu commented on code in PR #13328:
URL: https://github.com/apache/lucene/pull/13328#discussion_r1586937485


##
lucene/core/src/java/org/apache/lucene/util/TermAndVector.java:
##
@@ -24,23 +24,7 @@
  *
  * @lucene.experimental
  */
-public class TermAndVector {
-
-  private final BytesRef term;
-  private final float[] vector;
-
-  public TermAndVector(BytesRef term, float[] vector) {
-    this.term = term;
-    this.vector = vector;
-  }
-
-  public BytesRef getTerm() {
-    return this.term;
-  }
-
-  public float[] getVector() {
-    return this.vector;
-  }
+public record TermAndVector(BytesRef term, float[] vector) {

Review Comment:
   Makes sense! I changed it back to a class in the new revision
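
   The review comment being answered is not quoted in this message, so the 
reason for reverting is not stated here. One general property of records 
with array components that is worth keeping in mind (offered as background, 
not as the reviewer's argument): the generated `equals` compares array 
components by reference, not by content. The record below is a hypothetical 
stand-in that uses `String` instead of `BytesRef` so it runs without Lucene.

```java
import java.util.Arrays;

public class RecordArrayEquality {

  /** Hypothetical stand-in for a record that holds an array component. */
  public record TermAndVectorRecord(String term, float[] vector) {}

  public static void main(String[] args) {
    var a = new TermAndVectorRecord("t", new float[] {1f, 2f});
    var b = new TermAndVectorRecord("t", new float[] {1f, 2f});

    // The generated equals() compares the two float[] references, which differ.
    System.out.println(a.equals(b)); // prints "false"

    // Comparing array contents needs an explicit call.
    System.out.println(Arrays.equals(a.vector(), b.vector())); // prints "true"
  }
}
```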






Re: [PR] Convert more classes to record classes [lucene]

2024-05-01 Thread via GitHub


shubhamvishu commented on code in PR #13328:
URL: https://github.com/apache/lucene/pull/13328#discussion_r1586937857


##
lucene/core/src/java/org/apache/lucene/search/CollectionStatistics.java:
##
@@ -116,7 +105,8 @@ public CollectionStatistics(
*
* @return field's name, not {@code null}
*/
-  public final String field() {

Review Comment:
   Done! I also changed some more records I could find to be as lean as 
possible.






Re: [PR] Convert more classes to record classes [lucene]

2024-05-01 Thread via GitHub


shubhamvishu commented on code in PR #13328:
URL: https://github.com/apache/lucene/pull/13328#discussion_r1586937659


##
lucene/core/src/java/org/apache/lucene/util/packed/PackedInts.java:
##
@@ -171,14 +171,7 @@ public final float overheadRatio(int bitsPerValue) {
   }
 
   /** Simple class that holds a format and a number of bits per value. */
-  public static class FormatAndBits {
-    public final Format format;
-    public final int bitsPerValue;
-
-    public FormatAndBits(Format format, int bitsPerValue) {
-      this.format = format;
-      this.bitsPerValue = bitsPerValue;
-    }
+  public record FormatAndBits(Format format, int bitsPerValue) {
 
     @Override
     public String toString() {

Review Comment:
   Removed this now






Re: [PR] Convert more classes to record classes [lucene]

2024-05-01 Thread via GitHub


shubhamvishu commented on PR #13328:
URL: https://github.com/apache/lucene/pull/13328#issuecomment-2089321284

   I addressed your comments in the new revision and all the checks are passing 
now after your fix. Thanks!





Re: [PR] Remove unnecessary `AbstractKnnVectorQuery.exactSearch()` [lucene]

2024-05-01 Thread via GitHub


github-actions[bot] commented on PR #13143:
URL: https://github.com/apache/lucene/pull/13143#issuecomment-2089326240

   This PR has not had activity in the past 2 weeks, labeling it as stale. If 
the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you 
for your contribution!





Re: [I] Multi range traversal for numeric range aggregations [lucene]

2024-05-01 Thread via GitHub


jainankitk commented on issue #13335:
URL: https://github.com/apache/lucene/issues/13335#issuecomment-2089338809

   Looks related to #9814, but there are differences between the two. As 
discussed offline with @bowenlan and @rishabhmaurya, MultiRangeQuery only tells 
you which documents match any of your ranges, not which documents match each 
individual range.
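
   To make the distinction concrete, here is a small plain-Java sketch. It 
does not use any Lucene API; the `Range` record and both methods are 
hypothetical and exist only to contrast "matches any range" with "count per 
range", which is what a range aggregation needs.

```java
import java.util.Arrays;

public class RangeMatchSketch {

  /** A closed numeric range [min, max]; a stand-in, not a Lucene type. */
  public record Range(long min, long max) {
    boolean contains(long value) {
      return value >= min && value <= max;
    }
  }

  /** Union semantics: does the value fall in at least one of the ranges? */
  public static boolean matchesAny(long value, Range[] ranges) {
    for (Range r : ranges) {
      if (r.contains(value)) {
        return true;
      }
    }
    return false;
  }

  /** Per-range semantics: how many values fall in each individual range? */
  public static int[] countPerRange(long[] values, Range[] ranges) {
    int[] counts = new int[ranges.length];
    for (long v : values) {
      for (int i = 0; i < ranges.length; i++) {
        if (ranges[i].contains(v)) {
          counts[i]++;
        }
      }
    }
    return counts;
  }

  public static void main(String[] args) {
    long[] values = {1, 5, 12, 25};
    Range[] ranges = {new Range(0, 10), new Range(5, 20)};

    long unionMatches = Arrays.stream(values).filter(v -> matchesAny(v, ranges)).count();
    System.out.println("match any range: " + unionMatches);
    System.out.println("per range: " + Arrays.toString(countPerRange(values, ranges)));
  }
}
```

   With overlapping ranges, a value such as 5 is counted once in the union 
but once per range in the aggregation, so the union result cannot be split 
back into per-range counts.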





Re: [PR] Make Weight#scorerSupplier abstract, Weight#scorer final [lucene]

2024-05-01 Thread via GitHub


iamsanjay commented on code in PR #13319:
URL: https://github.com/apache/lucene/pull/13319#discussion_r1587080630


##
lucene/core/src/java/org/apache/lucene/document/BinaryRangeFieldRangeQuery.java:
##
@@ -136,7 +136,18 @@ public float matchCost() {
   }
 };
 
-return new ConstantScoreScorer(this, score(), scoreMode, iterator);
+final var scorer = new ConstantScoreScorer(this, score(), scoreMode, iterator);
+return new ScorerSupplier() {

Review Comment:
   +1


