[GitHub] [lucene] iverase commented on pull request #12460: Allow reading binary doc values as a DataInput

2023-09-26 Thread via GitHub
iverase commented on PR #12460: URL: https://github.com/apache/lucene/pull/12460#issuecomment-1735063044 I have been thinking a bit longer about this and I think this approach of `DataInput` is not right. Instead we should try to return an API more similar to `RandomAccessInput` as Uwe sugg

[GitHub] [lucene] gf2121 opened a new pull request, #12591: Sort update terms with stable radix sorter

2023-09-26 Thread via GitHub
gf2121 opened a new pull request, #12591: URL: https://github.com/apache/lucene/pull/12591 Inspired by #91, this PR proposes using a stable radix sorter to sort update terms instead of a comparator that breaks ties by doc ID. As terms are appended in order, the latest update of each term value should
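
The property this proposal relies on can be sketched as follows: with a stable sort keyed on the term alone, entries with equal terms keep their append order, so the last entry in each run of equal terms is the most recent one. This is only an illustration under stated assumptions. It uses the JDK's `List.sort` (documented as stable) rather than a radix sorter, and the `Update` record and its `seq` field are hypothetical stand-ins, not Lucene classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class StableSortSketch {
  // Hypothetical stand-in for a buffered update: a term value plus its append position.
  record Update(String term, int seq) {}

  // Sorts by term only. List.sort is stable, so updates with equal terms
  // stay in the order they were appended; no tie-breaking comparator is needed.
  static List<Update> sortByTerm(List<Update> appended) {
    List<Update> sorted = new ArrayList<>(appended);
    sorted.sort(Comparator.comparing(Update::term));
    return sorted;
  }

  public static void main(String[] args) {
    List<Update> appended =
        List.of(new Update("b", 0), new Update("a", 1), new Update("b", 2), new Update("a", 3));
    // Within each term, seq values remain ascending after the sort.
    System.out.println(sortByTerm(appended));
  }
}
```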

[GitHub] [lucene] iverase opened a new issue, #12592: Add length method to RandomAccessInput

2023-09-26 Thread via GitHub
iverase opened a new issue, #12592: URL: https://github.com/apache/lucene/issues/12592 It is currently not possible to read all bytes from a RandomAccessInput without previous knowledge of how many bytes were written. I would like to propose that RandomAccessInput can provide the user the i
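
To illustrate why a length accessor matters here, the sketch below reads every byte from a random-access source without being told the size in advance. The `RandomAccessBytes` interface, its `length()` method, and the `wrap` helper are hypothetical names introduced for this example; they are not Lucene's actual API and only assume the shape the issue describes.

```java
public class ReadAllSketch {
  // Hypothetical minimal random-access view; NOT Lucene's RandomAccessInput.
  interface RandomAccessBytes {
    byte readByte(long pos);

    long length(); // the kind of accessor the issue proposes
  }

  // With length() available, the caller needs no prior knowledge of how many bytes were written.
  static byte[] readAll(RandomAccessBytes in) {
    byte[] out = new byte[Math.toIntExact(in.length())];
    for (int i = 0; i < out.length; i++) {
      out[i] = in.readByte(i);
    }
    return out;
  }

  // Hypothetical helper: exposes an in-memory array through the interface above.
  static RandomAccessBytes wrap(byte[] data) {
    return new RandomAccessBytes() {
      @Override
      public byte readByte(long pos) {
        return data[Math.toIntExact(pos)];
      }

      @Override
      public long length() {
        return data.length;
      }
    };
  }
}
```

Without `length()`, `readAll` would need the byte count passed in separately, which is the limitation the issue points out.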

[GitHub] [lucene] jpountz opened a new pull request, #12593: Compute better windows in MaxScoreBulkScorer.

2023-09-26 Thread via GitHub
jpountz opened a new pull request, #12593: URL: https://github.com/apache/lucene/pull/12593 MaxScoreBulkScorer computes windows based on the set of clauses that were essential in the *previous* window. This usually works well as the set of essential clauses tends to be stable over time, but

[GitHub] [lucene] jpountz commented on pull request #12593: Compute better windows in MaxScoreBulkScorer.

2023-09-26 Thread via GitHub
jpountz commented on PR #12593: URL: https://github.com/apache/lucene/pull/12593#issuecomment-1735586604 No impact on wikibigall, which is expected as all queries have sets of essential/non-essential clauses which are stable over time. ``` TaskQPS baseli

[GitHub] [lucene] jpountz commented on pull request #12564: Window-at-a-time scoring for conjunctions.

2023-09-26 Thread via GitHub
jpountz commented on PR #12564: URL: https://github.com/apache/lucene/pull/12564#issuecomment-1735610903 Closing: benchmarks on wikibigall don't suggest it helps more than #12382. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [lucene] jpountz closed pull request #12564: Window-at-a-time scoring for conjunctions.

2023-09-26 Thread via GitHub
jpountz closed pull request #12564: Window-at-a-time scoring for conjunctions. URL: https://github.com/apache/lucene/pull/12564

[GitHub] [lucene] s1monw opened a new pull request, #12595: Make IndexWriter#flushNextBuffer also apply deletes if necessary

2023-09-26 Thread via GitHub
s1monw opened a new pull request, #12595: URL: https://github.com/apache/lucene/pull/12595 `IndexWriter#flushNextBuffer()` is a convenient way to control indexing buffer sizes across multiple index writers. This change also flushes deletes if necessary when `#flushNextBuffer()` is called ev

[GitHub] [lucene] heemin32 opened a new issue, #12596: surpriseMePolygon and createRegularPolygon in test util class returns invalid polygon

2023-09-26 Thread via GitHub
heemin32 opened a new issue, #12596: URL: https://github.com/apache/lucene/issues/12596 ### Description https://github.com/apache/lucene/blob/main/lucene/test-framework/src/java/org/apache/lucene/tests/geo/ShapeTestUtil.java surpriseMePolygon and createRegularPolygon return in

[GitHub] [lucene] jpountz commented on pull request #12595: Make IndexWriter#flushNextBuffer also apply deletes if necessary

2023-09-26 Thread via GitHub
jpountz commented on PR #12595: URL: https://github.com/apache/lucene/pull/12595#issuecomment-1736121993 Thinking out loud: it looks like your change always flushes both the largest pending writer and deletes. I wonder if we should try to make it more granular and e.g. check whichever of de

[GitHub] [lucene] sgup432 opened a new issue, #12597: Make IndexReader.CacheKey serializable

2023-09-26 Thread via GitHub
sgup432 opened a new issue, #12597: URL: https://github.com/apache/lucene/issues/12597 ### Description As of now, the Lucene LRU query cache and other OpenSearch caches use CacheKey as the primary key for their entries, as it helps detect changes caused by segment merges etc. We use

[GitHub] [lucene] shubhamvishu commented on issue #12394: Add the ability to compute vector similarity scores with the new ValuesSource API

2023-09-26 Thread via GitHub
shubhamvishu commented on issue #12394: URL: https://github.com/apache/lucene/issues/12394#issuecomment-1736155862 @jpountz I have raised a PR #12548 that adds the required APIs to DVS for computing vector similarity scores. Thanks!

[GitHub] [lucene] iverase commented on pull request #12594: Add length method to RandomAccessInput

2023-09-26 Thread via GitHub
iverase commented on PR #12594: URL: https://github.com/apache/lucene/pull/12594#issuecomment-1736205488 Thanks @jpountz! What is not clear to me is whether adding a method to this interface is considered a breaking change that can only be introduced in a major release, or whether it can be back

[GitHub] [lucene] gsmiller commented on issue #12558: IntTaxonomyFacets chooses dense values array when FacetsCollector has no MatchingDocs

2023-09-26 Thread via GitHub
gsmiller commented on issue #12558: URL: https://github.com/apache/lucene/issues/12558#issuecomment-1736274592 Oh interesting. Thanks @Shradha26! Is drill-sideways the only means for reproducing this (that we're aware of anyway)? That's tricky. Do we think it's doing the right thing, or sho

[GitHub] [lucene] gf2121 commented on a diff in pull request #12573: Use radix sort to speed up the sorting of deleted terms

2023-09-26 Thread via GitHub
gf2121 commented on code in PR #12573: URL: https://github.com/apache/lucene/pull/12573#discussion_r1338004334 ## lucene/core/src/java/org/apache/lucene/index/BufferedUpdates.java: ## @@ -139,15 +131,11 @@ public void addTerm(Term term, int docIDUpto) { return; } -