Re: [PR] Removed Scorer#getWeight [lucene]

2024-05-30 Thread via GitHub
iamsanjay commented on PR #13440: URL: https://github.com/apache/lucene/pull/13440#issuecomment-2141198364 @jpountz We are also removing weight from subclasses of Scorer, right? I have already removed it from a good number of classes, such as ConstantScoreScorer and a few others. Just wa

Re: [PR] Remove ByteBufferIndexInput and update all Panama implementations (MMap and Vector) to Java 21 [lucene]

2024-05-30 Thread via GitHub
uschindler commented on PR #13146: URL: https://github.com/apache/lucene/pull/13146#issuecomment-2140958658 See https://openjdk.org/jeps/471 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the sp

Re: [PR] Remove ByteBufferIndexInput and update all Panama implementations (MMap and Vector) to Java 21 [lucene]

2024-05-30 Thread via GitHub
uschindler commented on PR #13146: URL: https://github.com/apache/lucene/pull/13146#issuecomment-2140957329 > Thanks Uwe! I suppose I blame the JDK then :-) I question how bad the venerable ByteBufferIndexInput was to warrant removing it over the potential improvement value of better memory

Re: [PR] Remove ByteBufferIndexInput and update all Panama implementations (MMap and Vector) to Java 21 [lucene]

2024-05-30 Thread via GitHub
dsmiley commented on PR #13146: URL: https://github.com/apache/lucene/pull/13146#issuecomment-2140930526 Thanks Uwe! I suppose I blame the JDK then :-) I question how bad the venerable ByteBufferIndexInput was to warrant removing it over the potential improvement value of better memory m

[I] Instrument IndexOrDocValuesQuery to report on its decisions [lucene]

2024-05-30 Thread via GitHub
stefanvodita opened a new issue, #13442: URL: https://github.com/apache/lucene/issues/13442 ### Description For Amazon Product Search, we use `IndexOrDocValuesQuery` and have changed it to take a listener type object that records and reports on the decision the query has made. We loo

Re: [PR] Remove ByteBufferIndexInput and update all Panama implementations (MMap and Vector) to Java 21 [lucene]

2024-05-30 Thread via GitHub
uschindler commented on PR #13146: URL: https://github.com/apache/lucene/pull/13146#issuecomment-2140883305 See also discussion here: https://github.com/dacapobench/dacapobench/issues/264#issuecomment-2083056841

Re: [PR] Remove ByteBufferIndexInput and update all Panama implementations (MMap and Vector) to Java 21 [lucene]

2024-05-30 Thread via GitHub
uschindler commented on PR #13146: URL: https://github.com/apache/lucene/pull/13146#issuecomment-2140873794 Changes entry is here: https://github.com/apache/lucene/blob/750a7c4d3b3e174023404bf363861dae31413901/lucene/CHANGES.txt#L87 It is Lucene only.

Re: [PR] Remove ByteBufferIndexInput and update all Panama implementations (MMap and Vector) to Java 21 [lucene]

2024-05-30 Thread via GitHub
uschindler commented on PR #13146: URL: https://github.com/apache/lucene/pull/13146#issuecomment-2140869268 Do you have highly concurrent close of index files? Due to the new features regarding safe close, the close is more expensive (especially for other threads concurrently accessing inde

Re: [PR] Add new test case "testGetLines" for lucene/core/analysis/WordlistLoader [lucene]

2024-05-30 Thread via GitHub
hack4chang commented on code in PR #13419: URL: https://github.com/apache/lucene/pull/13419#discussion_r1621365979 ## lucene/core/src/test/org/apache/lucene/analysis/TestWordlistLoader.java: ## @@ -77,4 +82,17 @@ public void testSnowballListLoading() throws IOException { as

Re: [I] What does the Lucene community think about dimensionality reduction for vectors, and should it be something the library does internally (at merge time perhaps)? [lucene]

2024-05-30 Thread via GitHub
mikemccand commented on issue #13403: URL: https://github.com/apache/lucene/issues/13403#issuecomment-2140339821 Maybe we could avoid quantization for small segments, and only once a merged segment is big enough we trigger the code book training?

Re: [PR] Sparse index [lucene]

2024-05-30 Thread via GitHub
jpountz closed pull request #13441: Sparse index URL: https://github.com/apache/lucene/pull/13441

[PR] Sparse index [lucene]

2024-05-30 Thread via GitHub
jpountz opened a new pull request, #13441: URL: https://github.com/apache/lucene/pull/13441 ### Description

Re: [PR] Sparse index [lucene]

2024-05-30 Thread via GitHub
jpountz commented on PR #13441: URL: https://github.com/apache/lucene/pull/13441#issuecomment-2140264809 Whoops, sorry, I clicked the wrong button in the UI. This is not ready yet. :)

Re: [PR] Removed Scorer#getWeight [lucene]

2024-05-30 Thread via GitHub
iamsanjay commented on PR #13440: URL: https://github.com/apache/lucene/pull/13440#issuecomment-2139855092 > Could you also look into removing `Weight` from the `Scorer` constructor? To me this is the most annoying bit. Yup, I am on it! 49 usages in total, per a quick IntelliJ search.

Re: [PR] Removed Scorer#getWeight [lucene]

2024-05-30 Thread via GitHub
jpountz commented on PR #13440: URL: https://github.com/apache/lucene/pull/13440#issuecomment-2139781784 Could you also look into removing `Weight` from the `Scorer` constructor? To me this is the most annoying bit. Regarding TestSubScorerFreqs, does using `ScorerIndexSearcher` rather

Re: [PR] Removed Scorer#getWeight [lucene]

2024-05-30 Thread via GitHub
iamsanjay commented on PR #13440: URL: https://github.com/apache/lucene/pull/13440#issuecomment-2139685645 I have run into one issue. To get a reference to the `Weight` instance in a `Collector`, one can override the method below. ``` default void setWeight(Weight weight) {} ```
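The override mentioned in this comment can be sketched roughly as follows. This is a minimal, dependency-free illustration, not code from the PR: the nested `Weight` and `Collector` types are hypothetical stand-ins for the Lucene types of the same name (only the `setWeight` default method is taken from the comment above), and `TrackingCollector` is an invented name.

```java
final class WeightTrackingSketch {
  /** Hypothetical stand-in for org.apache.lucene.search.Weight. */
  interface Weight {}

  /** Hypothetical stand-in for the Collector hook quoted in the comment. */
  interface Collector {
    default void setWeight(Weight weight) {}
  }

  /** Remembers the Weight it was handed instead of asking a Scorer for it. */
  static final class TrackingCollector implements Collector {
    private Weight weight; // null until setWeight is called

    @Override
    public void setWeight(Weight weight) {
      this.weight = weight;
    }

    Weight weight() {
      return weight;
    }
  }

  public static void main(String[] args) {
    TrackingCollector collector = new TrackingCollector();
    Weight w = new Weight() {};
    collector.setWeight(w); // in Lucene this call would come from the search loop
    System.out.println(collector.weight() == w);
  }
}
```

The point of the sketch is only that the collector holds its own reference, so nothing downstream needs a `getWeight` accessor on the scorer.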

[PR] Removed Scorer#getWeight [lucene]

2024-05-30 Thread via GitHub
iamsanjay opened a new pull request, #13440: URL: https://github.com/apache/lucene/pull/13440 ### Description Closes #13410 If a caller requires the `Weight`, they have to keep track of the `Weight` with which the `Scorer` was created in the first place instead of relying on Scorer

Re: [PR] Fix test failure on TestPoint#testEqualsAndHashCode [lucene]

2024-05-30 Thread via GitHub
easyice commented on PR #13433: URL: https://github.com/apache/lucene/pull/13433#issuecomment-2139548520 Maybe we don't need to fix the other tests right now, unless they really encounter hash collisions.

[PR] Avoid unnecessary memory allocation in PackedLongValues#Iterator [lucene]

2024-05-30 Thread via GitHub
easyice opened a new pull request, #13439: URL: https://github.com/apache/lucene/pull/13439 We always allocate a long array of page size for a new `PackedLongValues#Iterator` instance, which is not necessary when packing a small number of values. This is more evident in the scenario of high

Re: [PR] Rewrite newSlowRangeQuery to MatchNoDocsQuery when upper > lower [lucene]

2024-05-30 Thread via GitHub
jpountz commented on PR #13425: URL: https://github.com/apache/lucene/pull/13425#issuecomment-2139510268 @ioanatia Would you mind bumping this change to 9.12 since the 9.11 branch has been bumped in the meantime? Sorry for the inconvenience. The change looks good to me, I'll merge once this

Re: [PR] Rewrite newSlowRangeQuery to MatchNoDocsQuery when upper > lower [lucene]

2024-05-30 Thread via GitHub
ioanatia commented on PR #13425: URL: https://github.com/apache/lucene/pull/13425#issuecomment-2139507131 thank you @jpountz I added the changelog so this is ready for another review

Re: [I] Lexical error using lucene 9 but lucene 2 works [lucene]

2024-05-30 Thread via GitHub
mkhludnev closed issue #13437: Lexical error using lucene 9 but lucene 2 works URL: https://github.com/apache/lucene/issues/13437

Re: [I] Lexical error using lucene 9 but lucene 2 works [lucene]

2024-05-30 Thread via GitHub
mkhludnev commented on issue #13437: URL: https://github.com/apache/lucene/issues/13437#issuecomment-2139466355 Slash is a special character (I barely know why) that should be escaped with a backslash https://lucene.apache.org/core/9_6_0/queryparser/org/apache/lucene/queryparser/classic/pac
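As a rough illustration of the escaping suggested here, the sketch below prefixes each forward slash in a user-supplied query string with a backslash before it would be handed to a query parser. It is an assumption-laden example, not the reporter's code: `SlashEscapeSketch` and `escapeSlashes` are hypothetical names, only the slash is handled, and input that is already escaped would be escaped a second time.

```java
final class SlashEscapeSketch {
  /** Prefixes every '/' with a backslash; all other characters are copied as-is. */
  static String escapeSlashes(String userInput) {
    StringBuilder sb = new StringBuilder(userInput.length() + 4);
    for (int i = 0; i < userInput.length(); i++) {
      char c = userInput.charAt(i);
      if (c == '/') {
        sb.append('\\'); // a single backslash character in the output
      }
      sb.append(c);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // prints: path:\/var\/log
    System.out.println(escapeSlashes("path:/var/log"));
  }
}
```

The classic query parser also ships its own static escape helper that covers the other special characters; the hand-rolled version is used here only to keep the example free of dependencies.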

Re: [I] Improve Lucene's I/O concurrency [lucene]

2024-05-30 Thread via GitHub
jpountz commented on issue #13179: URL: https://github.com/apache/lucene/issues/13179#issuecomment-2139074516 > If I understand correctly, the read ahead mechanism in IndexInput will be useful if matching docs fall within the read ahead size. Otherwise those will be wasted pages cached or d

Re: [PR] WIP - Add minimum number of segments to TieredMergePolicy [lucene]

2024-05-30 Thread via GitHub
jpountz commented on code in PR #13430: URL: https://github.com/apache/lucene/pull/13430#discussion_r1619456981 ## lucene/core/src/java/org/apache/lucene/index/TieredMergePolicy.java: ## @@ -93,6 +93,7 @@ public class TieredMergePolicy extends MergePolicy { private double seg

Re: [I] How to speedup concurrent merge [lucene]

2024-05-30 Thread via GitHub
hanqiushi closed issue #13432: How to speedup concurrent merge URL: https://github.com/apache/lucene/issues/13432

[PR] Compare to smallest and largest term before seeking target in SegmentTermsEnum.seekCeil. [lucene]

2024-05-30 Thread via GitHub
vsop-479 opened a new pull request, #13438: URL: https://github.com/apache/lucene/pull/13438 ### Description Similar to `SegmentTermsEnum.seekExact`, but we need to set `currentFrame` to `min/max` entry's block, otherwise we will fail when calling `SegmentTermsEnum.next`.
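The general idea in this PR's title, checking the target against the smallest and largest term before doing a full seek, can be sketched on a plain sorted array. This is a simplified illustration only: it does not model `SegmentTermsEnum`, block frames, or `currentFrame`; `SeekCeilSketch` is a hypothetical class; and it compares `String` values with `compareTo`, whereas Lucene compares terms as bytes.

```java
import java.util.Arrays;

final class SeekCeilSketch {
  private final String[] sortedTerms; // ascending order, no duplicates
  int fullSeeks = 0;                  // counts how often the slow path ran

  SeekCeilSketch(String... sortedTerms) {
    this.sortedTerms = sortedTerms.clone();
  }

  /** Returns the smallest term >= target, or null if every term is smaller. */
  String seekCeil(String target) {
    if (sortedTerms.length == 0) {
      return null;
    }
    String min = sortedTerms[0];
    String max = sortedTerms[sortedTerms.length - 1];
    if (target.compareTo(max) > 0) {
      return null; // fast path: target sorts after the largest term
    }
    if (target.compareTo(min) <= 0) {
      return min;  // fast path: target sorts at or before the smallest term
    }
    fullSeeks++;   // slow path: stands in for the real seek
    int idx = Arrays.binarySearch(sortedTerms, target);
    if (idx < 0) {
      idx = -idx - 1; // convert "not found" result to the insertion point
    }
    return sortedTerms[idx];
  }

  public static void main(String[] args) {
    SeekCeilSketch terms = new SeekCeilSketch("apple", "banana", "cherry");
    System.out.println(terms.seekCeil("b"));     // takes the slow path
    System.out.println(terms.seekCeil("zebra")); // answered by the max check
  }
}
```

In this toy version the fast paths need no extra bookkeeping; the PR description notes that the real enum must still position `currentFrame` so that a following `next` call works.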

Re: [I] Improve Lucene's I/O concurrency [lucene]

2024-05-30 Thread via GitHub
sohami commented on issue #13179: URL: https://github.com/apache/lucene/issues/13179#issuecomment-2138876791 > To avoid this per-doc overhead, I imagine that we would need to add some prefetch() API on (Numeric|SortedNumeric|Sorted|SortedSet|Binary)DocValues like @sohami suggests and requir