Re: [PR] Remove IndexSearcher#search(List, Weight, Collector) [lucene]

2024-09-13 Thread via GitHub
javanna merged PR #13780: URL: https://github.com/apache/lucene/pull/13780 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apa

Re: [PR] Use Arrays.compareUnsigned in IDVersionSegmentTermsEnum and OrdsSegmentTermsEnum. [lucene]

2024-09-13 Thread via GitHub
uschindler commented on PR #13782: URL: https://github.com/apache/lucene/pull/13782#issuecomment-2348202070 Backport or not?

Re: [PR] Use Arrays.compareUnsigned in IDVersionSegmentTermsEnum and OrdsSegmentTermsEnum. [lucene]

2024-09-13 Thread via GitHub
jpountz commented on PR #13782: URL: https://github.com/apache/lucene/pull/13782#issuecomment-2348207344 I would backport as it looks pretty safe, though I doubt we have anyone using this postings format, so it likely doesn't matter much in practice.

Re: [PR] Use Arrays.compareUnsigned in IDVersionSegmentTermsEnum and OrdsSegmentTermsEnum. [lucene]

2024-09-13 Thread via GitHub
vsop-479 commented on PR #13782: URL: https://github.com/apache/lucene/pull/13782#issuecomment-2348218644 > can you merge main back into your branch? Merged. > can we find other occurrences using some regex searches? I searched code with `final int targetUptoMid = targetU

Re: [PR] Use Arrays.compareUnsigned in IDVersionSegmentTermsEnum and OrdsSegmentTermsEnum. [lucene]

2024-09-13 Thread via GitHub
uschindler commented on PR #13782: URL: https://github.com/apache/lucene/pull/13782#issuecomment-2348293165 > I would backport as it looks pretty safe, though I doubt we have anyone using this postings format, so it likely doesn't matter much in practice. Then we should move the change

Re: [PR] similarities: provide default computeNorm implementation; remove remaining discountOverlaps setters; [lucene]

2024-09-13 Thread via GitHub
cpoerschke merged PR #13757: URL: https://github.com/apache/lucene/pull/13757

Re: [PR] Use Arrays.compareUnsigned in IDVersionSegmentTermsEnum and OrdsSegmentTermsEnum. [lucene]

2024-09-13 Thread via GitHub
vsop-479 commented on PR #13782: URL: https://github.com/apache/lucene/pull/13782#issuecomment-2348387100 > Then we should move the changes entry. Do we have a corresponding 9.x section already? We already have a similar change entry for https://github.com/apache/lucene/pull/13252 und

Re: [PR] Add unit-of-least-precision float comparison [lucene]

2024-09-13 Thread via GitHub
mikemccand commented on code in PR #13723: URL: https://github.com/apache/lucene/pull/13723#discussion_r1758531891 ## lucene/test-framework/src/test/org/apache/lucene/tests/util/TestFloatingPointComparisons.java: ## @@ -0,0 +1,35 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Add unit-of-least-precision float comparison [lucene]

2024-09-13 Thread via GitHub
mikemccand commented on PR #13723: URL: https://github.com/apache/lucene/pull/13723#issuecomment-2348502055 > > I don't like the last commit because it changes from an assert-like method to a boolean-returning method. > > I changed it away from an assertion because I liked this more. I

[I] The "PatternCaptureGroupTokenFilter" generates identical offsets, which causes issues with highlighting the string. [lucene]

2024-09-13 Thread via GitHub
shikhasharma3708 opened a new issue, #13783: URL: https://github.com/apache/lucene/issues/13783 ### Description I am implementing the PatternCaptureGroupTokenFilter in my code to generate tokens based on multiple regular expressions, with the goal of highlighting any matches found wi

[PR] Change docValuesSkipIndex from a boolean to an enum. [lucene]

2024-09-13 Thread via GitHub
jpountz opened a new pull request, #13784: URL: https://github.com/apache/lucene/pull/13784 At the moment, our skip indexes record min/max ordinal/value per range of doc IDs. It would be natural to extend it to other pre-aggregated data such as a sum and value count, which facets could take
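The motivation in #13784 (a boolean cannot describe more than one kind of skip index) can be illustrated with a minimal sketch. All names below (`SkipIndexKind`, `FieldSketch`) are hypothetical and chosen for illustration only; they are not taken from the PR and should not be read as Lucene's API.

```java
import java.util.Objects;

/** Hypothetical enum standing in for a former boolean flag. */
enum SkipIndexKind {
  /** No skip index is recorded for the field. */
  NONE,
  /** Min/max ordinal or value is recorded per range of doc IDs. */
  RANGE
  // Further constants (for example, one that also records a sum and a
  // value count) could be added later without changing method signatures.
}

/** Hypothetical holder showing how callers would consume the enum. */
final class FieldSketch {
  private final SkipIndexKind skipIndexKind;

  FieldSketch(SkipIndexKind skipIndexKind) {
    this.skipIndexKind = Objects.requireNonNull(skipIndexKind);
  }

  SkipIndexKind skipIndexKind() {
    return skipIndexKind;
  }

  /** Equivalent of the old boolean view, derived from the enum. */
  boolean hasSkipIndex() {
    return skipIndexKind != SkipIndexKind.NONE;
  }

  public static void main(String[] args) {
    System.out.println(new FieldSketch(SkipIndexKind.RANGE).hasSkipIndex());
    System.out.println(new FieldSketch(SkipIndexKind.NONE).hasSkipIndex());
  }
}
```

The point of the sketch is that an enum leaves room for additional pre-aggregated data, whereas a boolean would need a second flag for every new kind.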

Re: [PR] First-class random access API for KnnVectorValues [lucene]

2024-09-13 Thread via GitHub
jpountz commented on PR #13779: URL: https://github.com/apache/lucene/pull/13779#issuecomment-2348633015 > think you had said 9/22 would be a feature freeze date I was thinking of doing it next week, but we can backport this PR even though the branch has been cut if it looks ready/sa

Re: [PR] First-class random access API for KnnVectorValues [lucene]

2024-09-13 Thread via GitHub
jpountz commented on code in PR #13779: URL: https://github.com/apache/lucene/pull/13779#discussion_r1758641965 ## lucene/core/src/java/org/apache/lucene/index/KnnVectorValues.java: ## @@ -0,0 +1,281 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more +

Re: [PR] Add unit-of-least-precision float comparison [lucene]

2024-09-13 Thread via GitHub
uschindler commented on PR #13723: URL: https://github.com/apache/lucene/pull/13723#issuecomment-2348681854 > > > I don't like the last commit because it changes from an assert-like method to a boolean-returning method. > > > > > > I changed it away from an assertion because I like

Re: [PR] Use Arrays.compareUnsigned in IDVersionSegmentTermsEnum and OrdsSegmentTermsEnum. [lucene]

2024-09-13 Thread via GitHub
uschindler commented on PR #13782: URL: https://github.com/apache/lucene/pull/13782#issuecomment-2348768324 > > Then we should move the changes entry. Do we have a corresponding 9.x section already? > > We already have a similar change entry for #13252 under 9.11.0, so we should move

Re: [PR] Use Arrays.compareUnsigned in IDVersionSegmentTermsEnum and OrdsSegmentTermsEnum. [lucene]

2024-09-13 Thread via GitHub
vsop-479 commented on PR #13782: URL: https://github.com/apache/lucene/pull/13782#issuecomment-2348774168 Ok, I will do it.

Re: [PR] Add unit-of-least-precision float comparison [lucene]

2024-09-13 Thread via GitHub
stefanvodita commented on PR #13723: URL: https://github.com/apache/lucene/pull/13723#issuecomment-2349071396 I've moved the methods around and, as I was writing more tests, realised I'm not going to be as comprehensive as the original tests, so I adapted those instead.

Re: [PR] Replace Map with IntObjectHashMap for KnnVectorsReader [lucene]

2024-09-13 Thread via GitHub
bugmakerr commented on PR #13763: URL: https://github.com/apache/lucene/pull/13763#issuecomment-2349265068 @benwtrent @jpountz I have merged the main branch into this one. Can I get a review?

[PR] Fix Flaky Test In TestBlockJoinBulkScorer [lucene]

2024-09-13 Thread via GitHub
Mikep86 opened a new pull request, #13785: URL: https://github.com/apache/lucene/pull/13785 Fix the `testSetMinCompetitiveScoreWithScoreModeMax` test, which sometimes failed due to randomizations in how the docs were scored.

Re: [PR] First-class random access API for KnnVectorValues [lucene]

2024-09-13 Thread via GitHub
ChrisHegarty commented on code in PR #13779: URL: https://github.com/apache/lucene/pull/13779#discussion_r1759099771 ## lucene/core/src/java/org/apache/lucene/codecs/lucene95/HasIndexSlice.java: ## @@ -14,23 +14,16 @@ * See the License for the specific language governing permi

Re: [PR] Use range optimizations for "slow" MultiTermQueries when terms happen to be contiguous [lucene]

2024-09-13 Thread via GitHub
iverase commented on PR #13693: URL: https://github.com/apache/lucene/pull/13693#issuecomment-2349297359 I have my doubts about this approach and I am unsure we should expose the scorer that way. On the other hand, I wonder if we can rewrite the query to a range query. I understand that in thi

Re: [PR] Add unit-of-least-precision float comparison [lucene]

2024-09-13 Thread via GitHub
uschindler commented on code in PR #13723: URL: https://github.com/apache/lucene/pull/13723#discussion_r1759153816 ## lucene/CHANGES.txt: ## @@ -422,7 +422,9 @@ Build Other -(No changes) + +* GITHUB#13720: Add float comparison based on unit of least prec

Re: [PR] Add unit-of-least-precision float comparison [lucene]

2024-09-13 Thread via GitHub
uschindler commented on code in PR #13723: URL: https://github.com/apache/lucene/pull/13723#discussion_r1759155851 ## lucene/test-framework/src/java/org/apache/lucene/tests/util/LuceneTestCase.java: ## @@ -864,6 +864,14 @@ public static void assumeNoException(String msg, Excepti

Re: [I] Can we remove `compress` option for quantized KNN vector indexing? [lucene]

2024-09-13 Thread via GitHub
mikemccand commented on issue #13768: URL: https://github.com/apache/lucene/issues/13768#issuecomment-2349362383 >> But I don't think we should block removing compress option due to non-SIMD results? Actually, thinking about this more ... I'm changing my mind. I don't fully understa

Re: [PR] New JMH benchmark method - vdot8s that implement int8 dotProduct in C… [lucene]

2024-09-13 Thread via GitHub
ChrisHegarty commented on PR #13572: URL: https://github.com/apache/lucene/pull/13572#issuecomment-2349442717 > > Anyways: At the moment we do not want to have native code in Lucene Core. > .. > Having the likes of OpenSearch, Elasticsearch, and Solr implement their own (high performance)

Re: [PR] Add unit-of-least-precision float comparison [lucene]

2024-09-13 Thread via GitHub
stefanvodita commented on code in PR #13723: URL: https://github.com/apache/lucene/pull/13723#discussion_r1759505470 ## lucene/CHANGES.txt: ## @@ -422,7 +422,9 @@ Build Other -(No changes) + +* GITHUB#13720: Add float comparison based on unit of least pr
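For readers unfamiliar with the technique named in the CHANGES entry above, here is a minimal sketch of a unit-of-least-precision (ULP) comparison for floats. The class and method names (`UlpCompareSketch`, `orderedBits`, `equalWithinUlps`) are hypothetical and are not the methods added by the PR; the sketch only assumes the standard `Float.floatToIntBits` and `Math.nextUp` JDK calls.

```java
final class UlpCompareSketch {

  /**
   * Maps a float to an int such that the int ordering matches the float
   * ordering and adjacent floats map to adjacent ints. Negative floats have
   * the sign bit set, so they are mirrored below zero; -0.0f and 0.0f both
   * map to 0.
   */
  static int orderedBits(float f) {
    int bits = Float.floatToIntBits(f);
    return bits < 0 ? Integer.MIN_VALUE - bits : bits;
  }

  /** Returns true if a and b are at most maxUlps representable floats apart. */
  static boolean equalWithinUlps(float a, float b, int maxUlps) {
    if (Float.isNaN(a) || Float.isNaN(b)) {
      return false; // NaN is never considered close to anything
    }
    // Widen to long so the subtraction cannot overflow.
    long distance = Math.abs((long) orderedBits(a) - (long) orderedBits(b));
    return distance <= maxUlps;
  }

  public static void main(String[] args) {
    float one = 1.0f;
    float next = Math.nextUp(one);
    System.out.println(equalWithinUlps(one, next, 1));
    System.out.println(equalWithinUlps(one, Math.nextUp(Math.nextUp(next)), 2));
  }
}
```

A ULP bound scales with the magnitude of the operands, which is why it is often preferred over a fixed absolute epsilon when the compared values span several orders of magnitude.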

Re: [PR] Use Arrays.compareUnsigned in IDVersionSegmentTermsEnum and OrdsSegmentTermsEnum. [lucene]

2024-09-13 Thread via GitHub
vsop-479 commented on PR #13782: URL: https://github.com/apache/lucene/pull/13782#issuecomment-2350779495 > let's merge them and list both PRs. I merged them into one entry.

[PR] Remove recurse into sub block when scan leaf block in IDVersionSegmentTermsEnumFrame. [lucene]

2024-09-13 Thread via GitHub
vsop-479 opened a new pull request, #13786: URL: https://github.com/apache/lucene/pull/13786 ### Description

Re: [PR] Remove usage of IndexSearcher#search(Query, Collector) from join package [lucene]

2024-09-13 Thread via GitHub
msfroh commented on code in PR #13747: URL: https://github.com/apache/lucene/pull/13747#discussion_r1759648681 ## lucene/join/src/java/org/apache/lucene/search/join/MergeableCollector.java: ## @@ -0,0 +1,26 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or

Re: [PR] Use Arrays.compareUnsigned in IDVersionSegmentTermsEnum and OrdsSegmentTermsEnum. [lucene]

2024-09-13 Thread via GitHub
vsop-479 commented on PR #13782: URL: https://github.com/apache/lucene/pull/13782#issuecomment-2350800984 > can we find other occurrences using some regex searches? Hmm, I also found some loop-based suffix comparisons in `IDVersionSegmentTermsEnumFrame` and `OrdsSegmentTermsEnumFrame`, simil

[PR] Fix comment on compare suffix and target. [lucene]

2024-09-13 Thread via GitHub
vsop-479 opened a new pull request, #13787: URL: https://github.com/apache/lucene/pull/13787 ### Description We no longer loop over the suffix bytes by hand to compare them to the target.
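The replacement of a hand-written byte loop with `Arrays.compareUnsigned`, discussed in the #13782 thread and referenced here, can be sketched as below. The class name, method names, and sample arrays are hypothetical; only `java.util.Arrays.compareUnsigned(byte[], int, int, byte[], int, int)` (available since Java 9) is a real JDK call, and the loop is a generic illustration, not the code removed from Lucene.

```java
import java.util.Arrays;

final class UnsignedCompareSketch {

  /** Hand-written variant: compares two byte ranges as unsigned values. */
  static int compareLoop(byte[] a, int aFrom, int aTo, byte[] b, int bFrom, int bTo) {
    int aLen = aTo - aFrom;
    int bLen = bTo - bFrom;
    int limit = Math.min(aLen, bLen);
    for (int i = 0; i < limit; i++) {
      // Mask to 0..255 so that 0x80 sorts after 0x7F.
      int cmp = (a[aFrom + i] & 0xFF) - (b[bFrom + i] & 0xFF);
      if (cmp != 0) {
        return cmp;
      }
    }
    // All shared bytes are equal: the shorter range sorts first.
    return aLen - bLen;
  }

  /** JDK variant: same ordering, delegated to the library method. */
  static int compareJdk(byte[] a, int aFrom, int aTo, byte[] b, int bFrom, int bTo) {
    return Arrays.compareUnsigned(a, aFrom, aTo, b, bFrom, bTo);
  }

  public static void main(String[] args) {
    byte[] suffix = {0x01, (byte) 0x80};
    byte[] target = {0x01, 0x7F};
    int loop = compareLoop(suffix, 0, suffix.length, target, 0, target.length);
    int jdk = compareJdk(suffix, 0, suffix.length, target, 0, target.length);
    System.out.println(Integer.signum(loop) == Integer.signum(jdk));
  }
}
```

Only the sign of the result should be relied on when comparing the two variants; callers that treat the value as negative, zero, or positive see the same ordering from both.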

Re: [PR] Remove usage of IndexSearcher#search(Query, Collector) from join package [lucene]

2024-09-13 Thread via GitHub
msfroh commented on code in PR #13747: URL: https://github.com/apache/lucene/pull/13747#discussion_r1759668285 ## lucene/join/src/java/org/apache/lucene/search/join/MergeableCollector.java: ## @@ -0,0 +1,26 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or

[PR] Reduce number of calculations in FSTCompiler [lucene]

2024-09-13 Thread via GitHub
mrhbj opened a new pull request, #13788: URL: https://github.com/apache/lucene/pull/13788 ### Description