Re: [PR] Fix intermittently failing TestSortedSetFieldSource [lucene]

2023-11-30 Thread via GitHub
ChrisHegarty merged PR #12850: URL: https://github.com/apache/lucene/pull/12850 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucen

Re: [PR] [9.x] Fix intermittently failing TestSortedSetFieldSource [lucene]

2023-11-30 Thread via GitHub
ChrisHegarty merged PR #12851: URL: https://github.com/apache/lucene/pull/12851

Re: [PR] Removing TermInSetQuery array ctor [lucene]

2023-11-30 Thread via GitHub
slow-J commented on PR #12837: URL: https://github.com/apache/lucene/pull/12837#issuecomment-1833404097 > +1 to this simplification. Let's also get rid of `public TermInSetQuery(RewriteMethod rewriteMethod, String field, BytesRef... terms)`? Thanks Greg, I think I could have been more

Re: [PR] Initial impl of MMapDirectory for Java 22 [lucene]

2023-11-30 Thread via GitHub
uschindler commented on PR #12706: URL: https://github.com/apache/lucene/pull/12706#issuecomment-1833435917 After https://github.com/jdk/pull/16792 was fixed, I added the better isAlive check to the Java 21+ code.

[PR] Reuse BitSet when there are deleted documents in the index instead of creating new BitSet [lucene]

2023-11-30 Thread via GitHub
Pulkitg64 opened a new pull request, #12857: URL: https://github.com/apache/lucene/pull/12857 ### Description Fixes issue: #12414 Before this change, we created a new BitSet every time there were deletions in the index, using the matched docs and live docs. To create th

[I] TestPointQueries.testAllPointDocsWereDeletedAndThenMergedAgain reproducible test failure [lucene]

2023-11-30 Thread via GitHub
mikemccand opened a new issue, #12858: URL: https://github.com/apache/lucene/issues/12858 ### Description Happened in [this CI build](https://ci-builds.apache.org/job/Lucene/job/Lucene-Check-main/10750/) and also when I ran smoke tester for 9.9.0 RC 0. It repros for me: ```

Re: [PR] Fix bug in UnescapedCharSequence and add basic unit tests [lucene]

2023-11-30 Thread via GitHub
shubhamvishu commented on code in PR #12849: URL: https://github.com/apache/lucene/pull/12849#discussion_r1410534392 ## lucene/queryparser/src/java/org/apache/lucene/queryparser/flexible/core/util/UnescapedCharSequence.java: ## @@ -82,11 +71,11 @@ public String toString() {

Re: [PR] Avoid null PointValues when merging points in SlowCompositeCodecReaderWrapper [lucene]

2023-11-30 Thread via GitHub
ChrisHegarty commented on PR #12859: URL: https://github.com/apache/lucene/pull/12859#issuecomment-1833630547 Thanks @mikemccand. I've repro'ed this locally, by adding a `@Repeat(iterations = 1000)` to the test, then applied your patch. I see another variant of this, in `SortingPointValues

Re: [PR] Avoid null PointValues when merging points in SlowCompositeCodecReaderWrapper [lucene]

2023-11-30 Thread via GitHub
javanna commented on PR #12859: URL: https://github.com/apache/lucene/pull/12859#issuecomment-1833660889 This seems related to ghost fields, which were discussed and (partially) dealt with in #11393. The specific scenario triggered by this failure did not surface until now, I believe.

[PR] Add simple tool to diff entries in lucene's CHANGES.txt that should be identical [lucene]

2023-11-30 Thread via GitHub
mikemccand opened a new pull request, #12860: URL: https://github.com/apache/lucene/pull/12860 I had thought we had some tooling around this already but couldn't find it so I wrote it (again maybe!). It's a simple tool: you pass in two branches to compare. The branch can be a branch
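
The idea described above (compare the CHANGES.txt entries that two branches should share) can be sketched roughly as follows. This is only an illustrative sketch, not the tool from the PR: it assumes the two file contents have already been read into strings, and it assumes release sections are introduced by a header line of the form `=== Lucene X.Y.Z ===`. The helper names `extract_section` and `diff_sections` are hypothetical.

```python
import difflib
import re

# Assumed header shape: a run of '=', "Lucene <version>", another run of '='.
HEADER = re.compile(r"^=+\s*Lucene\s+([\w.]+)\s*=+\s*$")


def extract_section(changes_text, version):
    """Return the lines listed under the header for `version`,
    stopping at the next release header."""
    lines = []
    in_section = False
    for line in changes_text.splitlines():
        match = HEADER.match(line)
        if match:
            if in_section:
                break  # reached the next release section
            in_section = match.group(1) == version
            continue
        if in_section:
            lines.append(line.rstrip())
    return lines


def diff_sections(text_a, text_b, version, name_a="a", name_b="b"):
    """Unified diff of one release section as it appears in two
    CHANGES.txt contents; an empty list means the sections agree."""
    section_a = extract_section(text_a, version)
    section_b = extract_section(text_b, version)
    return list(
        difflib.unified_diff(
            section_a, section_b, fromfile=name_a, tofile=name_b, lineterm=""
        )
    )
```

With a sketch like this, a small discrepancy such as a missing colon after an issue number would show up as a paired `-`/`+` line in the diff for that release section.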

Re: [PR] Add simple tool to diff entries in lucene's CHANGES.txt that should be identical [lucene]

2023-11-30 Thread via GitHub
ChrisHegarty commented on PR #12860: URL: https://github.com/apache/lucene/pull/12860#issuecomment-1833701434 Thanks @mikemccand - this is great. > I think these are mostly minor and @ChrisHegarty you can copy the 9.9.x CHANGES.txt entry for Lucene 9.9.0 over onto `main` so they are i

Re: [PR] Add simple tool to diff entries in lucene's CHANGES.txt that should be identical [lucene]

2023-11-30 Thread via GitHub
mikemccand commented on PR #12860: URL: https://github.com/apache/lucene/pull/12860#issuecomment-1833719237 > I can see the missing colon in the changes html, but it doesn't seem to cause any issue. Great! Thanks for checking @ChrisHegarty.

Re: [PR] Avoid null PointValues when merging points in SlowCompositeCodecReaderWrapper [lucene]

2023-11-30 Thread via GitHub
ChrisHegarty commented on PR #12859: URL: https://github.com/apache/lucene/pull/12859#issuecomment-1833719259 Thanks @javanna. Given this, I went ahead and applied the null check in SortingCodecReader too, which is consistent with other usages of `getValues`.

Re: [PR] Reuse BitSet when there are deleted documents in the index instead of creating new BitSet [lucene]

2023-11-30 Thread via GitHub
shubhamvishu commented on code in PR #12857: URL: https://github.com/apache/lucene/pull/12857#discussion_r1410612335 ## lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java: ## @@ -118,13 +118,38 @@ private TopDocs getLeafResults(LeafReaderContext ctx, Weig

Re: [PR] Avoid null PointValues when merging points in SlowCompositeCodecReaderWrapper [lucene]

2023-11-30 Thread via GitHub
mikemccand commented on PR #12859: URL: https://github.com/apache/lucene/pull/12859#issuecomment-1833830715 Thanks all, I'll merge to 9.9.x (no need to respin) and port forward to 9.x and main.

Re: [PR] Avoid null PointValues when merging points in SlowCompositeCodecReaderWrapper [lucene]

2023-11-30 Thread via GitHub
mikemccand merged PR #12859: URL: https://github.com/apache/lucene/pull/12859

Re: [I] TestPointQueries.testAllPointDocsWereDeletedAndThenMergedAgain reproducible test failure [lucene]

2023-11-30 Thread via GitHub
mikemccand commented on issue #12858: URL: https://github.com/apache/lucene/issues/12858#issuecomment-1833842319 Fixed with 00de0aef6348ab816e9594781aedbb92d266573e on main (and also separately on 9.9 and 9.x).

Re: [I] TestPointQueries.testAllPointDocsWereDeletedAndThenMergedAgain reproducible test failure [lucene]

2023-11-30 Thread via GitHub
mikemccand closed issue #12858: TestPointQueries.testAllPointDocsWereDeletedAndThenMergedAgain reproducible test failure URL: https://github.com/apache/lucene/issues/12858

Re: [PR] Fix bug in UnescapedCharSequence and add basic unit tests [lucene]

2023-11-30 Thread via GitHub
slow-J commented on code in PR #12849: URL: https://github.com/apache/lucene/pull/12849#discussion_r1410738451 ## lucene/queryparser/src/java/org/apache/lucene/queryparser/flexible/core/util/UnescapedCharSequence.java: ## @@ -82,11 +71,11 @@ public String toString() { */

Re: [PR] Avoid null PointValues when merging points in SlowCompositeCodecReaderWrapper [lucene]

2023-11-30 Thread via GitHub
jpountz commented on code in PR #12859: URL: https://github.com/apache/lucene/pull/12859#discussion_r1410791257 ## lucene/core/src/java/org/apache/lucene/index/SlowCompositeCodecReaderWrapper.java: ## @@ -599,7 +599,11 @@ public PointValues getValues(String field) throws IOExce

Re: [PR] Reuse BitSet when there are deleted documents in the index instead of creating new BitSet [lucene]

2023-11-30 Thread via GitHub
Pulkitg64 commented on code in PR #12857: URL: https://github.com/apache/lucene/pull/12857#discussion_r1410805933 ## lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java: ## @@ -118,13 +118,38 @@ private TopDocs getLeafResults(LeafReaderContext ctx, Weight

Re: [PR] Fix bug in UnescapedCharSequence and add basic unit tests [lucene]

2023-11-30 Thread via GitHub
gsmiller commented on code in PR #12849: URL: https://github.com/apache/lucene/pull/12849#discussion_r1410847602 ## lucene/CHANGES.txt: ## @@ -138,6 +138,8 @@ Other * GITHUB#12239: Hunspell: reduced suggestion set dependency on the hash table order (Peter Gromov) +* GITHUB

Re: [PR] Fix bug in UnescapedCharSequence and add basic unit tests [lucene]

2023-11-30 Thread via GitHub
slow-J commented on code in PR #12849: URL: https://github.com/apache/lucene/pull/12849#discussion_r1410857455 ## lucene/CHANGES.txt: ## @@ -138,6 +138,8 @@ Other * GITHUB#12239: Hunspell: reduced suggestion set dependency on the hash table order (Peter Gromov) +* GITHUB#9

Re: [PR] Add static function in TaskExecutor to retrieve the results for a collection of Future [lucene]

2023-11-30 Thread via GitHub
shubhamvishu commented on PR #12798: URL: https://github.com/apache/lucene/pull/12798#issuecomment-1834080104 @javanna Hmm, I agree with you that the consumers don't really depend on TE to run their Futures and it's more like a generic utility function being exposed by TE (which is better to av

[I] Reproducible error: TestTopFieldCollector#testSort() [lucene]

2023-11-30 Thread via GitHub
slow-J opened a new issue, #12861: URL: https://github.com/apache/lucene/issues/12861 ### Description Repro: `./gradlew test --tests TestTopFieldCollector.testSort -Dtests.seed=FB342F799D565015 -Dtests.multiplier=3 -Dtests.locale=fo-Latn-FO -Dtests.timezone=Asia/Kuala_Lumpur -Dtes

Re: [I] Reproducible error: TestTopFieldCollector#testSort() [lucene]

2023-11-30 Thread via GitHub
slow-J commented on issue #12861: URL: https://github.com/apache/lucene/issues/12861#issuecomment-1834117502 My bad, premature issue, this was fixed already https://github.com/apache/lucene/commit/8703b541a5048545026f68ea77d0080ed2d8e7ef

Re: [I] Reproducible error: TestTopFieldCollector#testSort() [lucene]

2023-11-30 Thread via GitHub
slow-J closed issue #12861: Reproducible error: TestTopFieldCollector#testSort() URL: https://github.com/apache/lucene/issues/12861

Re: [PR] Removing TermInSetQuery varargs ctor [lucene]

2023-11-30 Thread via GitHub
gsmiller commented on PR #12837: URL: https://github.com/apache/lucene/pull/12837#issuecomment-1834309861 @slow-J oh sorry if I missed it. I didn't see it when I looked at the diff, but maybe I was looking at an old commit somehow. Dunno. Either way, it looks good to me! How would you like

Re: [I] UnescapedCharSequence Bugs [LUCENE-8001] [lucene]

2023-11-30 Thread via GitHub
gsmiller closed issue #9049: UnescapedCharSequence Bugs [LUCENE-8001] URL: https://github.com/apache/lucene/issues/9049

Re: [PR] Fix bug in UnescapedCharSequence and add basic unit tests [lucene]

2023-11-30 Thread via GitHub
gsmiller merged PR #12849: URL: https://github.com/apache/lucene/pull/12849

Re: [PR] Removing TermInSetQuery varargs ctor [lucene]

2023-11-30 Thread via GitHub
slow-J commented on PR #12837: URL: https://github.com/apache/lucene/pull/12837#issuecomment-1834349738 > @slow-J How would you like to handle the back port of this? We'll need to mark the public methods as deprecated on a 9.x release before we remove them on main. Normally, I'd just open a

Re: [PR] Fix bug in UnescapedCharSequence and add basic unit tests [lucene]

2023-11-30 Thread via GitHub
slow-J commented on PR #12849: URL: https://github.com/apache/lucene/pull/12849#issuecomment-1834353347 Thanks for the review @gsmiller and @shubhamvishu!

Re: [PR] Removing TermInSetQuery varargs ctor [lucene]

2023-11-30 Thread via GitHub
gsmiller commented on PR #12837: URL: https://github.com/apache/lucene/pull/12837#issuecomment-1834358456 I'm not actually sure the best way to handle this. I have another open change right now that would run into this same problem, so I'm interested in the solution too :) I'd sugges

[PR] Add Facets#getBulkSpecificValues method (#12180) [lucene]

2023-11-30 Thread via GitHub
epotyom opened a new pull request, #12862: URL: https://github.com/apache/lucene/pull/12862 Add `Facets#getBulkSpecificValues` method and implement it in every class that implements `getSpecificValue`. Also move FacetLabel class one level up from `org.apache.lucene.facet.taxonomy`

Re: [I] Add Facets#getSpecificValues (bulk) and bulk path -> ordinal lookup for taxonomy faceting [lucene]

2023-11-30 Thread via GitHub
epotyom commented on issue #12180: URL: https://github.com/apache/lucene/issues/12180#issuecomment-1834368168 @mikemccand, @ChrisHegarty, sorry for the confusion, but this issue is not fully done yet. This issue includes: 1. [done] Add `TaxonomyReader#getBulkOrdinals` method (#12769)

Re: [PR] Removing TermInSetQuery varargs ctor [lucene]

2023-11-30 Thread via GitHub
slow-J commented on PR #12837: URL: https://github.com/apache/lucene/pull/12837#issuecomment-1834384051 > I'd suggest we ask on the dev@ list, but let's see what happens with 9.9 first? If 9.9 RC2 does _not_ go forward, we could try to get the deprecation back port into the 9.9 RC3. Assumin

Re: [PR] Removing TermInSetQuery varargs ctor [lucene]

2023-11-30 Thread via GitHub
slow-J commented on PR #12837: URL: https://github.com/apache/lucene/pull/12837#issuecomment-1834391160 @gsmiller looks like we have a very relevant reply to your previous email! @jpountz replied: “My expectation is that we will do a 9.x minor at about the same time as 10.0 anyway, this

Re: [PR] Add static function in TaskExecutor to retrieve the results for a collection of Future [lucene]

2023-11-30 Thread via GitHub
javanna commented on PR #12798: URL: https://github.com/apache/lucene/pull/12798#issuecomment-1834417361 We do have some existing util classes but nothing that I recall around concurrency or futures handling.

Re: [PR] LUCENE-10002: Deprecate IndexSearch#search(Query, Collector) in favor of IndexSearcher#search(Query, CollectorManager) - TopFieldCollectorManager & TopScoreDocCollectorManager [lucene]

2023-11-30 Thread via GitHub
javanna commented on PR #240: URL: https://github.com/apache/lucene/pull/240#issuecomment-1834420237 heya @zacharymorn given that this is a deprecation, I guess you meant on backporting to branch_9x as well?

Re: [PR] Removing TermInSetQuery varargs ctor [lucene]

2023-11-30 Thread via GitHub
slow-J commented on PR #12837: URL: https://github.com/apache/lucene/pull/12837#issuecomment-1834496470 Since the ctors are used internally and I can't just mark them as deprecated with no changes: My plan for backport is: * Keep this PR against main as is. * Do another PR again

Re: [PR] Introduce growInRange to reduce array overallocation [lucene]

2023-11-30 Thread via GitHub
stefanvodita commented on PR #12844: URL: https://github.com/apache/lucene/pull/12844#issuecomment-1834578138 I thought some more about option 2. It does seem quite tricky. `OnHeapHnswGraph` only knows about `NeighborArrays` being created, but it doesn't know about nodes being added to the

Re: [I] Move Points from a visitor API to a cursor-style API? [LUCENE-9619] [lucene]

2023-11-30 Thread via GitHub
jpountz commented on issue #10659: URL: https://github.com/apache/lucene/issues/10659#issuecomment-1834587775 This has been implemented.

Re: [I] Move Points from a visitor API to a cursor-style API? [LUCENE-9619] [lucene]

2023-11-30 Thread via GitHub
jpountz closed issue #10659: Move Points from a visitor API to a cursor-style API? [LUCENE-9619] URL: https://github.com/apache/lucene/issues/10659

Re: [PR] Add support for similarity-based vector searches [lucene]

2023-11-30 Thread via GitHub
kaivalnp commented on PR #12679: URL: https://github.com/apache/lucene/pull/12679#issuecomment-1834591675 Thanks @benwtrent! I also simplified the queries: I realized that the API may be difficult to use in the current state (we are leaving two parameters - `traversalSimilarity` and `

Re: [PR] Add support for similarity-based vector searches [lucene]

2023-11-30 Thread via GitHub
kaivalnp commented on code in PR #12679: URL: https://github.com/apache/lucene/pull/12679#discussion_r1411286512 ## lucene/core/src/java/org/apache/lucene/search/AbstractRnnVectorQuery.java: ## Review Comment: Added now

Re: [PR] Add support for similarity-based vector searches [lucene]

2023-11-30 Thread via GitHub
kaivalnp commented on code in PR #12679: URL: https://github.com/apache/lucene/pull/12679#discussion_r1411285972 ## lucene/core/src/java/org/apache/lucene/search/AbstractVectorSimilarityQuery.java: ## @@ -0,0 +1,246 @@ +/* + * Licensed to the Apache Software Foundation (ASF) und

Re: [PR] LUCENE-10002: Deprecate IndexSearch#search(Query, Collector) in favor of IndexSearcher#search(Query, CollectorManager) - TopFieldCollectorManager & TopScoreDocCollectorManager [lucene]

2023-11-30 Thread via GitHub
zacharymorn commented on PR #240: URL: https://github.com/apache/lucene/pull/240#issuecomment-1834640695 Hi @javanna , I was actually thinking to have it for `10.0.0` (added an entry into that section in `CHANGES.txt`), as deprecating `IndexSearcher#search(Query, Collector)` has a rather la

Re: [PR] Introduce growInRange to reduce array overallocation [lucene]

2023-11-30 Thread via GitHub
zhaih commented on PR #12844: URL: https://github.com/apache/lucene/pull/12844#issuecomment-1834745287 Yeah I don't think we need to spend too much effort on option2, especially in this PR, because providing an upper bound on memory usage is good enough to me. Plus we'll never be able t

Re: [PR] Add ParentJoin KNN support [lucene]

2023-11-30 Thread via GitHub
david-sitsky commented on PR #12434: URL: https://github.com/apache/lucene/pull/12434#issuecomment-1835237536 @benwtrent - did this really make it into 9.8.0? I downloaded the 9.8.0 release and ToParentBlockJoinFloatKnnVectorQuery does not seem to be present. ``` lucene-9.8.0/modules$

[I] Jvm Crashes occassionaly with Lucene 8.10.0, JDK 11.0.15+10 [lucene]

2023-11-30 Thread via GitHub
sosohu opened a new issue, #12863: URL: https://github.com/apache/lucene/issues/12863 ### Description I use Lucene 8.10.0 in my project, and we hit a JVM crash with the jstack output below: ``` "main" #1 prio=5 tid=0x7f780c042000 nid=0x3503 runnable [0x00