Re: [PR] Remove CollectorOwner class (#13671) [lucene]

2024-09-04 Thread via GitHub
epotyom commented on code in PR #13702: URL: https://github.com/apache/lucene/pull/13702#discussion_r1743335144 ## lucene/facet/src/java/org/apache/lucene/facet/DrillSideways.java: ## @@ -414,62 +399,59 @@ public ConcurrentDrillSidewaysResult search( /** * Search using

Re: [I] TestBoolean2.testRandomQueries fails in CI due to eating up heap space [lucene]

2024-09-04 Thread via GitHub
javanna commented on issue #11754: URL: https://github.com/apache/lucene/issues/11754#issuecomment-2328396050 You are correct @rmuir ! In my case, `mulFactor` is `1024`. My branch makes it even more likely to go out of heap space, because it may create more slices to be searched conc

Re: [PR] Add support for intra-segment search concurrency [lucene]

2024-09-04 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1743463027 ## lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java: ## @@ -890,11 +945,70 @@ public static class LeafSlice { * * @lucene.experimental

Re: [PR] Add support for intra-segment search concurrency [lucene]

2024-09-04 Thread via GitHub
javanna commented on PR #13542: URL: https://github.com/apache/lucene/pull/13542#issuecomment-2328473802 This is ready for a final review. I have no further outstanding items to look into. I added some entries to the migrate file, changes entry and updated description above to be aligned wi

Re: [I] TestBoolean2.testRandomQueries fails in CI due to eating up heap space [lucene]

2024-09-04 Thread via GitHub
rmuir commented on issue #11754: URL: https://github.com/apache/lucene/issues/11754#issuecomment-2328792680 @javanna I don't understand the test, but we need to tone down its craziness. Removing 1000 sounds great to me, if the single line change makes the test a thousand times less annoying

Re: [I] TestBoolean2.testRandomQueries fails in CI due to eating up heap space [lucene]

2024-09-04 Thread via GitHub
javanna commented on issue #11754: URL: https://github.com/apache/lucene/issues/11754#issuecomment-2328812477 Agreed, I opened #13713 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [I] RegExp::toAutomaton no longer minimizes [lucene]

2024-09-04 Thread via GitHub
mikemccand commented on issue #13706: URL: https://github.com/apache/lucene/issues/13706#issuecomment-2328841283 > I think the relaxation patch is fine as a short first step - it doesn't claim to be optimal (PnP, as Mike loves to say). +1, heh. -- This is an automated message from

Re: [I] RegExp::toAutomaton no longer minimizes [lucene]

2024-09-04 Thread via GitHub
rmuir commented on issue #13706: URL: https://github.com/apache/lucene/issues/13706#issuecomment-2329167160 > Is `subsetOf` really quadratic? I know its javadoc says so, but I'm not convinced. I think this is only one of the scarier parts about it. The other scary part is that it may

Re: [PR] Reduce size of queue in TestBoolean2#testRandomQueries [lucene]

2024-09-04 Thread via GitHub
javanna merged PR #13713: URL: https://github.com/apache/lucene/pull/13713 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apa

Re: [I] TestBoolean2.testRandomQueries fails in CI due to eating up heap space [lucene]

2024-09-04 Thread via GitHub
javanna closed issue #11754: TestBoolean2.testRandomQueries fails in CI due to eating up heap space URL: https://github.com/apache/lucene/issues/11754 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] Relax Operations.isTotal() to work with a deterministic automaton [lucene]

2024-09-04 Thread via GitHub
mikemccand commented on code in PR #13707: URL: https://github.com/apache/lucene/pull/13707#discussion_r1743876694 ## lucene/core/src/java/org/apache/lucene/util/automaton/Operations.java: ## @@ -857,22 +857,38 @@ public static boolean isEmpty(Automaton a) { return true;

Re: [PR] Relax Operations.isTotal() to work with a deterministic automaton [lucene]

2024-09-04 Thread via GitHub
rmuir commented on code in PR #13707: URL: https://github.com/apache/lucene/pull/13707#discussion_r1743877792 ## lucene/core/src/java/org/apache/lucene/util/automaton/Operations.java: ## @@ -857,22 +857,38 @@ public static boolean isEmpty(Automaton a) { return true; }

Re: [PR] Relax Operations.isTotal() to work with a deterministic automaton [lucene]

2024-09-04 Thread via GitHub
mikemccand commented on code in PR #13707: URL: https://github.com/apache/lucene/pull/13707#discussion_r1743879304 ## lucene/core/src/java/org/apache/lucene/util/automaton/Operations.java: ## @@ -857,22 +857,38 @@ public static boolean isEmpty(Automaton a) { return true;

Re: [PR] Add support for intra-segment search concurrency [lucene]

2024-09-04 Thread via GitHub
stefanvodita commented on PR #13542: URL: https://github.com/apache/lucene/pull/13542#issuecomment-2329194920 Do we have any benchmark results? I'm curious to see what's gotten faster. -- This is an automated message from the Apache Git Service. To respond to the message, please log on

Re: [PR] Add support for intra-segment search concurrency [lucene]

2024-09-04 Thread via GitHub
javanna commented on PR #13542: URL: https://github.com/apache/lucene/pull/13542#issuecomment-2329216197 > Do we have any benchmark results? I'm curious to see what's gotten faster. No. I focused on functionality. I wonder if it makes sense to run benchmarks before we address the dupl

[PR] Make Operations#repeat create simpler automata. [lucene]

2024-09-04 Thread via GitHub
jpountz opened a new pull request, #13714: URL: https://github.com/apache/lucene/pull/13714 `Operations#repeat` currently creates an automaton that has one more final state, and 2x more transitions for each of the final states. With this change, the returned automaton has a single final sta

Re: [PR] Make Operations#repeat create simpler automata. [lucene]

2024-09-04 Thread via GitHub
jpountz commented on PR #13714: URL: https://github.com/apache/lucene/pull/13714#issuecomment-2329331978 I just pushed a commit that avoids creating dead states if the input automaton doesn't have dead states itself. So the automaton created by `.*` would now have a single state. For

Re: [PR] Add support for intra-segment search concurrency [lucene]

2024-09-04 Thread via GitHub
javanna commented on PR #13542: URL: https://github.com/apache/lucene/pull/13542#issuecomment-2329360505 > I think it's important at least to ensure that we "do no harm" to the existing use cases. Intra-segment is not enabled by default and requires additional work. The scope of this

Re: [PR] Make Operations#repeat create simpler automata. [lucene]

2024-09-04 Thread via GitHub
mikemccand commented on code in PR #13714: URL: https://github.com/apache/lucene/pull/13714#discussion_r1743958059 ## lucene/core/src/java/org/apache/lucene/util/automaton/Operations.java: ## @@ -182,26 +182,44 @@ public static Automaton repeat(Automaton a) { // Repeating

Re: [PR] Add support for intra-segment search concurrency [lucene]

2024-09-04 Thread via GitHub
javanna commented on PR #13542: URL: https://github.com/apache/lucene/pull/13542#issuecomment-2329387884 Perhaps one thing to add is that the main focus/direction of this change is to set the foundations to be able to finally decouple the index geometry from how the search is run and can be

Re: [PR] Make Operations#repeat create simpler automata. [lucene]

2024-09-04 Thread via GitHub
jpountz commented on PR #13714: URL: https://github.com/apache/lucene/pull/13714#issuecomment-2329450444 Good ideas, I'll look into it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

Re: [PR] Relax Operations.isTotal() to work with a deterministic automaton [lucene]

2024-09-04 Thread via GitHub
rmuir commented on PR #13707: URL: https://github.com/apache/lucene/pull/13707#issuecomment-2329472492 I updated the javadocs: "The automaton must be deterministic, or this method may return false." Previous implementation was: "The automaton must be minimal, or this method may return fa

Re: [PR] Add support for intra-segment search concurrency [lucene]

2024-09-04 Thread via GitHub
msokolov commented on PR #13542: URL: https://github.com/apache/lucene/pull/13542#issuecomment-2329642279 > I don't entirely follow how we may have painted ourselves in a corner and how running benchmarks relates to this. They're not related; perhaps I should have broken my comment i

[PR] Backport of #13702 [lucene]

2024-09-04 Thread via GitHub
gsmiller opened a new pull request, #13716: URL: https://github.com/apache/lucene/pull/13716 (no comment) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-

[I] `count()` in LRUQueryCache [lucene]

2024-09-04 Thread via GitHub
LuXugang opened a new issue, #13717: URL: https://github.com/apache/lucene/issues/13717 ### Description Currently, the implementation of `Weight.count()` in `LRUQueryCache` first considers whether the wrapped `Weight` can obtain the `count`, and then considers the cache. However,

Re: [PR] Remove CollectorOwner class (#13671) [lucene]

2024-09-04 Thread via GitHub
gsmiller commented on PR #13702: URL: https://github.com/apache/lucene/pull/13702#issuecomment-2329679349 No problem @javanna ! @epotyom - I got this back-ported onto 9.x but would appreciate a retro-active review if you have time (#13716). As before, I had to do some work to `DrillS

Re: [PR] Add dynamic range facets [lucene]

2024-09-04 Thread via GitHub
stefanvodita commented on PR #13689: URL: https://github.com/apache/lucene/pull/13689#issuecomment-2329719474 It would be great to merge this in time for 9.12. Since it's completely new code and marked experimental, I don't expect it to be controversial, but I'll wait a bit longer in case a

Re: [I] CompiledAutomaton not equal when NFARunAutomaton is non-null [lucene]

2024-09-04 Thread via GitHub
ChrisHegarty commented on issue #13715: URL: https://github.com/apache/lucene/issues/13715#issuecomment-2329846860 Before considering possible fixes, do we agree that there is a problem worth fixing? I’m particular thinking of the equality of IntervalQuery. -- This is an automated message

Re: [I] CompiledAutomaton not equal when NFARunAutomaton is non-null [lucene]

2024-09-04 Thread via GitHub
zhaih commented on issue #13715: URL: https://github.com/apache/lucene/issues/13715#issuecomment-2329914096 I think it's a nice to have one. Altho for the specific IntervalQuery I feel like maybe we can directly claim they're equal if the pattern is the same and not checking the automaton a

Re: [I] CompiledAutomaton not equal when NFARunAutomaton is non-null [lucene]

2024-09-04 Thread via GitHub
rmuir commented on issue #13715: URL: https://github.com/apache/lucene/issues/13715#issuecomment-2329919715 @ChrisHegarty I think it might be an oversight IntervalQuery is getting an NFA, was that really intended? The following other methods on intervals are powered by automatons and

Re: [I] CompiledAutomaton not equal when NFARunAutomaton is non-null [lucene]

2024-09-04 Thread via GitHub
rmuir commented on issue #13715: URL: https://github.com/apache/lucene/issues/13715#issuecomment-2329956146 There are other related problems to fix here separately, just start with CompiledAutomaton: * `hashCode()` is inconsistent with `equals()`: either they both consider `nfaRunAutoma

Re: [I] CompiledAutomaton not equal when NFARunAutomaton is non-null [lucene]

2024-09-04 Thread via GitHub
rmuir commented on issue #13715: URL: https://github.com/apache/lucene/issues/13715#issuecomment-2330179575 That's usually how it works, it is good stuff. To be practical, for now, I'd recommend just fixing Intervals.regex to `determinize()` so that it matches all other queries which are us