Re: [PR] Remove CollectorOwner class (#13671) [lucene]

2024-09-03 Thread via GitHub
gsmiller commented on PR #13702: URL: https://github.com/apache/lucene/pull/13702#issuecomment-2327689510 Got this merged on main but will work on the back port in the morning (need to work through some conflicts and running out of time today). -- This is an automated message from the Apa

Re: [PR] Remove CollectorOwner class (#13671) [lucene]

2024-09-03 Thread via GitHub
gsmiller merged PR #13702: URL: https://github.com/apache/lucene/pull/13702 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.ap

Re: [PR] Add Facets#getBulkSpecificValues method [lucene]

2024-09-03 Thread via GitHub
github-actions[bot] commented on PR #12862: URL: https://github.com/apache/lucene/pull/12862#issuecomment-2327675527 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] [WIP] Multi-Vector support for HNSW search [lucene]

2024-09-03 Thread via GitHub
github-actions[bot] commented on PR #13525: URL: https://github.com/apache/lucene/pull/13525#issuecomment-2327674839 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] Remove CollectorOwner class (#13671) [lucene]

2024-09-03 Thread via GitHub
gsmiller commented on code in PR #13702: URL: https://github.com/apache/lucene/pull/13702#discussion_r1742821648 ## lucene/facet/src/java/org/apache/lucene/facet/DrillSideways.java: ## @@ -414,62 +399,59 @@ public ConcurrentDrillSidewaysResult search( /** * Search usin

Re: [PR] Remove CollectorOwner class (#13671) [lucene]

2024-09-03 Thread via GitHub
gsmiller commented on code in PR #13702: URL: https://github.com/apache/lucene/pull/13702#discussion_r1742791755 ## lucene/facet/src/java/org/apache/lucene/facet/DrillSideways.java: ## @@ -480,59 +462,61 @@ private void searchSequentially( } Query[] drillDownQueries =

Re: [PR] Move anonymous Weight implementation in PointRangeQuery to named class [lucene]

2024-09-03 Thread via GitHub
jpountz commented on PR #13711: URL: https://github.com/apache/lucene/pull/13711#issuecomment-2327425891 Sorry, I don't think we should make Lucene's `Weight` implementations public. I looked up the OpenSearch issue, if I understand correctly, the problem you're trying to solve is tha

Re: [I] TestBoolean2.testRandomQueries fails in CI due to eating up heap space [lucene]

2024-09-03 Thread via GitHub
rmuir commented on issue #11754: URL: https://github.com/apache/lucene/issues/11754#issuecomment-2327418150 @javanna if you can reproduce it again, can you try `System.out.println(mulFactor)` at the end of the BeforeClass method? OOM in the stacktrace comes from creating priorityqueue

Re: [PR] Add a Better Binary Quantizer (RaBitQ) format for dense vectors [lucene]

2024-09-03 Thread via GitHub
john-wagster commented on code in PR #13651: URL: https://github.com/apache/lucene/pull/13651#discussion_r1742666476 ## lucene/core/src/java/org/apache/lucene/codecs/lucene912/BinarizedByteVectorValues.java: ## @@ -0,0 +1,78 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Add Bulk Scorer For ToParentBlockJoinQuery [lucene]

2024-09-03 Thread via GitHub
jpountz commented on code in PR #13697: URL: https://github.com/apache/lucene/pull/13697#discussion_r1742643399 ## lucene/join/src/java/org/apache/lucene/search/join/ToParentBlockJoinQuery.java: ## @@ -440,6 +477,99 @@ private String formatScoreExplanation(int matches, int star

Re: [PR] move Operations.sameLanguage/subsetOf to AutomatonTestUtil in test-framework [lucene]

2024-09-03 Thread via GitHub
dweiss commented on code in PR #13708: URL: https://github.com/apache/lucene/pull/13708#discussion_r1742548065 ## lucene/core/src/java/org/apache/lucene/util/automaton/StatePair.java: ## @@ -35,9 +35,14 @@ * @lucene.experimental */ public class StatePair { + // only mike k

Re: [PR] Add a Better Binary Quantizer (RaBitQ) format for dense vectors [lucene]

2024-09-03 Thread via GitHub
benwtrent commented on code in PR #13651: URL: https://github.com/apache/lucene/pull/13651#discussion_r1742505038 ## lucene/core/src/java/org/apache/lucene/codecs/lucene912/BinarizedByteVectorValues.java: ## @@ -0,0 +1,78 @@ +/* + * Licensed to the Apache Software Foundation (AS

Re: [PR] Move anonymous Weight implementation in PointRangeQuery to named class [lucene]

2024-09-03 Thread via GitHub
jainankitk commented on PR #13711: URL: https://github.com/apache/lucene/pull/13711#issuecomment-2327098073 > I see from the linked issue that you would like to extend `PointRangeQuery`, but in general we don't like to think of our queries as being extensible. I wonder if you could do what

Re: [PR] Speed up advancing within a block. [lucene]

2024-09-03 Thread via GitHub
gsmiller commented on PR #13692: URL: https://github.com/apache/lucene/pull/13692#issuecomment-2326975500 Interesting. I'm not quite clear on what the difference is in the analysis you posted [here](https://github.com/apache/lucene/pull/13692#issuecomment-2324897665) vs. what got merged?

Re: [PR] Move anonymous Weight implementation in PointRangeQuery to named class [lucene]

2024-09-03 Thread via GitHub
jpountz commented on PR #13711: URL: https://github.com/apache/lucene/pull/13711#issuecomment-2326913351 I see from the linked issue that you would like to extend `PointRangeQuery`, but in general we don't like to think of our queries as being extensible. I wonder if you could do what you n

[PR] Move anonymous Weight implementation in PointRangeQuery to named class [lucene]

2024-09-03 Thread via GitHub
jainankitk opened a new pull request, #13711: URL: https://github.com/apache/lucene/pull/13711 ### Description Moves the anonymous `Weight` implementation in `PointRangeQuery#createWeight` to a named class for better extensibility and reusability.

[I] Move anonymous Weight implementation in PointRangeQuery to named class [lucene]

2024-09-03 Thread via GitHub
jainankitk opened a new issue, #13710: URL: https://github.com/apache/lucene/issues/13710 ### Description The `Weight` implementation in `PointRangeQuery#createWeight` is an anonymous class, making it difficult to extend or reuse specific logic from the class. Not to mention the class be

Re: [I] Operations#sameLanguage has true negatives? [lucene]

2024-09-03 Thread via GitHub
jpountz closed issue #13709: Operations#sameLanguage has true negatives? URL: https://github.com/apache/lucene/issues/13709

Re: [I] Operations#sameLanguage has true negatives? [lucene]

2024-09-03 Thread via GitHub
jpountz commented on issue #13709: URL: https://github.com/apache/lucene/issues/13709#issuecomment-2326652907 Lol, this is embarrassing. Sorry for the noise.

Re: [I] Operations#sameLanguage has true negatives? [lucene]

2024-09-03 Thread via GitHub
rmuir commented on issue #13709: URL: https://github.com/apache/lucene/issues/13709#issuecomment-2326656149 no worries, somewhat related to this, you may indeed find bugs if you have assertions disabled due to the smelliness of sameLanguage, please see https://github.com/apache/lucene/pull/

Re: [I] Operations#sameLanguage has true negatives? [lucene]

2024-09-03 Thread via GitHub
rmuir commented on issue #13709: URL: https://github.com/apache/lucene/issues/13709#issuecomment-2326646660 Doesn't look like they accept the same language to me. For example, automaton "b" accepts the string `aa`, but automaton "a" does not.

[I] Operations#sameLanguage has true negatives? [lucene]

2024-09-03 Thread via GitHub
jpountz opened a new issue, #13709: URL: https://github.com/apache/lucene/issues/13709 ### Description The following test fails, even though both automata accept the same language and are determinized. Looking at the implementation, it looks like it may assume that both automata are

Re: [PR] Add support for intra-segment search concurrency [lucene]

2024-09-03 Thread via GitHub
javanna commented on PR #13542: URL: https://github.com/apache/lucene/pull/13542#issuecomment-2326200273 I have lowered the number of partitions we may end up creating in tests per segment to `5`, as we could end up with thousands of those. That improves GC issues with `TestBoolean2`, but o

Re: [PR] Add support for intra-segment search concurrency [lucene]

2024-09-03 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1741835108 ## lucene/core/src/java/org/apache/lucene/search/TotalHitCountCollectorManager.java: ## @@ -28,17 +31,77 @@ */ public class TotalHitCountCollectorManager imple

Re: [I] TestBoolean2.testRandomQueries fails in CI due to eating up heap space [lucene]

2024-09-03 Thread via GitHub
javanna commented on issue #11754: URL: https://github.com/apache/lucene/issues/11754#issuecomment-2326111901 Still relevant, I think the factor is the number of segments perhaps and how they get searched concurrently (e.g. number of slices), because each slice gets its own collector with i

Re: [I] RegExp::toAutomaton no longer minimizes [lucene]

2024-09-03 Thread via GitHub
ChrisHegarty commented on issue #13706: URL: https://github.com/apache/lucene/issues/13706#issuecomment-2325951431 💙