[GitHub] [lucene] jpountz commented on a diff in pull request #951: LUCENE-10606: Optimize Prefilter Hit Collection

2022-06-22 Thread GitBox
jpountz commented on code in PR #951: URL: https://github.com/apache/lucene/pull/951#discussion_r903360932 ## lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java: ## @@ -92,20 +91,40 @@ public KnnVectorQuery(String field, float[] target, int k, Query filter) {

[GitHub] [lucene] dweiss commented on a diff in pull request #970: LUCENE-10607: Fix potential integer overflow in maxArcs computions

2022-06-22 Thread GitBox
dweiss commented on code in PR #970: URL: https://github.com/apache/lucene/pull/970#discussion_r903362582 ## lucene/suggest/src/java/org/apache/lucene/search/suggest/document/NRTSuggesterBuilder.java: ## @@ -124,11 +124,15 @@ public boolean store(DataOutput output) throws IOExc

[GitHub] [lucene] shaie commented on pull request #841: LUCENE-10274: Add hyperrectangle faceting capabilities

2022-06-22 Thread GitBox
shaie commented on PR #841: URL: https://github.com/apache/lucene/pull/841#issuecomment-1162752539 I don't know if what I've done is OK, but the last commit that I pushed failed the distribution tests because the API of `Facets` has changed and introduced a new `abstract` method. So I rebas

[GitHub] [lucene] tang-hi commented on a diff in pull request #970: LUCENE-10607: Fix potential integer overflow in maxArcs computions

2022-06-22 Thread GitBox
tang-hi commented on code in PR #970: URL: https://github.com/apache/lucene/pull/970#discussion_r903515469 ## lucene/suggest/src/java/org/apache/lucene/search/suggest/document/NRTSuggesterBuilder.java: ## @@ -124,11 +124,15 @@ public boolean store(DataOutput output) throws IOEx

[GitHub] [lucene] dweiss commented on a diff in pull request #970: LUCENE-10607: Fix potential integer overflow in maxArcs computions

2022-06-22 Thread GitBox
dweiss commented on code in PR #970: URL: https://github.com/apache/lucene/pull/970#discussion_r903736402 ## lucene/suggest/src/java/org/apache/lucene/search/suggest/document/NRTSuggesterBuilder.java: ## @@ -124,11 +124,15 @@ public boolean store(DataOutput output) throws IOExc

[jira] [Commented] (LUCENE-10614) Properly support getTopChildren in RangeFacetCounts

2022-06-22 Thread Greg Miller (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17557453#comment-17557453 ] Greg Miller commented on LUCENE-10614: -- Great, thanks [~yutinggan] ! > Properly s

[jira] [Resolved] (LUCENE-10550) Add getAllChildren functionality to facets

2022-06-22 Thread Greg Miller (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Miller resolved LUCENE-10550. -- Fix Version/s: 9.3 Resolution: Fixed Thanks again [~yutinggan] ! > Add getAllChildren

[GitHub] [lucene] gsmiller commented on pull request #964: LUCENE-10620: Pass the Weight to Collectors.

2022-06-22 Thread GitBox
gsmiller commented on PR #964: URL: https://github.com/apache/lucene/pull/964#issuecomment-1163113856 > I've been wondering if it was worth adding a new API only for TotalHitCountCollector but looking at how facets use this collector, I suspect that many users set up their collectors manual

[GitHub] [lucene] gsmiller commented on pull request #841: LUCENE-10274: Add hyperrectangle faceting capabilities

2022-06-22 Thread GitBox
gsmiller commented on PR #841: URL: https://github.com/apache/lucene/pull/841#issuecomment-1163142014 From my point of view, I think we're ready to ship this thing! Thanks @mdmarshmallow and @shaie! -- This is an automated message from the Apache Git Service. To respond to the message, pl

[GitHub] [lucene] jpountz commented on pull request #964: LUCENE-10620: Pass the Weight to Collectors.

2022-06-22 Thread GitBox
jpountz commented on PR #964: URL: https://github.com/apache/lucene/pull/964#issuecomment-1163172835 I guess I was implicitly raising the question of whether it's worth optimizing `TotalHitCountCollector` to leverage `Weight#count` given that `IndexSearcher#count` is already optimized to le

[GitHub] [lucene] tang-hi commented on a diff in pull request #970: LUCENE-10607: Fix potential integer overflow in maxArcs computions

2022-06-22 Thread GitBox
tang-hi commented on code in PR #970: URL: https://github.com/apache/lucene/pull/970#discussion_r903912271 ## lucene/suggest/src/java/org/apache/lucene/search/suggest/document/NRTSuggesterBuilder.java: ## @@ -124,11 +124,15 @@ public boolean store(DataOutput output) throws IOEx

[GitHub] [lucene] alessandrobenedetti commented on pull request #926: VectorSimilarityFunction reverse removal

2022-06-22 Thread GitBox
alessandrobenedetti commented on PR #926: URL: https://github.com/apache/lucene/pull/926#issuecomment-1163269198 Hi @msokolov @mayya-sharipova and @jtibshirani , I have finally finished my performance tests. Initially the results were worse in this branch, I found that suspicious as I e

[jira] [Created] (LUCENE-10625) addBackcompatIndexes.py on 9x can't handle 8x indices

2022-06-22 Thread Mike Drob (Jira)
Mike Drob created LUCENE-10625: -- Summary: addBackcompatIndexes.py on 9x can't handle 8x indices Key: LUCENE-10625 URL: https://issues.apache.org/jira/browse/LUCENE-10625 Project: Lucene - Core I

[GitHub] [lucene] gsmiller commented on pull request #964: LUCENE-10620: Pass the Weight to Collectors.

2022-06-22 Thread GitBox
gsmiller commented on PR #964: URL: https://github.com/apache/lucene/pull/964#issuecomment-1163388135 Got it, thanks @jpountz. I haven't looked at this proposed PR in much detail, so apologies in advance if this suggestion is nonsense, but if we want to optimize hit counting for users that

[GitHub] [lucene] gautamworah96 commented on a diff in pull request #922: Index only the docs for FacetField posting list

2022-06-22 Thread GitBox
gautamworah96 commented on code in PR #922: URL: https://github.com/apache/lucene/pull/922#discussion_r904112058 ## lucene/CHANGES.txt: ## @@ -67,6 +67,8 @@ Other * LUCENE-10493: Factor out Viterbi algorithm in Kuromoji and Nori to analysis-common. (Tomoko Uchida) +* GITHU

[GitHub] [lucene] gsmiller merged pull request #922: Index only the docs for FacetField posting list

2022-06-22 Thread GitBox
gsmiller merged PR #922: URL: https://github.com/apache/lucene/pull/922

[GitHub] [lucene] gsmiller commented on pull request #922: Index only the docs for FacetField posting list

2022-06-22 Thread GitBox
gsmiller commented on PR #922: URL: https://github.com/apache/lucene/pull/922#issuecomment-1163504036 @gautamworah96 merged; thanks again! Would you mind opening a separate PR to backport?

[GitHub] [lucene] wuwm commented on pull request #968: [LUCENE-10624] Binary Search for Sparse IndexedDISI advanceWithinBloc…

2022-06-22 Thread GitBox
wuwm commented on PR #968: URL: https://github.com/apache/lucene/pull/968#issuecomment-1163529556 Thanks @zacharymorn for the comments! The binary search implementations in the two methods differ slightly in order to handle some edge cases. To make binary search into a single common method

[GitHub] [lucene] dweiss merged pull request #970: LUCENE-10607: Fix potential integer overflow in maxArcs computions

2022-06-22 Thread GitBox
dweiss merged PR #970: URL: https://github.com/apache/lucene/pull/970

[GitHub] [lucene] dweiss commented on pull request #970: LUCENE-10607: Fix potential integer overflow in maxArcs computions

2022-06-22 Thread GitBox
dweiss commented on PR #970: URL: https://github.com/apache/lucene/pull/970#issuecomment-1163564976 Thanks!

[jira] [Commented] (LUCENE-10607) NRTSuggesterBuilder overflows when extending input

2022-06-22 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17557680#comment-17557680 ] ASF subversion and git services commented on LUCENE-10607: -- Co

[jira] [Commented] (LUCENE-10607) NRTSuggesterBuilder overflows when extending input

2022-06-22 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17557681#comment-17557681 ] ASF subversion and git services commented on LUCENE-10607: -- Co

[jira] [Resolved] (LUCENE-10607) NRTSuggesterBuilder overflows when extending input

2022-06-22 Thread Dawid Weiss (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved LUCENE-10607. -- Fix Version/s: 9.3 Resolution: Fixed > NRTSuggesterBuilder overflows when extending input > ---

[GitHub] [lucene] jpountz commented on pull request #964: LUCENE-10620: Pass the Weight to Collectors.

2022-06-22 Thread GitBox
jpountz commented on PR #964: URL: https://github.com/apache/lucene/pull/964#issuecomment-1163587114 Thanks for looking @gsmiller ! This helps leverage `TotalHitCountCollector` internally from `IndexSearcher#count`, but if users wish to use `IndexSearcher#search(Query, Collector)`, how can

[GitHub] [lucene] jpountz commented on pull request #972: LUCENE-10480: Use BMM scorer for 2 clauses disjunction

2022-06-22 Thread GitBox
jpountz commented on PR #972: URL: https://github.com/apache/lucene/pull/972#issuecomment-1163599006 My best guess would be that you are seeing different results mostly because luceneutil picks random queries, and the run that only had disjunctions picked queries that happened to like your

[GitHub] [lucene] gautamworah96 opened a new pull request, #973: Backport: Index only the docs for FacetField posting list (#922)

2022-06-22 Thread GitBox
gautamworah96 opened a new pull request, #973: URL: https://github.com/apache/lucene/pull/973 cc: @gsmiller Previous PR: #922 I've retained the same PR number in the CHANGES entry as that PR contains the original discussions around the issue.

[GitHub] [lucene] mdmarshmallow commented on pull request #841: LUCENE-10274: Add hyperrectangle faceting capabilities

2022-06-22 Thread GitBox
mdmarshmallow commented on PR #841: URL: https://github.com/apache/lucene/pull/841#issuecomment-1163620698 I think the rebase was somehow messed up, I cleaned up the history and force pushed. Everything should be included in this push.

[GitHub] [lucene] madrob commented on pull request #973: Backport: Index only the docs for FacetField posting list (#922)

2022-06-22 Thread GitBox
madrob commented on PR #973: URL: https://github.com/apache/lucene/pull/973#issuecomment-1163628381 Hmm... I had some issues with forward porting the backward compatibility indices because the release wizards didn't quite do it for me. Apologies for the spill over, let me figure this out.

[GitHub] [lucene] Yuti-G opened a new pull request, #974: LUCENE-10614: Properly support getTopChildren in RangeFacetCounts

2022-06-22 Thread GitBox
Yuti-G opened a new pull request, #974: URL: https://github.com/apache/lucene/pull/974 ### Description https://issues.apache.org/jira/browse/LUCENE-10614

[GitHub] [lucene] gautamworah96 commented on pull request #973: Backport: Index only the docs for FacetField posting list (#922)

2022-06-22 Thread GitBox
gautamworah96 commented on PR #973: URL: https://github.com/apache/lucene/pull/973#issuecomment-1163618390 Gradlew tests are failing on the `TestBackwardsCompatibility.testSortedIndex` test. I binary searched through recent commits. Tests after the 84133e7aecbf7ece459ebd3c3ce4e3266f30d558 c

[GitHub] [lucene] madrob commented on pull request #973: Backport: Index only the docs for FacetField posting list (#922)

2022-06-22 Thread GitBox
madrob commented on PR #973: URL: https://github.com/apache/lucene/pull/973#issuecomment-1163718444 I reverted that commit for now, you might need to rebase your changes or merge the latest branch_9x to get the tests back to passing. Will continue to investigate, but you should be unblocked

[GitHub] [lucene] gautamworah96 commented on pull request #973: Backport: Index only the docs for FacetField posting list (#922)

2022-06-22 Thread GitBox
gautamworah96 commented on PR #973: URL: https://github.com/apache/lucene/pull/973#issuecomment-1163744134 Yes, Thanks for the quick turnaround.

[GitHub] [lucene] gsmiller merged pull request #973: Backport: Remove unused and confusing FacetField indexing options (#922)

2022-06-22 Thread GitBox
gsmiller merged PR #973: URL: https://github.com/apache/lucene/pull/973

[GitHub] [lucene] gsmiller commented on pull request #973: Backport: Remove unused and confusing FacetField indexing options (#922)

2022-06-22 Thread GitBox
gsmiller commented on PR #973: URL: https://github.com/apache/lucene/pull/973#issuecomment-116356 Thanks @gautamworah96 !

[GitHub] [lucene] jtibshirani commented on a diff in pull request #951: LUCENE-10606: Optimize Prefilter Hit Collection

2022-06-22 Thread GitBox
jtibshirani commented on code in PR #951: URL: https://github.com/apache/lucene/pull/951#discussion_r904419498 ## lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java: ## @@ -92,20 +91,40 @@ public KnnVectorQuery(String field, float[] target, int k, Query filter) {

[GitHub] [lucene] jtibshirani commented on pull request #951: LUCENE-10606: Optimize Prefilter Hit Collection

2022-06-22 Thread GitBox
jtibshirani commented on PR #951: URL: https://github.com/apache/lucene/pull/951#issuecomment-1163800086 I looked into this more deeply and realized that there are a bunch of times we decide not to cache a query into a `BitSet`. For example `UsageTrackingQueryCachingPolicy#shouldNeverCache`

[GitHub] [lucene] zacharymorn commented on pull request #972: LUCENE-10480: Use BMM scorer for 2 clauses disjunction

2022-06-22 Thread GitBox
zacharymorn commented on PR #972: URL: https://github.com/apache/lucene/pull/972#issuecomment-1163861983 Thanks @jpountz for looking into this! I did further experiments on this and the result seems to suggest it may be caused by bug / caching in the util or lucene itself. What I did

[GitHub] [lucene] shaie commented on pull request #841: LUCENE-10274: Add hyperrectangle faceting capabilities

2022-06-22 Thread GitBox
shaie commented on PR #841: URL: https://github.com/apache/lucene/pull/841#issuecomment-1163904822 Thanks @mdmarshmallow. You added the CHANGES entry under `Lucene 9.30` so I'm just verifying -- we're going to merge it both to `main` and `9.x` branches?