[GitHub] [lucene] jpountz commented on pull request #972: LUCENE-10480: Use BMM scorer for 2 clauses disjunction

2022-06-23 Thread GitBox
jpountz commented on PR #972: URL: https://github.com/apache/lucene/pull/972#issuecomment-1164220705 The fact that queries perform slower in general in your first benchmark run makes me wonder if this could be due to insufficient warmup time. The default task repeat count of 20 might be too

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-23 Thread Michael McCandless (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17558029#comment-17558029 ] Michael McCandless commented on LUCENE-10557: - Finally catching up over her

[GitHub] [lucene] gsmiller commented on pull request #964: LUCENE-10620: Pass the Weight to Collectors.

2022-06-23 Thread GitBox
gsmiller commented on PR #964: URL: https://github.com/apache/lucene/pull/964#issuecomment-1164336774 @jpountz ah right. No, I don’t think it makes sense for users to have to deal with creating weights on their own (and having to consider query rewriting as well before doing so). Your appro

[jira] [Commented] (LUCENE-10593) VectorSimilarityFunction reverse removal

2022-06-23 Thread Alessandro Benedetti (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17558093#comment-17558093 ] Alessandro Benedetti commented on LUCENE-10593: --- Hi @msokolov @mayya-shar

[jira] [Comment Edited] (LUCENE-10593) VectorSimilarityFunction reverse removal

2022-06-23 Thread Alessandro Benedetti (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17558093#comment-17558093 ] Alessandro Benedetti edited comment on LUCENE-10593 at 6/23/22 1:34 PM: -

[GitHub] [lucene] msokolov commented on pull request #926: VectorSimilarityFunction reverse removal

2022-06-23 Thread GitBox
msokolov commented on PR #926: URL: https://github.com/apache/lucene/pull/926#issuecomment-1164418508 Hi Alessandro, thank you for running the tests. I'm suspicious of the results though -- they just look too good to be true! I know from profiling that we spend most of the time in similarit

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-23 Thread Tomoko Uchida (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17558097#comment-17558097 ] Tomoko Uchida commented on LUCENE-10557: I'm still not fully sure if we can/sho

[GitHub] [lucene] msokolov commented on a diff in pull request #926: VectorSimilarityFunction reverse removal

2022-06-23 Thread GitBox
msokolov commented on code in PR #926: URL: https://github.com/apache/lucene/pull/926#discussion_r905035144 ## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphBuilder.java: ## @@ -246,7 +246,7 @@ private boolean diversityCheck( for (int i = 0; i < neighbors.size()

[GitHub] [lucene] alessandrobenedetti commented on a diff in pull request #926: VectorSimilarityFunction reverse removal

2022-06-23 Thread GitBox
alessandrobenedetti commented on code in PR #926: URL: https://github.com/apache/lucene/pull/926#discussion_r905042674 ## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphBuilder.java: ## @@ -246,7 +246,7 @@ private boolean diversityCheck( for (int i = 0; i < neigh

[GitHub] [lucene] jpountz commented on pull request #972: LUCENE-10480: Use BMM scorer for 2 clauses disjunction

2022-06-23 Thread GitBox
jpountz commented on PR #972: URL: https://github.com/apache/lucene/pull/972#issuecomment-1164436120 @zacharymorn FYI I played with a slightly different approach that implements BMM as a bulk scorer instead of a scorer, which I was hoping would help with making bookkeeping more lightweight:

[jira] [Commented] (LUCENE-10396) Automatically create sparse indexes for sort fields

2022-06-23 Thread Ignacio Vera (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17558103#comment-17558103 ] Ignacio Vera commented on LUCENE-10396: --- I have been thinking on the ability if v

[GitHub] [lucene] kaivalnp commented on a diff in pull request #951: LUCENE-10606: Optimize Prefilter Hit Collection

2022-06-23 Thread GitBox
kaivalnp commented on code in PR #951: URL: https://github.com/apache/lucene/pull/951#discussion_r905065952 ## lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java: ## @@ -92,20 +91,40 @@ public KnnVectorQuery(String field, float[] target, int k, Query filter) {

[GitHub] [lucene] kaivalnp commented on pull request #951: LUCENE-10606: Optimize Prefilter Hit Collection

2022-06-23 Thread GitBox
kaivalnp commented on PR #951: URL: https://github.com/apache/lucene/pull/951#issuecomment-1164455602 Thank you! I have added this approach to the latest commit, and a suggestion to incorporate deletes above -- This is an automated message from the Apache Git Service. To respond to the me

[GitHub] [lucene] alessandrobenedetti commented on a diff in pull request #926: VectorSimilarityFunction reverse removal

2022-06-23 Thread GitBox
alessandrobenedetti commented on code in PR #926: URL: https://github.com/apache/lucene/pull/926#discussion_r905082182 ## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphBuilder.java: ## @@ -246,7 +246,7 @@ private boolean diversityCheck( for (int i = 0; i < neigh

[jira] [Comment Edited] (LUCENE-10396) Automatically create sparse indexes for sort fields

2022-06-23 Thread Ignacio Vera (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17558103#comment-17558103 ] Ignacio Vera edited comment on LUCENE-10396 at 6/23/22 3:01 PM: -

[jira] [Commented] (LUCENE-9580) Tessellator failure for a certain polygon

2022-06-23 Thread Hugo Mercier (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17558127#comment-17558127 ] Hugo Mercier commented on LUCENE-9580: -- I've encountered the same issue on Elastics

[GitHub] [lucene] alessandrobenedetti commented on pull request #926: VectorSimilarityFunction reverse removal

2022-06-23 Thread GitBox
alessandrobenedetti commented on PR #926: URL: https://github.com/apache/lucene/pull/926#issuecomment-1164571753 @msokolov your input has been invaluable! I ran the tests on the same machine, with the preprocessed files and now the results are different. The main and this branch presen

[jira] (LUCENE-10593) VectorSimilarityFunction reverse removal

2022-06-23 Thread Alessandro Benedetti (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10593 ] Alessandro Benedetti deleted comment on LUCENE-10593: --- was (Author: alessandro.benedetti): Hi @msokolov @mayya-sharipova and @jtibshirani , I have finally finished my performance

[jira] [Commented] (LUCENE-10593) VectorSimilarityFunction reverse removal

2022-06-23 Thread Alessandro Benedetti (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17558141#comment-17558141 ] Alessandro Benedetti commented on LUCENE-10593: --- Recent performance tests

[GitHub] [lucene] jpountz commented on pull request #964: LUCENE-10620: Pass the Weight to Collectors.

2022-06-23 Thread GitBox
jpountz commented on PR #964: URL: https://github.com/apache/lucene/pull/964#issuecomment-1164586245 Thanks for taking the time to think about it @gsmiller, appreciated!

[GitHub] [lucene] jpountz merged pull request #964: LUCENE-10620: Pass the Weight to Collectors.

2022-06-23 Thread GitBox
jpountz merged PR #964: URL: https://github.com/apache/lucene/pull/964

[jira] [Commented] (LUCENE-10620) Can we pass the Weight to Collector?

2022-06-23 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17558151#comment-17558151 ] ASF subversion and git services commented on LUCENE-10620: -- Co

[jira] [Resolved] (LUCENE-10620) Can we pass the Weight to Collector?

2022-06-23 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-10620. --- Fix Version/s: 9.3 Resolution: Fixed > Can we pass the Weight to Collector? > --

[GitHub] [lucene] jtibshirani commented on a diff in pull request #951: LUCENE-10606: Optimize Prefilter Hit Collection

2022-06-23 Thread GitBox
jtibshirani commented on code in PR #951: URL: https://github.com/apache/lucene/pull/951#discussion_r905236130 ## lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java: ## @@ -92,20 +91,40 @@ public KnnVectorQuery(String field, float[] target, int k, Query filter) {

[GitHub] [lucene] jtibshirani commented on pull request #951: LUCENE-10606: Optimize Prefilter Hit Collection

2022-06-23 Thread GitBox
jtibshirani commented on PR #951: URL: https://github.com/apache/lucene/pull/951#issuecomment-1164628757 The latest approach looks good to me. Are you still seeing a significant latency improvement in some cases?

[GitHub] [lucene] mdmarshmallow commented on pull request #841: LUCENE-10274: Add hyperrectangle faceting capabilities

2022-06-23 Thread GitBox
mdmarshmallow commented on PR #841: URL: https://github.com/apache/lucene/pull/841#issuecomment-1164640793 Yeah, I think this change should be completely compatible with 9.30. Most of our changes are isolated to the new `facetset` package and all other changes are just adding some functions

[jira] [Created] (LUCENE-10626) Hunspell: add tools to aid dictionary editing: analysis introspection, stem expansion and stem/flag suggestion

2022-06-23 Thread Peter Gromov (Jira)
Peter Gromov created LUCENE-10626: - Summary: Hunspell: add tools to aid dictionary editing: analysis introspection, stem expansion and stem/flag suggestion Key: LUCENE-10626 URL: https://issues.apache.org/jira/bro

[GitHub] [lucene] donnerpeter opened a new pull request, #975: LUCENE-10626 Hunspell: add tools to aid dictionary editing

2022-06-23 Thread GitBox
donnerpeter opened a new pull request, #975: URL: https://github.com/apache/lucene/pull/975 https://issues.apache.org/jira/browse/LUCENE-10626

[GitHub] [lucene] donnerpeter commented on pull request #975: LUCENE-10626 Hunspell: add tools to aid dictionary editing

2022-06-23 Thread GitBox
donnerpeter commented on PR #975: URL: https://github.com/apache/lucene/pull/975#issuecomment-1164800989 Reviewing commits separately might be easier

[GitHub] [lucene] kaivalnp commented on pull request #951: LUCENE-10606: Optimize Prefilter Hit Collection

2022-06-23 Thread GitBox
kaivalnp commented on PR #951: URL: https://github.com/apache/lucene/pull/951#issuecomment-1164860104 Yes, I saw similar improvement for `BitSet` backed queries as the numbers [here](https://github.com/apache/lucene/pull/932)

[GitHub] [lucene] shahrs87 commented on a diff in pull request #907: LUCENE-10357 Ghost fields and postings/points

2022-06-23 Thread GitBox
shahrs87 commented on code in PR #907: URL: https://github.com/apache/lucene/pull/907#discussion_r905458119 ## lucene/core/src/java/org/apache/lucene/index/CheckIndex.java: ## @@ -1378,7 +1378,7 @@ private static Status.TermIndexStatus checkFields( computedFieldCount++;

[GitHub] [lucene] shahrs87 commented on a diff in pull request #907: LUCENE-10357 Ghost fields and postings/points

2022-06-23 Thread GitBox
shahrs87 commented on code in PR #907: URL: https://github.com/apache/lucene/pull/907#discussion_r905561688 ## lucene/core/src/java/org/apache/lucene/index/FrozenBufferedUpdates.java: ## @@ -595,7 +595,7 @@ private void setField(String field) throws IOException { DocIdSet

[jira] [Commented] (LUCENE-10624) Binary Search for Sparse IndexedDISI advanceWithinBlock & advanceExactWithinBlock

2022-06-23 Thread Weiming Wu (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17558247#comment-17558247 ] Weiming Wu commented on LUCENE-10624: - Hi Adrien. Thanks for your comments! For

[jira] [Comment Edited] (LUCENE-10624) Binary Search for Sparse IndexedDISI advanceWithinBlock & advanceExactWithinBlock

2022-06-23 Thread Weiming Wu (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17558247#comment-17558247 ] Weiming Wu edited comment on LUCENE-10624 at 6/23/22 10:46 PM: --

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-23 Thread Michael McCandless (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17558263#comment-17558263 ] Michael McCandless commented on LUCENE-10557: - {quote}I'm still not fully s

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-23 Thread Tomoko Uchida (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17558318#comment-17558318 ] Tomoko Uchida commented on LUCENE-10557: Seems converting Jira "table" markup t

[jira] [Updated] (LUCENE-10624) Binary Search for Sparse IndexedDISI advanceWithinBlock & advanceExactWithinBlock

2022-06-23 Thread Weiming Wu (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiming Wu updated LUCENE-10624: Attachment: candiate-exponential-searchsparse-sorted.0.log > Binary Search for Sparse IndexedDISI

[GitHub] [lucene] zacharymorn commented on pull request #968: [LUCENE-10624] Binary Search for Sparse IndexedDISI advanceWithinBloc…

2022-06-23 Thread GitBox
zacharymorn commented on PR #968: URL: https://github.com/apache/lucene/pull/968#issuecomment-1165214328 Hmm I see. I'm actually also wondering if it will be possible to have one of them simply delegate to the other one (potentially indirectly via some helper method), and then check the ret

[GitHub] [lucene] zacharymorn commented on pull request #972: LUCENE-10480: Use BMM scorer for 2 clauses disjunction

2022-06-23 Thread GitBox
zacharymorn commented on PR #972: URL: https://github.com/apache/lucene/pull/972#issuecomment-1165218655 Thanks @jpountz for the suggestion and also providing the bulk scorer implementation! The result looks pretty impressive as well! I just tried `taskRepeatCount=200` with my implem

[GitHub] [lucene] dweiss commented on pull request #975: LUCENE-10626 Hunspell: add tools to aid dictionary editing

2022-06-23 Thread GitBox
dweiss commented on PR #975: URL: https://github.com/apache/lucene/pull/975#issuecomment-1165237503 Hi Peter! I'll take a look later today - it's end-of-school in Poland today and it's a bit hectic.