[GitHub] [lucene] mocobeta commented on pull request #927: LUCENE-10151: Adding Timeout Support to IndexSearcher

2022-06-02 Thread GitBox
mocobeta commented on PR #927: URL: https://github.com/apache/lucene/pull/927#issuecomment-1145541063 > I would like to understand the performance impact of this wrapping. If you have time, it would be great to get some results from https://github.com/mikemccand/luceneutil benchmarking util

[GitHub] [lucene] mocobeta commented on a diff in pull request #927: LUCENE-10151: Adding Timeout Support to IndexSearcher

2022-06-02 Thread GitBox
mocobeta commented on code in PR #927: URL: https://github.com/apache/lucene/pull/927#discussion_r888569668 ## lucene/core/src/java/org/apache/lucene/search/TimeLimitingBulkScorer.java: ## @@ -0,0 +1,66 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or mor

[GitHub] [lucene] mocobeta commented on a diff in pull request #927: LUCENE-10151: Adding Timeout Support to IndexSearcher

2022-06-02 Thread GitBox
mocobeta commented on code in PR #927: URL: https://github.com/apache/lucene/pull/927#discussion_r888567989 ## lucene/core/src/java/org/apache/lucene/search/TimeLimitingBulkScorer.java: ## @@ -0,0 +1,66 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or mor

[GitHub] [lucene] pminkov commented on pull request #940: Use similarity.tf() in MoreLikeThis

2022-06-02 Thread GitBox
pminkov commented on PR #940: URL: https://github.com/apache/lucene/pull/940#issuecomment-1145379881 I created a branch with some analysis of what happens, it's [here](https://github.com/pminkov/lucene/commit/25c5ea4c12d92b8f534d40e449509a327ab6eea9). The code is a bit hacky, sorry.

[GitHub] [lucene] jtibshirani commented on a diff in pull request #932: LUCENE-10559: Add Prefilter Option to KnnGraphTester

2022-06-02 Thread GitBox
jtibshirani commented on code in PR #932: URL: https://github.com/apache/lucene/pull/932#discussion_r888247099 ## lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java: ## @@ -225,6 +225,11 @@ public BitSetIterator getIterator(int contextOrd) { return new BitS

[GitHub] [lucene] jtibshirani commented on a diff in pull request #932: LUCENE-10559: Add Prefilter Option to KnnGraphTester

2022-06-02 Thread GitBox
jtibshirani commented on code in PR #932: URL: https://github.com/apache/lucene/pull/932#discussion_r888237781 ## lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java: ## @@ -225,6 +225,11 @@ public BitSetIterator getIterator(int contextOrd) { return new BitS

[GitHub] [lucene] msokolov closed pull request #913: Lucene 10577

2022-06-02 Thread GitBox
msokolov closed pull request #913: Lucene 10577 URL: https://github.com/apache/lucene/pull/913 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-un

[jira] [Commented] (LUCENE-10236) CombinedFieldsQuery to use fieldAndWeights.values() when constructing MultiNormsLeafSimScorer for scoring

2022-06-02 Thread David Smiley (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545527#comment-17545527 ] David Smiley commented on LUCENE-10236: --- If this is an "improvement", then I think

[GitHub] [lucene] mikemccand commented on pull request #633: LUCENE-10216: Use MergeScheduler and MergePolicy to run addIndexes(CodecReader[]) merges.

2022-06-02 Thread GitBox
mikemccand commented on PR #633: URL: https://github.com/apache/lucene/pull/633#issuecomment-1145016933 > > I could either wrap the runningMerges update with a synchronized (IndexWriter.this) {}, or make runningMerges a synchronizedSet. I like the second approach as it automatically fixes t
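The two options named in the quoted comment above can be sketched roughly as follows. This is only an illustration under stated assumptions: `RunningMergesSketch`, its fields, and its methods are hypothetical names, plain strings stand in for merge objects, and none of this is IndexWriter's actual code.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration; not IndexWriter's actual implementation.
public class RunningMergesSketch {
  // Option 1: a plain set whose updates are guarded by an explicit monitor.
  private final Set<String> guardedByLock = new HashSet<>();

  // Option 2: a wrapper that synchronizes each individual call on its own.
  private final Set<String> selfSynchronized =
      Collections.synchronizedSet(new HashSet<>());

  void addWithExplicitLock(String merge) {
    synchronized (this) { // stands in for synchronized (IndexWriter.this)
      guardedByLock.add(merge);
    }
  }

  void addWithSynchronizedSet(String merge) {
    selfSynchronized.add(merge);
  }

  // Adds perThread distinct entries from each thread to both sets and
  // returns {size of the option 1 set, size of the option 2 set}.
  static int[] fillConcurrently(int threads, int perThread) {
    RunningMergesSketch sketch = new RunningMergesSketch();
    Thread[] workers = new Thread[threads];
    for (int t = 0; t < threads; t++) {
      final int id = t;
      workers[t] = new Thread(() -> {
        for (int i = 0; i < perThread; i++) {
          String merge = "merge-" + id + "-" + i;
          sketch.addWithExplicitLock(merge);
          sketch.addWithSynchronizedSet(merge);
        }
      });
      workers[t].start();
    }
    try {
      for (Thread worker : workers) {
        worker.join();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for workers", e);
    }
    synchronized (sketch) {
      return new int[] {sketch.guardedByLock.size(), sketch.selfSynchronized.size()};
    }
  }

  public static void main(String[] args) {
    int[] sizes = fillConcurrently(4, 1000);
    System.out.println(sizes[0] + " " + sizes[1]);
  }
}
```

One general caveat for the second option: `Collections.synchronizedSet` makes each single call atomic, but iterating over the set still requires synchronizing on it manually.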

[GitHub] [lucene] rmuir commented on pull request #940: Use similarity.tf() in MoreLikeThis

2022-06-02 Thread GitBox
rmuir commented on PR #940: URL: https://github.com/apache/lucene/pull/940#issuecomment-1144943399 > I think the problem is that we have no test corpus to measure the MLT search quality, so we can't directly know if taking square roots of raw term frequency improves the search quality. I'm

[GitHub] [lucene] mocobeta commented on pull request #941: LUCENE-10578: Fail build if java minor/patch version is not met the minimum requirement

2022-06-02 Thread GitBox
mocobeta commented on PR #941: URL: https://github.com/apache/lucene/pull/941#issuecomment-1144940195 I made it possible to specify minimum and maximum feature versions. It is nonsense in `main` for now but we'd need it for branch_9x as @rmuir pointed out in https://github.com/apache/lucen
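A check against a minimum and maximum feature version, as described in the comment above, might look roughly like this. It is a sketch under stated assumptions, not the code from PR #941: `FeatureVersionCheck`, `isSupported`, and `check` are hypothetical names, and the bounds 11 and 17 are placeholder values chosen only for the example.

```java
// Hypothetical sketch; not the actual WrapperDownloader code from the PR.
public class FeatureVersionCheck {
  // True when the feature (major) version lies within [minFeature, maxFeature].
  static boolean isSupported(int feature, int minFeature, int maxFeature) {
    return feature >= minFeature && feature <= maxFeature;
  }

  // Fails with an explanatory message when the running JVM is out of range.
  static void check(int minFeature, int maxFeature) {
    int feature = Runtime.version().feature(); // available since Java 10
    if (!isSupported(feature, minFeature, maxFeature)) {
      throw new IllegalStateException(
          "Java feature version " + feature + " is outside the supported range ["
              + minFeature + ", " + maxFeature + "]");
    }
  }

  public static void main(String[] args) {
    // Placeholder bounds, assumed for illustration only.
    System.out.println(isSupported(17, 11, 17)); // within range
    System.out.println(isSupported(18, 11, 17)); // above the maximum
    // A lower bound with an open upper bound, evaluated against the running JVM.
    check(11, Integer.MAX_VALUE);
  }
}
```

Keeping the range test in a pure function separate from the `Runtime.version()` lookup makes the boundary cases easy to exercise without switching JDKs.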

[GitHub] [lucene] pminkov commented on pull request #940: Use similarity.tf() in MoreLikeThis

2022-06-02 Thread GitBox
pminkov commented on PR #940: URL: https://github.com/apache/lucene/pull/940#issuecomment-1144930676 Thanks for taking a look @mocobeta - your comment makes sense. I have a dataset and have noticed the problem on it. I'll create a short analysis with examples.

[GitHub] [lucene] rmuir commented on a diff in pull request #943: LUCENE-10578: Fail build if java minor/patch version is not met the minimum requirement (for 9x)

2022-06-02 Thread GitBox
rmuir commented on code in PR #943: URL: https://github.com/apache/lucene/pull/943#discussion_r887992569 ## buildSrc/src/main/java/org/apache/lucene/gradle/WrapperDownloader.java: ## @@ -39,20 +39,33 @@ * Has no dependencies outside of standard java libraries */ public clas

[GitHub] [lucene] msokolov commented on a diff in pull request #932: LUCENE-10559: Add Prefilter Option to KnnGraphTester

2022-06-02 Thread GitBox
msokolov commented on code in PR #932: URL: https://github.com/apache/lucene/pull/932#discussion_r887991586 ## lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java: ## @@ -225,6 +225,11 @@ public BitSetIterator getIterator(int contextOrd) { return new BitSetI

[GitHub] [lucene] mocobeta commented on a diff in pull request #943: LUCENE-10578: Fail build if java minor/patch version is not met the minimum requirement (for 9x)

2022-06-02 Thread GitBox
mocobeta commented on code in PR #943: URL: https://github.com/apache/lucene/pull/943#discussion_r887985781 ## buildSrc/src/main/java/org/apache/lucene/gradle/WrapperDownloader.java: ## @@ -39,20 +39,33 @@ * Has no dependencies outside of standard java libraries */ public c

[jira] [Commented] (LUCENE-10599) Improve LogMergePolicy's handling of maxMergeSize

2022-06-02 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545487#comment-17545487 ] Michael Sokolov commented on LUCENE-10599: -- I don't have any deep understandin

[GitHub] [lucene] rmuir commented on a diff in pull request #943: LUCENE-10578: Fail build if java minor/patch version is not met the minimum requirement (for 9x)

2022-06-02 Thread GitBox
rmuir commented on code in PR #943: URL: https://github.com/apache/lucene/pull/943#discussion_r887976742 ## buildSrc/src/main/java/org/apache/lucene/gradle/WrapperDownloader.java: ## @@ -39,20 +39,33 @@ * Has no dependencies outside of standard java libraries */ public clas

[GitHub] [lucene] mocobeta commented on pull request #943: LUCENE-10578: Fail build if java minor/patch version is not met the minimum requirement (for 9x)

2022-06-02 Thread GitBox
mocobeta commented on PR #943: URL: https://github.com/apache/lucene/pull/943#issuecomment-1144878824 ![Screenshot from 2022-06-02 22-37-38](https://user-images.githubusercontent.com/1825333/171642034-9da01f0c-2348-42bc-a08f-44114dd77ef9.png)

[GitHub] [lucene] mocobeta opened a new pull request, #943: LUCENE-10578: check java version in gradle wrapper downloader

2022-06-02 Thread GitBox
mocobeta opened a new pull request, #943: URL: https://github.com/apache/lucene/pull/943 ### Description (or a Jira issue link if you have one) This applies the same change in #941 to branch_9x. Cherry-picking from main may not work (we don't have `checkVersion()` in 9x).

[GitHub] [lucene] mocobeta commented on pull request #940: Use similarity.tf() in MoreLikeThis

2022-06-02 Thread GitBox
mocobeta commented on PR #940: URL: https://github.com/apache/lucene/pull/940#issuecomment-1144848214 I'm using MLT somewhere else, I'll try to apply this change there and see the result with real data if I have a chance. Meanwhile, other people who are more confident with this change can m

[GitHub] [lucene] mocobeta commented on pull request #940: Use similarity.tf() in MoreLikeThis

2022-06-02 Thread GitBox
mocobeta commented on PR #940: URL: https://github.com/apache/lucene/pull/940#issuecomment-1144837827 Thanks for taking a look at this. The change makes sense and looks consistent in the usage of TFIDFSimilarity to me. I think the problem is that we have no test corpus to measure the MLT se

[GitHub] [lucene] Deepika0510 commented on pull request #927: LUCENE-10151: Adding Timeout Support to IndexSearcher

2022-06-02 Thread GitBox
Deepika0510 commented on PR #927: URL: https://github.com/apache/lucene/pull/927#issuecomment-1144700233 > I'm not convinced it's best to add a `timeAllowed` to `search` methods. This is going to be invasive if we want to support timeouts on all IndexSearcher methods? > > My suggesti

[GitHub] [lucene] LuXugang opened a new pull request, #942: LUCENE-10598: Use count to record docValueCount similar to SortedNumericDocValues did

2022-06-02 Thread GitBox
LuXugang opened a new pull request, #942: URL: https://github.com/apache/lucene/pull/942 Add a count used to record docValueCount in Lucene80DocValuesProducer#getSortedSet, similar to what SortedNumericDocValues did.

[jira] [Commented] (LUCENE-10598) SortedSetDocValues#docValueCount() should be always greater than zero

2022-06-02 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545357#comment-17545357 ] Adrien Grand commented on LUCENE-10598: --- +1 > SortedSetDocValues#docValueCount()

[jira] [Commented] (LUCENE-10598) SortedSetDocValues#docValueCount() should be always greater than zero

2022-06-02 Thread Lu Xugang (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545356#comment-17545356 ] Lu Xugang commented on LUCENE-10598: {quote}Maybe we should enhance CheckIndex to

[GitHub] [lucene] mocobeta commented on pull request #941: LUCENE-10578: Fail build if java minor/patch version is not met the minimum requirement

2022-06-02 Thread GitBox
mocobeta commented on PR #941: URL: https://github.com/apache/lucene/pull/941#issuecomment-1144595521 I have clarified the pull request title for a heads-up.

[GitHub] [lucene] kaivalnp commented on a diff in pull request #932: LUCENE-10559: Add Prefilter Option to KnnGraphTester

2022-06-02 Thread GitBox
kaivalnp commented on code in PR #932: URL: https://github.com/apache/lucene/pull/932#discussion_r887687260 ## lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java: ## @@ -225,6 +225,11 @@ public BitSetIterator getIterator(int contextOrd) { return new BitSetI

[jira] [Commented] (LUCENE-10598) SortedSetDocValues#docValueCount() should be always greater than zero

2022-06-02 Thread Lu Xugang (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545318#comment-17545318 ] Lu Xugang commented on LUCENE-10598: [~jpountz] Exactly, I left some comment in

[jira] [Commented] (LUCENE-10598) SortedSetDocValues#docValueCount() should be always greater than zero

2022-06-02 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545313#comment-17545313 ] Adrien Grand commented on LUCENE-10598: --- [~ChrisLu] I'm noticing that some implem

[jira] [Commented] (LUCENE-10598) SortedSetDocValues#docValueCount() should be always greater than zero

2022-06-02 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545309#comment-17545309 ] ASF subversion and git services commented on LUCENE-10598: -- Co

[jira] [Commented] (LUCENE-10598) SortedSetDocValues#docValueCount() should be always greater than zero

2022-06-02 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545308#comment-17545308 ] ASF subversion and git services commented on LUCENE-10598: -- Co

[GitHub] [lucene] dweiss commented on a diff in pull request #941: LUCENE-10578: check java minor/patch version when building

2022-06-02 Thread GitBox
dweiss commented on code in PR #941: URL: https://github.com/apache/lucene/pull/941#discussion_r887650993 ## buildSrc/src/main/java/org/apache/lucene/gradle/WrapperDownloader.java: ## @@ -64,15 +57,13 @@ public static void main(String[] args) { } public static void check

[GitHub] [lucene] kaivalnp commented on a diff in pull request #932: LUCENE-10559: Add Prefilter Option to KnnGraphTester

2022-06-02 Thread GitBox
kaivalnp commented on code in PR #932: URL: https://github.com/apache/lucene/pull/932#discussion_r887637829 ## lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java: ## @@ -225,6 +225,11 @@ public BitSetIterator getIterator(int contextOrd) { return new BitSetI

[GitHub] [lucene] mocobeta commented on a diff in pull request #941: LUCENE-10578: check java minor/patch version when building

2022-06-02 Thread GitBox
mocobeta commented on code in PR #941: URL: https://github.com/apache/lucene/pull/941#discussion_r887625886 ## buildSrc/src/main/java/org/apache/lucene/gradle/WrapperDownloader.java: ## @@ -55,9 +64,18 @@ public static void main(String[] args) { } public static void chec
