[GitHub] [lucene] mocobeta commented on a diff in pull request #941: LUCENE-10578: check java minor/patch version when building

2022-06-02 Thread GitBox
mocobeta commented on code in PR #941: URL: https://github.com/apache/lucene/pull/941#discussion_r887625886 ## buildSrc/src/main/java/org/apache/lucene/gradle/WrapperDownloader.java: ## @@ -55,9 +64,18 @@ public static void main(String[] args) { } public static void chec

[GitHub] [lucene] kaivalnp commented on a diff in pull request #932: LUCENE-10559: Add Prefilter Option to KnnGraphTester

2022-06-02 Thread GitBox
kaivalnp commented on code in PR #932: URL: https://github.com/apache/lucene/pull/932#discussion_r887637829 ## lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java: ## @@ -225,6 +225,11 @@ public BitSetIterator getIterator(int contextOrd) { return new BitSetI

[GitHub] [lucene] dweiss commented on a diff in pull request #941: LUCENE-10578: check java minor/patch version when building

2022-06-02 Thread GitBox
dweiss commented on code in PR #941: URL: https://github.com/apache/lucene/pull/941#discussion_r887650993 ## buildSrc/src/main/java/org/apache/lucene/gradle/WrapperDownloader.java: ## @@ -64,15 +57,13 @@ public static void main(String[] args) { } public static void check

[jira] [Commented] (LUCENE-10598) SortedSetDocValues#docValueCount() should be always greater than zero

2022-06-02 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545309#comment-17545309 ] ASF subversion and git services commented on LUCENE-10598: -- Co

[jira] [Commented] (LUCENE-10598) SortedSetDocValues#docValueCount() should be always greater than zero

2022-06-02 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545308#comment-17545308 ] ASF subversion and git services commented on LUCENE-10598: -- Co

[jira] [Commented] (LUCENE-10598) SortedSetDocValues#docValueCount() should be always greater than zero

2022-06-02 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545313#comment-17545313 ] Adrien Grand commented on LUCENE-10598: --- [~ChrisLu] I'm noticing that some implem

[jira] [Commented] (LUCENE-10598) SortedSetDocValues#docValueCount() should be always greater than zero

2022-06-02 Thread Lu Xugang (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545318#comment-17545318 ] Lu Xugang commented on LUCENE-10598: [~jpountz] Exactly, I left some comment in

[GitHub] [lucene] kaivalnp commented on a diff in pull request #932: LUCENE-10559: Add Prefilter Option to KnnGraphTester

2022-06-02 Thread GitBox
kaivalnp commented on code in PR #932: URL: https://github.com/apache/lucene/pull/932#discussion_r887687260 ## lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java: ## @@ -225,6 +225,11 @@ public BitSetIterator getIterator(int contextOrd) { return new BitSetI

[GitHub] [lucene] mocobeta commented on pull request #941: LUCENE-10578: Fail build if java minor/patch version is not met the minimum requirement

2022-06-02 Thread GitBox
mocobeta commented on PR #941: URL: https://github.com/apache/lucene/pull/941#issuecomment-1144595521 I have clarified the pull request title for a heads-up. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[jira] [Commented] (LUCENE-10598) SortedSetDocValues#docValueCount() should be always greater than zero

2022-06-02 Thread Lu Xugang (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545356#comment-17545356 ] Lu Xugang commented on LUCENE-10598: {quote}Maybe we should enhance CheckIndex to

[jira] [Commented] (LUCENE-10598) SortedSetDocValues#docValueCount() should be always greater than zero

2022-06-02 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545357#comment-17545357 ] Adrien Grand commented on LUCENE-10598: --- +1 > SortedSetDocValues#docValueCount()

[GitHub] [lucene] LuXugang opened a new pull request, #942: LUCENE-10598: Use count to record docValueCount similar to SortedNumericDocValues did

2022-06-02 Thread GitBox
LuXugang opened a new pull request, #942: URL: https://github.com/apache/lucene/pull/942 Add a count used to record docValueCount in Lucene80DocValuesProducer#getSortedSet, similar to what SortedNumericDocValues did.

[GitHub] [lucene] Deepika0510 commented on pull request #927: LUCENE-10151: Adding Timeout Support to IndexSearcher

2022-06-02 Thread GitBox
Deepika0510 commented on PR #927: URL: https://github.com/apache/lucene/pull/927#issuecomment-1144700233 > I'm not convinced it's best to add a `timeAllowed` to `search` methods. This is going to be invasive if we want to support timeouts on all IndexSearcher methods? > > My suggesti

[GitHub] [lucene] mocobeta commented on pull request #940: Use similarity.tf() in MoreLikeThis

2022-06-02 Thread GitBox
mocobeta commented on PR #940: URL: https://github.com/apache/lucene/pull/940#issuecomment-1144837827 Thanks for taking a look at this. The change makes sense and looks consistent in the usage of TFIDFSimilarity to me. I think the problem is that we have no test corpus to measure the MLT se

[GitHub] [lucene] mocobeta commented on pull request #940: Use similarity.tf() in MoreLikeThis

2022-06-02 Thread GitBox
mocobeta commented on PR #940: URL: https://github.com/apache/lucene/pull/940#issuecomment-1144848214 I'm using MLT somewhere else, I'll try to apply this change there and see the result with real data if I have a chance. Meanwhile, other people who are more confident with this change can m

[GitHub] [lucene] mocobeta opened a new pull request, #943: LUCENE-10578: check java version in gradle wrapper downloader

2022-06-02 Thread GitBox
mocobeta opened a new pull request, #943: URL: https://github.com/apache/lucene/pull/943 ### Description (or a Jira issue link if you have one) This applies the same change in #941 to branch_9x. Cherry-picking from main may not work (we don't have `checkVersion()` in 9x).

[GitHub] [lucene] mocobeta commented on pull request #943: LUCENE-10578: Fail build if java minor/patch version is not met the minimum requirement (for 9x)

2022-06-02 Thread GitBox
mocobeta commented on PR #943: URL: https://github.com/apache/lucene/pull/943#issuecomment-1144878824 ![Screenshot from 2022-06-02 22-37-38](https://user-images.githubusercontent.com/1825333/171642034-9da01f0c-2348-42bc-a08f-44114dd77ef9.png)

[GitHub] [lucene] rmuir commented on a diff in pull request #943: LUCENE-10578: Fail build if java minor/patch version is not met the minimum requirement (for 9x)

2022-06-02 Thread GitBox
rmuir commented on code in PR #943: URL: https://github.com/apache/lucene/pull/943#discussion_r887976742 ## buildSrc/src/main/java/org/apache/lucene/gradle/WrapperDownloader.java: ## @@ -39,20 +39,33 @@ * Has no dependencies outside of standard java libraries */ public clas

[jira] [Commented] (LUCENE-10599) Improve LogMergePolicy's handling of maxMergeSize

2022-06-02 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545487#comment-17545487 ] Michael Sokolov commented on LUCENE-10599: -- I don't have any deep understandin

[GitHub] [lucene] mocobeta commented on a diff in pull request #943: LUCENE-10578: Fail build if java minor/patch version is not met the minimum requirement (for 9x)

2022-06-02 Thread GitBox
mocobeta commented on code in PR #943: URL: https://github.com/apache/lucene/pull/943#discussion_r887985781 ## buildSrc/src/main/java/org/apache/lucene/gradle/WrapperDownloader.java: ## @@ -39,20 +39,33 @@ * Has no dependencies outside of standard java libraries */ public c

[GitHub] [lucene] msokolov commented on a diff in pull request #932: LUCENE-10559: Add Prefilter Option to KnnGraphTester

2022-06-02 Thread GitBox
msokolov commented on code in PR #932: URL: https://github.com/apache/lucene/pull/932#discussion_r887991586 ## lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java: ## @@ -225,6 +225,11 @@ public BitSetIterator getIterator(int contextOrd) { return new BitSetI

[GitHub] [lucene] rmuir commented on a diff in pull request #943: LUCENE-10578: Fail build if java minor/patch version is not met the minimum requirement (for 9x)

2022-06-02 Thread GitBox
rmuir commented on code in PR #943: URL: https://github.com/apache/lucene/pull/943#discussion_r887992569 ## buildSrc/src/main/java/org/apache/lucene/gradle/WrapperDownloader.java: ## @@ -39,20 +39,33 @@ * Has no dependencies outside of standard java libraries */ public clas

[GitHub] [lucene] pminkov commented on pull request #940: Use similarity.tf() in MoreLikeThis

2022-06-02 Thread GitBox
pminkov commented on PR #940: URL: https://github.com/apache/lucene/pull/940#issuecomment-1144930676 Thanks for taking a look @mocobeta - your comment makes sense. I have a dataset and have noticed the problem on it. I'll create a short analysis with examples.

[GitHub] [lucene] mocobeta commented on pull request #941: LUCENE-10578: Fail build if java minor/patch version is not met the minimum requirement

2022-06-02 Thread GitBox
mocobeta commented on PR #941: URL: https://github.com/apache/lucene/pull/941#issuecomment-1144940195 I made it possible to specify minimum and maximum feature versions. It is nonsense in `main` for now but we'd need it for branch_9x as @rmuir pointed out in https://github.com/apache/lucen

[GitHub] [lucene] rmuir commented on pull request #940: Use similarity.tf() in MoreLikeThis

2022-06-02 Thread GitBox
rmuir commented on PR #940: URL: https://github.com/apache/lucene/pull/940#issuecomment-1144943399 > I think the problem is that we have no test corpus to measure the MLT search quality, so we can't directly know if taking square roots of raw term frequency improves the search quality. I'm

[GitHub] [lucene] mikemccand commented on pull request #633: LUCENE-10216: Use MergeScheduler and MergePolicy to run addIndexes(CodecReader[]) merges.

2022-06-02 Thread GitBox
mikemccand commented on PR #633: URL: https://github.com/apache/lucene/pull/633#issuecomment-1145016933 > > I could either wrap the runningMerges update with a synchronized (IndexWriter.this) {}, or make runningMerges a synchronizedSet. I like the second approach as it automatically fixes t

[jira] [Commented] (LUCENE-10236) CombinedFieldsQuery to use fieldAndWeights.values() when constructing MultiNormsLeafSimScorer for scoring

2022-06-02 Thread David Smiley (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545527#comment-17545527 ] David Smiley commented on LUCENE-10236: --- If this is a "improvement", then I think

[GitHub] [lucene] msokolov closed pull request #913: Lucene 10577

2022-06-02 Thread GitBox
msokolov closed pull request #913: Lucene 10577 URL: https://github.com/apache/lucene/pull/913

[GitHub] [lucene] jtibshirani commented on a diff in pull request #932: LUCENE-10559: Add Prefilter Option to KnnGraphTester

2022-06-02 Thread GitBox
jtibshirani commented on code in PR #932: URL: https://github.com/apache/lucene/pull/932#discussion_r888237781 ## lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java: ## @@ -225,6 +225,11 @@ public BitSetIterator getIterator(int contextOrd) { return new BitS

[GitHub] [lucene] jtibshirani commented on a diff in pull request #932: LUCENE-10559: Add Prefilter Option to KnnGraphTester

2022-06-02 Thread GitBox
jtibshirani commented on code in PR #932: URL: https://github.com/apache/lucene/pull/932#discussion_r888247099 ## lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java: ## @@ -225,6 +225,11 @@ public BitSetIterator getIterator(int contextOrd) { return new BitS

[GitHub] [lucene] pminkov commented on pull request #940: Use similarity.tf() in MoreLikeThis

2022-06-02 Thread GitBox
pminkov commented on PR #940: URL: https://github.com/apache/lucene/pull/940#issuecomment-1145379881 I created a branch with some analysis of what happens, it's [here](https://github.com/pminkov/lucene/commit/25c5ea4c12d92b8f534d40e449509a327ab6eea9). The code is a bit hacky, sorry.

[GitHub] [lucene] mocobeta commented on a diff in pull request #927: LUCENE-10151: Adding Timeout Support to IndexSearcher

2022-06-02 Thread GitBox
mocobeta commented on code in PR #927: URL: https://github.com/apache/lucene/pull/927#discussion_r888567989 ## lucene/core/src/java/org/apache/lucene/search/TimeLimitingBulkScorer.java: ## @@ -0,0 +1,66 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or mor

[GitHub] [lucene] mocobeta commented on a diff in pull request #927: LUCENE-10151: Adding Timeout Support to IndexSearcher

2022-06-02 Thread GitBox
mocobeta commented on code in PR #927: URL: https://github.com/apache/lucene/pull/927#discussion_r888569668 ## lucene/core/src/java/org/apache/lucene/search/TimeLimitingBulkScorer.java: ## @@ -0,0 +1,66 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or mor

[GitHub] [lucene] mocobeta commented on pull request #927: LUCENE-10151: Adding Timeout Support to IndexSearcher

2022-06-02 Thread GitBox
mocobeta commented on PR #927: URL: https://github.com/apache/lucene/pull/927#issuecomment-1145541063 > I would like to understand the performance impact of this wrapping. If you have time, it would be great to get some results from https://github.com/mikemccand/luceneutil benchmarking util