Re: [PR] Add BitVectors format and make flat vectors format easier to extend [lucene]

2024-04-11 Thread via GitHub
jpountz commented on code in PR #13288: URL: https://github.com/apache/lucene/pull/13288#discussion_r1562101541 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/bitvectors/HnswBitVectorsFormat.java: ## @@ -0,0 +1,211 @@ +/* + * Licensed to the Apache Software Foundati

Re: [PR] Binary search all terms. [lucene]

2024-04-11 Thread via GitHub
github-actions[bot] commented on PR #13192: URL: https://github.com/apache/lucene/pull/13192#issuecomment-2050752083 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] LUCENE-4056: Japanese Tokenizer (Kuromoji) cannot build UniDic dictionary [lucene]

2024-04-11 Thread via GitHub
github-actions[bot] commented on PR #12517: URL: https://github.com/apache/lucene/pull/12517#issuecomment-2050752715 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] Made the UnifiedHighlighter's hasUnrecognizedQuery function process FunctionQuery the same way as MatchAllDocsQuery and MatchNoDocsQuery queries for performance reasons. [lucene]

2024-04-11 Thread via GitHub
github-actions[bot] commented on PR #13165: URL: https://github.com/apache/lucene/pull/13165#issuecomment-2050752122 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] Backport #13202 to branch_9x [lucene]

2024-04-11 Thread via GitHub
benwtrent commented on code in PR #13295: URL: https://github.com/apache/lucene/pull/13295#discussion_r1561781640 ## lucene/core/src/test/org/apache/lucene/search/BaseKnnVectorQueryTestCase.java: ## @@ -797,6 +800,77 @@ public void testBitSetQuery() throws IOException { }

Re: [PR] Backport #13202 to branch_9x [lucene]

2024-04-11 Thread via GitHub
vigyasharma commented on PR #13295: URL: https://github.com/apache/lucene/pull/13295#issuecomment-2050540712 Getting the same test failure here. ```java > Task :lucene:core:compileTestJava /home/runner/work/lucene/lucene/lucene/core/src/test/org/apache/lucene/search/BaseKnnVecto

Re: [PR] Backport #13202 to branch_9x [lucene]

2024-04-11 Thread via GitHub
vigyasharma commented on PR #13295: URL: https://github.com/apache/lucene/pull/13295#issuecomment-2050384100 Now that the test failures have been fixed, I'm backporting #13202 to `branch_9x`. Running into some build issues on my local setup, which I think are unrelated to the commit. Opening

[PR] backport gh 13202 [lucene]

2024-04-11 Thread via GitHub
vigyasharma opened a new pull request, #13295: URL: https://github.com/apache/lucene/pull/13295 - Add timeout support to AbstractKnnVectorQuery (#13202) - Fix failing BaseKnnVectorQueryTestCase#testTimeout (#13283) -- This is an automated message from the Apache Git Service. To resp

Re: [PR] Increase the default number of merge threads. [lucene]

2024-04-11 Thread via GitHub
jpountz merged PR #13294: URL: https://github.com/apache/lucene/pull/13294

Re: [I] Merge rate limiting should allow for some bursting? [lucene]

2024-04-11 Thread via GitHub
jpountz closed issue #13193: Merge rate limiting should allow for some bursting? URL: https://github.com/apache/lucene/issues/13193

Re: [PR] Disable ConcurrentMergeScheduler's auto I/O throttling by default. [lucene]

2024-04-11 Thread via GitHub
jpountz merged PR #13293: URL: https://github.com/apache/lucene/pull/13293

Re: [PR] Add BitVectors format and make flat vectors format easier to extend [lucene]

2024-04-11 Thread via GitHub
jimczi commented on code in PR #13288: URL: https://github.com/apache/lucene/pull/13288#discussion_r1561065045 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/bitvectors/HnswBitVectorsFormat.java: ## @@ -0,0 +1,211 @@ +/* + * Licensed to the Apache Software Foundatio

Re: [PR] Add BitVectors format and make flat vectors format easier to extend [lucene]

2024-04-11 Thread via GitHub
benwtrent commented on code in PR #13288: URL: https://github.com/apache/lucene/pull/13288#discussion_r1560987003 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/bitvectors/HnswBitVectorsFormat.java: ## @@ -0,0 +1,211 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] Increase the default number of merge threads. [lucene]

2024-04-11 Thread via GitHub
rmuir commented on PR #13294: URL: https://github.com/apache/lucene/pull/13294#issuecomment-2049620950 I feel like dynamic thread pools never work well in java apps. I have to always set simple static fixedthreadpool everywhere for anyone's tomcat or jetty or anything else that I find, to a

Re: [PR] Increase the default number of merge threads. [lucene]

2024-04-11 Thread via GitHub
jpountz commented on PR #13294: URL: https://github.com/apache/lucene/pull/13294#issuecomment-2049613669 > Maybe in the future (after this change) we could think about a more adaptive approach that'd spin up additional merge threads if the merge cost/time is highish (many vectors, few store

Re: [PR] Add BitVectors format and make flat vectors format easier to extend [lucene]

2024-04-11 Thread via GitHub
jimczi commented on code in PR #13288: URL: https://github.com/apache/lucene/pull/13288#discussion_r1560956676 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/bitvectors/HnswBitVectorsFormat.java: ## @@ -0,0 +1,211 @@ +/* + * Licensed to the Apache Software Foundatio

Re: [PR] Add BitVectors format and make flat vectors format easier to extend [lucene]

2024-04-11 Thread via GitHub
benwtrent commented on code in PR #13288: URL: https://github.com/apache/lucene/pull/13288#discussion_r1560939497 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/bitvectors/HnswBitVectorsFormat.java: ## @@ -0,0 +1,211 @@ +/* + * Licensed to the Apache Software Founda

Re: [PR] [LUCENE-13044][replicator] NRT add configurable commitData for Custom… [lucene]

2024-04-11 Thread via GitHub
mikemccand commented on PR #13045: URL: https://github.com/apache/lucene/pull/13045#issuecomment-2049567118 I think this is a reasonable hook to add, but could you maybe add javadocs to the new protected method? Maybe something like: ``` /** Applications can subclass and overr

Re: [PR] Disable ConcurrentMergeScheduler's auto I/O throttling by default. [lucene]

2024-04-11 Thread via GitHub
uschindler commented on code in PR #13293: URL: https://github.com/apache/lucene/pull/13293#discussion_r1560921004 ## lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java: ## @@ -108,7 +108,7 @@ public class ConcurrentMergeScheduler extends MergeScheduler

Re: [PR] Remove unnecessary calculating for termLen. [lucene]

2024-04-11 Thread via GitHub
mikemccand merged PR #13291: URL: https://github.com/apache/lucene/pull/13291

Re: [PR] IndexWriter: Treat java.lang.Error as tragedy [lucene]

2024-04-11 Thread via GitHub
mikemccand commented on PR #13277: URL: https://github.com/apache/lucene/pull/13277#issuecomment-2049542084 > AssertionError is ok too I think, if a user wants to run with -ea, let's make it worth their while. Cool!

Re: [PR] Disable ConcurrentMergeScheduler's auto I/O throttling by default. [lucene]

2024-04-11 Thread via GitHub
jpountz commented on code in PR #13293: URL: https://github.com/apache/lucene/pull/13293#discussion_r1560906920 ## lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java: ## @@ -108,7 +108,7 @@ public class ConcurrentMergeScheduler extends MergeScheduler {

Re: [PR] Increase the default number of merge threads. [lucene]

2024-04-11 Thread via GitHub
mikemccand commented on PR #13294: URL: https://github.com/apache/lucene/pull/13294#issuecomment-2049448934 Maybe in the future (after this change) we could think about a more adaptive approach that'd spin up additional merge threads if the merge cost/time is highish (many vectors, few stor

Re: [PR] Increase the default number of merge threads. [lucene]

2024-04-11 Thread via GitHub
mikemccand commented on PR #13294: URL: https://github.com/apache/lucene/pull/13294#issuecomment-2049446971 > One goal of this change is to no longer have to configure a custom number of merge threads on nightly benchmarks, which run on a highly concurrent machine. +1, that's be grea

Re: [PR] Disable ConcurrentMergeScheduler's auto I/O throttling by default. [lucene]

2024-04-11 Thread via GitHub
mikemccand commented on code in PR #13293: URL: https://github.com/apache/lucene/pull/13293#discussion_r1560825330 ## lucene/core/src/java/org/apache/lucene/index/ConcurrentMergeScheduler.java: ## @@ -108,7 +108,7 @@ public class ConcurrentMergeScheduler extends MergeScheduler

Re: [PR] fix s/Long/Fixed in FixedBitSet javadocs [lucene]

2024-04-11 Thread via GitHub
cpoerschke merged PR #13290: URL: https://github.com/apache/lucene/pull/13290

[PR] Increase the default number of merge threads. [lucene]

2024-04-11 Thread via GitHub
jpountz opened a new pull request, #13294: URL: https://github.com/apache/lucene/pull/13294 You need as many merge threads as necessary to make sure that merges can keep up with indexing. But this number depends on the data that you are indexing: if you are only indexing stored fields, merg

[PR] Disable ConcurrentMergeScheduler's auto I/O throttling by default. [lucene]

2024-04-11 Thread via GitHub
jpountz opened a new pull request, #13293: URL: https://github.com/apache/lucene/pull/13293 Disable ConcurrentMergeScheduler's auto I/O throttling by default. This is motivated by the fact that merges can hardly steal all I/O resources from searches on modern NVMe drives. Merges

Re: [I] Remove deprecated code in `main` [lucene]

2024-04-11 Thread via GitHub
jpountz closed issue #13262: Remove deprecated code in `main` URL: https://github.com/apache/lucene/issues/13262

Re: [PR] Remove deprecated code [lucene]

2024-04-11 Thread via GitHub
jpountz merged PR #13286: URL: https://github.com/apache/lucene/pull/13286

Re: [PR] IndexWriter: Treat java.lang.Error as tragedy [lucene]

2024-04-11 Thread via GitHub
uschindler commented on PR #13277: URL: https://github.com/apache/lucene/pull/13277#issuecomment-2049057023 > * all subclasses of LinkageError read like "code is broken" to me. Yes, this is why the code around MMapDir and MethodHandles in general is correct and needs no change. errorp