Re: [PR] TaskExecutor should not fork unnecessarily [lucene]

2024-07-04 Thread via GitHub
javanna commented on code in PR #13472: URL: https://github.com/apache/lucene/pull/13472#discussion_r1665263601 ## lucene/core/src/test/org/apache/lucene/search/TestTaskExecutor.java: ## @@ -251,14 +271,15 @@ public void testInvokeAllDoesNotLeaveTasksBehind() { for (int i =

Re: [PR] TaskExecutor should not fork unnecessarily [lucene]

2024-07-04 Thread via GitHub
javanna commented on code in PR #13472: URL: https://github.com/apache/lucene/pull/13472#discussion_r1665266009 ## lucene/CHANGES.txt: ## @@ -277,6 +277,15 @@ Optimizations * GITHUB#12941: Don't preserve auxiliary buffer contents in LSBRadixSorter if it grows. (Stefan Vodita

Re: [PR] TaskExecutor should not fork unnecessarily [lucene]

2024-07-04 Thread via GitHub
original-brownbear commented on code in PR #13472: URL: https://github.com/apache/lucene/pull/13472#discussion_r1665341009 ## lucene/CHANGES.txt: ## @@ -277,6 +277,15 @@ Optimizations * GITHUB#12941: Don't preserve auxiliary buffer contents in LSBRadixSorter if it grows. (St

Re: [PR] TaskExecutor should not fork unnecessarily [lucene]

2024-07-04 Thread via GitHub
javanna merged PR #13472: URL: https://github.com/apache/lucene/pull/13472 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apa

[PR] Add XNOR in FixedBitSet. [lucene]

2024-07-04 Thread via GitHub
vsop-479 opened a new pull request, #13540: URL: https://github.com/apache/lucene/pull/13540 ### Description Trying to get `softLiveDocs` from `liveDocs` and `hardLiveDocs` by using XNOR.
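
As a rough illustration of the XNOR idea above, here is a minimal sketch on plain `long[]` word arrays. It does not use the `FixedBitSet` API, and the class name, method names, and the trailing-bit masking are assumptions made for this example, not code from the PR. XNOR sets a bit wherever the two inputs agree, so a document that is set in `hardLiveDocs` but cleared in `liveDocs` (that is, soft-deleted) ends up cleared in the result.

```java
final class XnorSketch {
  /**
   * Word-wise XNOR: bit i of the result is set iff a and b agree at bit i.
   * Assumes both arrays hold exactly ceil(numBits / 64) words (an assumption of this sketch).
   */
  static long[] xnor(long[] a, long[] b, int numBits) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("word arrays must have the same length");
    }
    long[] out = new long[a.length];
    for (int i = 0; i < a.length; i++) {
      out[i] = ~(a[i] ^ b[i]);
    }
    // XNOR turns the unused high bits of the last word into ones; clear them again.
    int rem = numBits & 63;
    if (rem != 0 && out.length > 0) {
      out[out.length - 1] &= (1L << rem) - 1;
    }
    return out;
  }

  /** Reads a single bit from a word array. */
  static boolean get(long[] bits, int index) {
    return (bits[index >> 6] & (1L << index)) != 0;
  }

  public static void main(String[] args) {
    long[] liveDocs = {0b0101L};     // docs 0 and 2 are live
    long[] hardLiveDocs = {0b0111L}; // doc 3 is hard-deleted, so doc 1 is soft-deleted
    long[] result = xnor(liveDocs, hardLiveDocs, 4);
    System.out.println(Long.toBinaryString(result[0]));
  }
}
```

Whether the resulting bit pattern matches what the PR calls `softLiveDocs` depends on how that set is defined there; the sketch only shows the bitwise operation itself.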

Re: [PR] TaskExecutor should not fork unnecessarily [lucene]

2024-07-04 Thread via GitHub
javanna commented on PR #13472: URL: https://github.com/apache/lucene/pull/13472#issuecomment-2208500325 Thanks @original-brownbear !

[PR] [9.x] TaskExecutor should not fork unnecessarily (#13472) [lucene]

2024-07-04 Thread via GitHub
javanna opened a new pull request, #13541: URL: https://github.com/apache/lucene/pull/13541 When an executor is provided to the IndexSearcher constructor, the searcher now executes tasks on the thread that invoked a search as well as its configured executor. Users should reduce the executor
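
The behaviour described above, where the calling thread takes part in the work instead of only waiting, can be sketched with plain `java.util.concurrent`. This is not Lucene's `TaskExecutor`; the class name `CallerRunsSketch` and the simple "fork all but the last task" policy are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executor;
import java.util.concurrent.FutureTask;

final class CallerRunsSketch {
  /**
   * Runs all tasks and returns their results in order.
   * All but the last task are handed to the executor; the last one runs on the calling thread.
   */
  static <T> List<T> invokeAll(Executor executor, List<Callable<T>> tasks) {
    List<FutureTask<T>> futures = new ArrayList<>(tasks.size());
    for (Callable<T> task : tasks) {
      futures.add(new FutureTask<>(task));
    }
    // Fork every task except the last one.
    for (int i = 0; i < futures.size() - 1; i++) {
      executor.execute(futures.get(i));
    }
    // The caller would otherwise sit idle, so it runs the remaining task itself.
    if (!futures.isEmpty()) {
      futures.get(futures.size() - 1).run();
    }
    List<T> results = new ArrayList<>(futures.size());
    for (FutureTask<T> future : futures) {
      try {
        results.add(future.get());
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      } catch (ExecutionException e) {
        throw new IllegalStateException(e.getCause());
      }
    }
    return results;
  }
}
```

With a policy like this, a single task never touches the executor at all, and N tasks occupy at most N - 1 executor threads plus the caller.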

Re: [PR] Only search soft deleted in SoftDeletesRetentionMergePolicy.applyRetentionQuery [lucene]

2024-07-04 Thread via GitHub
vsop-479 commented on PR #13536: URL: https://github.com/apache/lucene/pull/13536#issuecomment-2208519800 I think we can get `softLiveDocs` from liveDocs and hardLiveDocs by using XNOR (https://github.com/apache/lucene/pull/13540), and use it to replace `wrapLiveDocs`. Or, we can re

Re: [PR] [9.x] TaskExecutor should not fork unnecessarily (#13472) [lucene]

2024-07-04 Thread via GitHub
javanna merged PR #13541: URL: https://github.com/apache/lucene/pull/13541

Re: [I] Significant drop in recall for 8 bit Scalar Quantizer [lucene]

2024-07-04 Thread via GitHub
benwtrent commented on issue #13519: URL: https://github.com/apache/lucene/issues/13519#issuecomment-2208862938 @naveentatikonda this is a WIP :). > Why are we setting SIGNED_CORRECTION as 127 instead of 128 ? My logic here was scaling by what we will round by. But thinking ab

Re: [I] Significant drop in recall for 8 bit Scalar Quantizer [lucene]

2024-07-04 Thread via GitHub
benwtrent commented on issue #13519: URL: https://github.com/apache/lucene/issues/13519#issuecomment-2208933244 I pushed a change to my branch. - Scales by 128 - Rounds to between -127, 127 - Applies a modification of the rounding error compensation. The idea behind how I applie
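
The first two steps listed above (scale by 128, then round into the symmetric range -127 to 127) can be sketched as follows. The assumption that the input is already normalized to roughly [-1, 1], and the class and method names, are made up for this example; the rounding error compensation mentioned in the comment is not shown.

```java
final class SignedByteQuantizeSketch {
  /**
   * Scales a value by 128, rounds to the nearest integer, and clamps to [-127, 127].
   * Assumes x is already normalized to about [-1, 1] (an assumption of this sketch).
   */
  static byte quantize(float x) {
    int scaled = Math.round(x * 128f);
    // Clamp symmetrically so that -128 is never produced.
    int clamped = Math.max(-127, Math.min(127, scaled));
    return (byte) clamped;
  }

  public static void main(String[] args) {
    for (float x : new float[] {-1f, -0.5f, 0f, 0.5f, 1f}) {
      System.out.println(x + " -> " + quantize(x));
    }
  }
}
```

Clamping symmetrically means the two extreme inputs map to -127 and 127, so the value -128 never appears in a quantized vector.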

Re: [PR] Use a confined Arena for IOContext.READONCE [lucene]

2024-07-04 Thread via GitHub
ChrisHegarty commented on PR #13535: URL: https://github.com/apache/lucene/pull/13535#issuecomment-2208974840 I updated clone and slice. It finds some issues. These look like incorrect usage of READONCE.

Re: [PR] Use a confined Arena for IOContext.READONCE [lucene]

2024-07-04 Thread via GitHub
ChrisHegarty commented on PR #13535: URL: https://github.com/apache/lucene/pull/13535#issuecomment-2208977180 ... there are still some intermittent failures because of this. I'll track them down so that we can discuss.

Re: [PR] Use a confined Arena for IOContext.READONCE [lucene]

2024-07-04 Thread via GitHub
uschindler commented on code in PR #13535: URL: https://github.com/apache/lucene/pull/13535#discussion_r1665740237 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInputProvider.java: ## @@ -45,7 +45,8 @@ public IndexInput openInput(Path path, IOContext contex

Re: [PR] Use a confined Arena for IOContext.READONCE [lucene]

2024-07-04 Thread via GitHub
uschindler commented on PR #13535: URL: https://github.com/apache/lucene/pull/13535#issuecomment-2209054786 What intermittent failures did you see?

Re: [PR] Use a confined Arena for IOContext.READONCE [lucene]

2024-07-04 Thread via GitHub
uschindler commented on PR #13535: URL: https://github.com/apache/lucene/pull/13535#issuecomment-2209086060 To me those problems where a CFS file was opened with READONCE look like a bug. It is also unlikely that the CFS file is just opened once and then closed again. If we want to k

Re: [PR] Use a confined Arena for IOContext.READONCE [lucene]

2024-07-04 Thread via GitHub
uschindler commented on PR #13535: URL: https://github.com/apache/lucene/pull/13535#issuecomment-2209087721 > To me those problems where a CFS file was opened with READONCE look like a bug. It is also unlikely that the CFS file is just opened once and then closed again. > > If we wan

Re: [PR] Use a confined Arena for IOContext.READONCE [lucene]

2024-07-04 Thread via GitHub
uschindler commented on PR #13535: URL: https://github.com/apache/lucene/pull/13535#issuecomment-2209152707 Hi, thinking about this some more: maybe we should for now just disallow clones, but not slices. Slices are needed for CFS files. But nevertheless the code affected by the change

[PR] WIP: draft of intra segment concurrency [lucene]

2024-07-04 Thread via GitHub
javanna opened a new pull request, #13542: URL: https://github.com/apache/lucene/pull/13542 I experimented with introducing intra-segment concurrency in Lucene, by leveraging the existing `Scorer#score` method that takes a range of doc ids as argument, and switching the searcher to call that

Re: [I] Decouple within-query concurrency from the index's segment geometry [LUCENE-8675] [lucene]

2024-07-04 Thread via GitHub
javanna commented on issue #9721: URL: https://github.com/apache/lucene/issues/9721#issuecomment-2209168739 I opened an initial draft of my take on intra-segment concurrency (#13542). It needs quite a bit of work and discussion, but I hope it helps as a start, hopefully getting intra-segment

Re: [PR] WIP: draft of intra segment concurrency [lucene]

2024-07-04 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1665816216 ## lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java: ## @@ -362,6 +362,9 @@ public long cost() { final IntersectVisitor visitor = getInt

Re: [PR] WIP: draft of intra segment concurrency [lucene]

2024-07-04 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1665818254 ## lucene/core/src/test/org/apache/lucene/index/TestSegmentToThreadMapping.java: ## @@ -160,83 +155,133 @@ public CacheHelper getReaderCacheHelper() { }; } -

Re: [PR] WIP: draft of intra segment concurrency [lucene]

2024-07-04 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1665818950 ## lucene/core/src/test/org/apache/lucene/search/TestSortRandom.java: ## @@ -119,7 +119,8 @@ private void testRandomStringSort(SortField.Type type) throws Exception {

Re: [PR] Use a confined Arena for IOContext.READONCE [lucene]

2024-07-04 Thread via GitHub
uschindler commented on PR #13535: URL: https://github.com/apache/lucene/pull/13535#issuecomment-2209186177 Thanks, looks fine now. I think we can revert the changes to the SegmentReader and look into this in another issue. It still looks strange to me, but I had no time to look closely in

Re: [PR] Use a confined Arena for IOContext.READONCE [lucene]

2024-07-04 Thread via GitHub
uschindler commented on code in PR #13535: URL: https://github.com/apache/lucene/pull/13535#discussion_r1665828693 ## lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java: ## @@ -578,7 +578,8 @@ public synchronized boolean writeFieldUpdates( // IndexWriter.

Re: [PR] Use a confined Arena for IOContext.READONCE [lucene]

2024-07-04 Thread via GitHub
uschindler commented on PR #13535: URL: https://github.com/apache/lucene/pull/13535#issuecomment-2209201436 > Thanks, looks fine now. I think we can revert the changes to the SegmentReader and look into this in another issue. It still looks strange to me, but I had no time to look closely

Re: [PR] Use a confined Arena for IOContext.READONCE [lucene]

2024-07-04 Thread via GitHub
uschindler commented on code in PR #13535: URL: https://github.com/apache/lucene/pull/13535#discussion_r1665836026 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -591,15 +611,17 @@ MemorySegmentIndexInput buildSlice(String sliceDescription

Re: [PR] Use a confined Arena for IOContext.READONCE [lucene]

2024-07-04 Thread via GitHub
ChrisHegarty commented on code in PR #13535: URL: https://github.com/apache/lucene/pull/13535#discussion_r1665858393 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -591,15 +611,17 @@ MemorySegmentIndexInput buildSlice(String sliceDescripti

Re: [PR] WIP: draft of intra segment concurrency [lucene]

2024-07-04 Thread via GitHub
stefanvodita commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1665869563 ## lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java: ## @@ -328,42 +336,65 @@ protected LeafSlice[] slices(List leaves) { /** Static method to

Re: [PR] WIP: draft of intra segment concurrency [lucene]

2024-07-04 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1665874144 ## lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java: ## @@ -328,42 +336,65 @@ protected LeafSlice[] slices(List leaves) { /** Static method to segr

Re: [PR] Use a confined Arena for IOContext.READONCE [lucene]

2024-07-04 Thread via GitHub
ChrisHegarty commented on code in PR #13535: URL: https://github.com/apache/lucene/pull/13535#discussion_r1665875714 ## lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java: ## @@ -578,7 +578,8 @@ public synchronized boolean writeFieldUpdates( // IndexWrite

Re: [PR] Use a confined Arena for IOContext.READONCE [lucene]

2024-07-04 Thread via GitHub
uschindler commented on code in PR #13535: URL: https://github.com/apache/lucene/pull/13535#discussion_r1665893332 ## lucene/core/src/java/org/apache/lucene/index/ReadersAndUpdates.java: ## @@ -578,7 +578,8 @@ public synchronized boolean writeFieldUpdates( // IndexWriter.

[PR] Override single byte writes to OutputStreamIndexOutput to remove locking [lucene]

2024-07-04 Thread via GitHub
original-brownbear opened a new pull request, #13543: URL: https://github.com/apache/lucene/pull/13543 Single byte writes to BufferedOutputStream show up pretty hot in indexing benchmarks. We can save the locking overhead introduced by JEP374 by overriding and providing a no-lock fast path.
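
To make the idea above concrete, here is a minimal sketch of a `BufferedOutputStream` subclass whose single-byte `write` skips the lock when the buffer has room. It relies only on the protected `buf` and `count` fields of `java.io.BufferedOutputStream`. The class name `UnsyncBufferedOutputStream` and the `roundTrip` helper are hypothetical and not taken from the PR, and the fast path is safe only when a single thread writes to the stream.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

final class UnsyncBufferedOutputStream extends BufferedOutputStream {
  UnsyncBufferedOutputStream(OutputStream out, int bufferSize) {
    super(out, bufferSize);
  }

  @Override
  public void write(int b) throws IOException {
    if (count < buf.length) {
      // Fast path: store the byte directly, without taking any lock.
      buf[count++] = (byte) b;
    } else {
      // Buffer is full: let the superclass flush it and store the byte.
      super.write(b);
    }
  }

  /** Writes the bytes 0..n-1 one at a time and returns what reached the underlying stream. */
  static byte[] roundTrip(int n, int bufferSize) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (UnsyncBufferedOutputStream out = new UnsyncBufferedOutputStream(sink, bufferSize)) {
      for (int i = 0; i < n; i++) {
        out.write(i);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return sink.toByteArray();
  }
}
```

Bulk writes and `flush()` still go through the superclass unchanged, so only the hot single-byte path is affected.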

Re: [PR] Convert more classes to record classes [lucene]

2024-07-04 Thread via GitHub
shubhamvishu commented on PR #13328: URL: https://github.com/apache/lucene/pull/13328#issuecomment-2209345556 Hi @uschindler, Could you please review the current code changes once you get some time and if it looks good maybe we can move this forward?

Re: [I] Significant drop in recall for 8 bit Scalar Quantizer [lucene]

2024-07-04 Thread via GitHub
naveentatikonda commented on issue #13519: URL: https://github.com/apache/lucene/issues/13519#issuecomment-2209645562 > I disagree. We want `-127, 127`. Otherwise we have to handle the exotic `-128*-128` case when doing vector comparisons and this disallows some nice SIMD optimizations dow

Re: [I] WordBreakSpellChecker.generateBreakUpSuggestions() should do breadth first search [lucene]

2024-07-04 Thread via GitHub
hossman commented on issue #12100: URL: https://github.com/apache/lucene/issues/12100#issuecomment-2209648298 I was reminded of this issue recently, and worked up a patch with the improved algorithm and a new test case that shows how even with a lot of candidate terms in the index, and a l

Re: [PR] Inter-segment I/O concurrency. [lucene]

2024-07-04 Thread via GitHub
github-actions[bot] commented on PR #13509: URL: https://github.com/apache/lucene/pull/13509#issuecomment-2209661389 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] Only check liveDocs is null one time in FreqProxTermsWriter.applyDeletes [lucene]

2024-07-04 Thread via GitHub
github-actions[bot] commented on PR #13506: URL: https://github.com/apache/lucene/pull/13506#issuecomment-2209661406 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] Minor cleanup in some Facet tests [lucene]

2024-07-04 Thread via GitHub
github-actions[bot] commented on PR #13489: URL: https://github.com/apache/lucene/pull/13489#issuecomment-2209661435 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi