Re: [PR] Make LRUQueryCache respect Accountable queries on eviction and consisten… [lucene]

2024-07-10 Thread via GitHub
jaebongim commented on PR #12614: URL: https://github.com/apache/lucene/pull/12614#issuecomment-013014 @gtroitskiy @romseygeek Is the bug fixed in Elasticsearch 8.12? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

Re: [PR] Check whether liveDoc is null out of loop in Weight.scoreAll [lucene]

2024-07-10 Thread via GitHub
vsop-479 commented on PR #13557: URL: https://github.com/apache/lucene/pull/13557#issuecomment-2221949176 @jpountz I measured it with luceneutil on wikimedium10m (columns: Task, QPS baseline, StdDev, QPS my_modified_version, StdDev, Pct diff, p-value)

Re: [PR] Add a `targetSearchConcurrency` parameter to `LogMergePolicy`. [lucene]

2024-07-10 Thread via GitHub
github-actions[bot] commented on PR #13517: URL: https://github.com/apache/lucene/pull/13517#issuecomment-2221754825 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] Add HnswGraphBuilder.getCompletedGraph() and record completed state [lucene]

2024-07-10 Thread via GitHub
msokolov commented on code in PR #13561: URL: https://github.com/apache/lucene/pull/13561#discussion_r1673111785 ## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswConcurrentMergeBuilder.java: ## @@ -84,7 +88,8 @@ public OnHeapHnswGraph build(int maxOrd) throws IOException

Re: [PR] Add HnswGraphBuilder.getCompletedGraph() and record completed state [lucene]

2024-07-10 Thread via GitHub
benwtrent commented on code in PR #13561: URL: https://github.com/apache/lucene/pull/13561#discussion_r1672957009 ## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphBuilder.java: ## @@ -156,14 +157,20 @@ public OnHeapHnswGraph build(int maxOrd) throws IOException {

Re: [PR] Add HnswGraphBuilder.getCompletedGraph() and record completed state [lucene]

2024-07-10 Thread via GitHub
msokolov commented on code in PR #13561: URL: https://github.com/apache/lucene/pull/13561#discussion_r1672846447 ## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswConcurrentMergeBuilder.java: ## @@ -84,7 +88,8 @@ public OnHeapHnswGraph build(int maxOrd) throws IOException

Re: [PR] Add HnswGraphBuilder.getCompletedGraph() and record completed state [lucene]

2024-07-10 Thread via GitHub
benwtrent commented on code in PR #13561: URL: https://github.com/apache/lucene/pull/13561#discussion_r1672837709 ## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswConcurrentMergeBuilder.java: ## @@ -84,7 +88,8 @@ public OnHeapHnswGraph build(int maxOrd) throws IOException

Re: [PR] Feature/scalar quantized off heap scoring [lucene]

2024-07-10 Thread via GitHub
benwtrent commented on PR #13497: URL: https://github.com/apache/lucene/pull/13497#issuecomment-2221274152 To verify it wasn't some weird artifact in my code, I slightly changed it to where my execution path always reads the vectors on-heap and then wraps them in a memorysegment. Now JDK22

[PR] Add HnswGraphBuilder.getCompletedGraph() and record completed state [lucene]

2024-07-10 Thread via GitHub
msokolov opened a new pull request, #13561: URL: https://github.com/apache/lucene/pull/13561 See https://github.com/apache/lucene/issues/12627 for context

Re: [PR] Feature/scalar quantized off heap scoring [lucene]

2024-07-10 Thread via GitHub
benwtrent commented on PR #13497: URL: https://github.com/apache/lucene/pull/13497#issuecomment-2221250315 @ChrisHegarty have you seen a significant performance regression on MemorySegments & JDK22? Doing some testing, I updated my performance testing for this PR to use JDK22 and now

Re: [PR] SparseFixedBitSet#firstDoc: reduce number of `indices` iterations for a bit set that is not fully built yet. [lucene]

2024-07-10 Thread via GitHub
msokolov commented on PR #13559: URL: https://github.com/apache/lucene/pull/13559#issuecomment-2221245332 I wonder if `DocIdSetBuilder` would help?

Re: [I] HnwsGraph creates disconnected components [lucene]

2024-07-10 Thread via GitHub
msokolov commented on issue #12627: URL: https://github.com/apache/lucene/issues/12627#issuecomment-2221207888 I'd like to take a stab at the "second pass" idea for patching up disconnected graph components. As a first step I think we ought to add state to the `HnswGraphBuilder` in order to

Re: [PR] Group memory arenas by segment to reduce costly `Arena.close()` [lucene]

2024-07-10 Thread via GitHub
uschindler commented on code in PR #13555: URL: https://github.com/apache/lucene/pull/13555#discussion_r1672768505 ## lucene/core/src/java21/org/apache/lucene/store/GroupedArena.java: ## @@ -0,0 +1,212 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more

Re: [PR] Feature/scalar quantized off heap scoring [lucene]

2024-07-10 Thread via GitHub
benwtrent commented on PR #13497: URL: https://github.com/apache/lucene/pull/13497#issuecomment-2221176325 Ok, I double checked, and indeed, half-byte is way slower when reading directly from memory segments instead of reading on heap. [memsegment_vs_baseline.zip](https://github.com/use

Re: [I] WordBreakSpellChecker.generateBreakUpSuggestions() should do breadth first search [lucene]

2024-07-10 Thread via GitHub
hossman closed issue #12100: WordBreakSpellChecker.generateBreakUpSuggestions() should do breadth first search URL: https://github.com/apache/lucene/issues/12100

Re: [PR] WIP: draft of intra segment concurrency [lucene]

2024-07-10 Thread via GitHub
javanna commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1672757642 ## lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java: ## @@ -328,42 +336,65 @@ protected LeafSlice[] slices(List leaves) { /** Static method to segr

Re: [PR] Minor cleanup in some Facet tests [lucene]

2024-07-10 Thread via GitHub
stefanvodita commented on PR #13489: URL: https://github.com/apache/lucene/pull/13489#issuecomment-2221008091 I went ahead and merged since this PR had been pending for a few weeks. Thank you @slow-J for your contribution and @mikemccand for reviewing!

Re: [I] NRT add configurable commitData for Custom security verification [lucene]

2024-07-10 Thread via GitHub
benwtrent commented on issue #13044: URL: https://github.com/apache/lucene/issues/13044#issuecomment-2220993446 I see you have opened a PR to add this with very little context or description of the use case. Do you mind further describing what you are trying to achieve and why?

Re: [PR] Nrt snapshot 9x [lucene]

2024-07-10 Thread via GitHub
benwtrent commented on PR #13533: URL: https://github.com/apache/lucene/pull/13533#issuecomment-2220988485 @dianjifzm I went ahead and closed this PR. I am guessing this is a port forward of the other PR which also has no description. Do you mind adding some context directly in the P

Re: [PR] Nrt snapshot 9x [lucene]

2024-07-10 Thread via GitHub
benwtrent closed pull request #13533: Nrt snapshot 9x URL: https://github.com/apache/lucene/pull/13533

Re: [PR] Minor cleanup in some Facet tests [lucene]

2024-07-10 Thread via GitHub
stefanvodita merged PR #13489: URL: https://github.com/apache/lucene/pull/13489

Re: [PR] WIP: draft of intra segment concurrency [lucene]

2024-07-10 Thread via GitHub
shubhamvishu commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1672505812 ## lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java: ## @@ -328,42 +336,65 @@ protected LeafSlice[] slices(List leaves) { /** Static method to

Re: [PR] Replace AtomicLong with LongAdder in HitsThresholdChecker [lucene]

2024-07-10 Thread via GitHub
benwtrent commented on PR #13546: URL: https://github.com/apache/lucene/pull/13546#issuecomment-2220694253 @shubhamvishu I do not know offhand which benchmarks should be run.

Re: [PR] Reduce heap usage for knn index writers [lucene]

2024-07-10 Thread via GitHub
benwtrent merged PR #13538: URL: https://github.com/apache/lucene/pull/13538

Re: [PR] [9.x] Use a confined Arena for IOContext.READONCE (#13535) [lucene]

2024-07-10 Thread via GitHub
ChrisHegarty merged PR #13560: URL: https://github.com/apache/lucene/pull/13560

Re: [PR] Use `IndexInput#prefetch` for terms dictionary lookups. [lucene]

2024-07-10 Thread via GitHub
jpountz merged PR #13359: URL: https://github.com/apache/lucene/pull/13359

Re: [I] Pruning of estimating the point value count since BooleanScorerSupplier [lucene]

2024-07-10 Thread via GitHub
kkewwei commented on issue #13554: URL: https://github.com/apache/lucene/issues/13554#issuecomment-2220514317 @jpountz, thank you for your reply. I will run a benchmark if it's useful.

Re: [I] Pruning of estimating the point value count since BooleanScorerSupplier [lucene]

2024-07-10 Thread via GitHub
kkewwei commented on issue #13554: URL: https://github.com/apache/lucene/issues/13554#issuecomment-2220507498 @jpountz, thank you for your reply.

Re: [PR] [9.x] Use a confined Arena for IOContext.READONCE (#13535) [lucene]

2024-07-10 Thread via GitHub
ChrisHegarty commented on code in PR #13560: URL: https://github.com/apache/lucene/pull/13560#discussion_r1672214121 ## lucene/core/src/test/org/apache/lucene/store/TestMMapDirectory.java: ## @@ -141,4 +146,57 @@ public void testWithRandom() throws Exception { } }

Re: [PR] Replace AtomicLong with LongAdder in HitsThresholdChecker [lucene]

2024-07-10 Thread via GitHub
shubhamvishu commented on PR #13546: URL: https://github.com/apache/lucene/pull/13546#issuecomment-222032 Makes sense @benwtrent! For `BufferedUpdatesStream`, as it's on the indexing side, should we check indexing time rather than the regular luceneutil benchmarks that check QPS?

Re: [PR] [9.x] Use a confined Arena for IOContext.READONCE (#13535) [lucene]

2024-07-10 Thread via GitHub
uschindler commented on code in PR #13560: URL: https://github.com/apache/lucene/pull/13560#discussion_r1672135733 ## lucene/core/src/test/org/apache/lucene/store/TestMMapDirectory.java: ## @@ -141,4 +146,57 @@ public void testWithRandom() throws Exception { } } }

Re: [PR] GITHUB#13175: Stop double-checking priority queue inserts in some FacetCount classes [lucene]

2024-07-10 Thread via GitHub
slow-J commented on PR #13488: URL: https://github.com/apache/lucene/pull/13488#issuecomment-2220297663 Thanks @mikemccand I think the diff for 9.12 looks good.

Re: [PR] GITHUB#13175: Stop double-checking priority queue inserts in some FacetCount classes [lucene]

2024-07-10 Thread via GitHub
mikemccand commented on PR #13488: URL: https://github.com/apache/lucene/pull/13488#issuecomment-2220286531 Thanks @slow-J -- I just backported to 9.12 as well. I had to resolve a few conflicts, maybe have a peek and see if I did it correctly?

[PR] SparseFixedBitSet#firstDoc: reduce number of `indices` iterations for a bit set that is not fully built yet. [lucene]

2024-07-10 Thread via GitHub
epotyom opened a new pull request, #13559: URL: https://github.com/apache/lucene/pull/13559 In SparseFixedBitSet.firstDoc, instead of iterating through the entire indices array until a non-zero value is found, keep track of the max updated index. Use case where it improves performance: 1.
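
For readers skimming the archive, the idea in the snippet above can be sketched roughly as follows. This is a simplified illustration only, not the code from PR #13559 and not Lucene's actual `SparseFixedBitSet`: the class `SparseBlockIndex`, the field `maxUpdatedBlock`, and the method `firstNonEmptyBlock` are hypothetical names chosen for the example. It assumes one `long` per block, where a non-zero word means the block holds at least one set bit.

```java
/** Hypothetical, simplified stand-in for a sparse bit set; NOT Lucene's SparseFixedBitSet. */
final class SparseBlockIndex {
  // One word per block; a non-zero word means the block contains set bits.
  private final long[] indices;
  // Highest block index written so far, or -1 if nothing has been set yet (assumed bookkeeping).
  private int maxUpdatedBlock = -1;

  SparseBlockIndex(int numBlocks) {
    this.indices = new long[numBlocks];
  }

  /** Sets one bit (0..63) in the given block and remembers the highest block touched. */
  void set(int block, int bit) {
    indices[block] |= 1L << bit;
    if (block > maxUpdatedBlock) {
      maxUpdatedBlock = block;
    }
  }

  /**
   * Returns the first non-empty block at or after {@code from}, or -1 if there is none.
   * The scan stops at maxUpdatedBlock instead of indices.length, so a mostly empty,
   * partially built set does not pay for walking the whole array.
   */
  int firstNonEmptyBlock(int from) {
    for (int i = Math.max(from, 0); i <= maxUpdatedBlock; i++) {
      if (indices[i] != 0) {
        return i;
      }
    }
    return -1;
  }
}
```

With this bookkeeping, a lookup on a set that has only touched its first few blocks inspects those blocks only, at the cost of one extra comparison per `set` call.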

[PR] Fix testAddDocumentOnDiskFull to handle IllegalStateException from IndexWriter#close [lucene]

2024-07-10 Thread via GitHub
easyice opened a new pull request, #13558: URL: https://github.com/apache/lucene/pull/13558 This issue is similar to https://github.com/apache/lucene/issues/11755, but it occurs in `IndexWriter#close` and also reproduces about half of the time. ``` java.lang.I

Re: [PR] QueryRescorer: Use original order by default for same-score items rather than sorting by docId [lucene]

2024-07-10 Thread via GitHub
Willdotwhite commented on PR #13510: URL: https://github.com/apache/lucene/pull/13510#issuecomment-2219974712 Morning @andywebb1975 and @jpountz - I'm a bit late to the discussion, but I'm interested to be involved! I'd be happy to look into this with Andy if the rewrite is the way to

Re: [PR] Use a confined Arena for IOContext.READONCE [lucene]

2024-07-10 Thread via GitHub
ChrisHegarty merged PR #13535: URL: https://github.com/apache/lucene/pull/13535

Re: [PR] Check whether liveDoc is null out of loop in Weight.scoreAll [lucene]

2024-07-10 Thread via GitHub
vsop-479 commented on PR #13557: URL: https://github.com/apache/lucene/pull/13557#issuecomment-2219865957 > Can you check if this helps with luceneutil on wikimedium10m or wikibigall? Sure, I will do that soon.

Re: [PR] Check whether liveDoc is null out of loop in Weight.scoreAll [lucene]

2024-07-10 Thread via GitHub
jpountz commented on PR #13557: URL: https://github.com/apache/lucene/pull/13557#issuecomment-2219848841 Can you check if this helps with luceneutil on wikimedium10m or wikibigall?

Re: [PR] Lookup next when current doc is deleted in PerThreadPKLookup.lookup [lucene]

2024-07-10 Thread via GitHub
jpountz merged PR #13556: URL: https://github.com/apache/lucene/pull/13556

Re: [PR] Lookup next when current doc is deleted in PerThreadPKLookup.lookup [lucene]

2024-07-10 Thread via GitHub
vsop-479 commented on code in PR #13556: URL: https://github.com/apache/lucene/pull/13556#discussion_r1671812467 ## lucene/core/src/test/org/apache/lucene/index/TestTermsEnum.java: ## @@ -998,6 +999,43 @@ public void testCommonPrefixTerms() throws Exception { d.close();

Re: [I] Merge on Commit: No merges if new data is flushed (but not committed) [lucene]

2024-07-10 Thread via GitHub
jpountz commented on issue #13537: URL: https://github.com/apache/lucene/issues/13537#issuecomment-2219745791 What version are you using? We fixed a similar problem in version 9.9, I wonder if the problem that you are reporting is the same one or a new one: https://github.com/apache/lucene/

Re: [I] Pruning of estimating the point value count since BooleanScorerSupplier [lucene]

2024-07-10 Thread via GitHub
jpountz commented on issue #13554: URL: https://github.com/apache/lucene/issues/13554#issuecomment-2219711472 The idea makes sense to me, but I worry that it wouldn't look good API-wise. I also imagine that the gains would be lower than in #13199 since `Weight#scorerSupplier` is called one