Re: [PR] Performance improvements to use RWLock to access LRUQueryCache [lucene]

2024-05-10 Thread via GitHub
boicehuang commented on code in PR #13306: URL: https://github.com/apache/lucene/pull/13306#discussion_r1584480939 ## lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java: ## @@ -628,7 +632,7 @@ private class LeafCache implements Accountable { LeafCache(Object

Re: [PR] Binary search all terms. [lucene]

2024-05-10 Thread via GitHub
github-actions[bot] commented on PR #13192: URL: https://github.com/apache/lucene/pull/13192#issuecomment-2105402311 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] Reduce duplication in taxonomy facets; always do counts [lucene]

2024-05-10 Thread via GitHub
stefanvodita commented on PR #12966: URL: https://github.com/apache/lucene/pull/12966#issuecomment-2105340184 I was just working on it today actually and finally got it in shape: #13358. Sorry it took so long! -- This is an automated message from the Apache Git Service. To respond to the

Re: [PR] Backport to 9x: Reduce duplication in taxonomy facets; always do counts #12966x [lucene]

2024-05-10 Thread via GitHub
stefanvodita commented on PR #13358: URL: https://github.com/apache/lucene/pull/13358#issuecomment-2105338901 Despite the "annoying" bits in the description, I don't expect this backport to be controversial, but reviews are welcome! I plan to wait over the weekend and then merge. --

Re: [PR] Fix default flat vector scorer supplier sharing backing array [lucene]

2024-05-10 Thread via GitHub
ChrisHegarty merged PR #13355: URL: https://github.com/apache/lucene/pull/13355 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucen

[I] Enhance DisjunctionMaxQuery explanation to include details in case there was no match [lucene]

2024-05-10 Thread via GitHub
AndreyBozhko opened a new issue, #13357: URL: https://github.com/apache/lucene/issues/13357 ### Description Currently, if the DisjunctionMaxQuery is used and the document did not match, the explanation from DisjunctionMaxWeight#explain only says `No matching clause` with no additiona

Re: [PR] gh-13147: use dense bit-encoding for frequent terms [lucene]

2024-05-10 Thread via GitHub
msokolov commented on PR #13153: URL: https://github.com/apache/lucene/pull/13153#issuecomment-2105035929 > +1 -- did that show any measurable performance change? Well, sort of -- I did index months to get some dense postings field and added "month tasks" by tacking on `+month:Mar` an

Re: [PR] gh-13147: use dense bit-encoding for frequent terms [lucene]

2024-05-10 Thread via GitHub
mikemccand commented on PR #13153: URL: https://github.com/apache/lucene/pull/13153#issuecomment-2104944120 > Perhaps if we index Month and Year as docs-only fields we would see some impact on queries with those as filters? +1 -- did that show any measurable performance change?

Re: [PR] Fix default flat vector scorer supplier sharing backing array [lucene]

2024-05-10 Thread via GitHub
ChrisHegarty commented on code in PR #13355: URL: https://github.com/apache/lucene/pull/13355#discussion_r1596969665 ## lucene/core/src/java/org/apache/lucene/codecs/hnsw/DefaultFlatVectorScorer.java: ## @@ -101,7 +102,9 @@ private ByteScoringSupplier( @Override publ

Re: [I] [DISCUSS] Identifying Gaps in Lucene’s Faceting [lucene]

2024-05-10 Thread via GitHub
mikemccand commented on issue #12553: URL: https://github.com/apache/lucene/issues/12553#issuecomment-2104917713 +1 to cross-fertilize between OpenSearch's strong aggregations and Lucene's mostly-limited-to-counting (?) facets. If we cross-fertilize carefully, Lucene could provide the stro

Re: [PR] Reduce duplication in taxonomy facets; always do counts [lucene]

2024-05-10 Thread via GitHub
mikemccand commented on PR #12966: URL: https://github.com/apache/lucene/pull/12966#issuecomment-2104896366 Now that #12408 was backported in https://github.com/apache/lucene/pull/13300 can we now backport this to 9.x? Or was it already done in an un-linked PR or so? Remembering to

Re: [PR] Advoid the use of ImpactsDISI when no minimum competitive score has been set [lucene]

2024-05-10 Thread via GitHub
jpountz merged PR #13343: URL: https://github.com/apache/lucene/pull/13343 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apa

Re: [PR] Fix default flat vector scorer supplier sharing backing array [lucene]

2024-05-10 Thread via GitHub
benwtrent commented on code in PR #13355: URL: https://github.com/apache/lucene/pull/13355#discussion_r1596942824 ## lucene/core/src/java/org/apache/lucene/codecs/hnsw/DefaultFlatVectorScorer.java: ## @@ -101,7 +102,9 @@ private ByteScoringSupplier( @Override public

[PR] Fix default flat vector scorer supplier sharing backing array [lucene]

2024-05-10 Thread via GitHub
ChrisHegarty opened a new pull request, #13355: URL: https://github.com/apache/lucene/pull/13355 This commit fixes an issue in the default flat vector scorer supplier whereby subsequent scorers created by the supplier can affect previously created scorers. The issue is that we're

Re: [PR] Reuse BitSet when there are deleted documents in the index instead of creating new BitSet [lucene]

2024-05-10 Thread via GitHub
mikemccand commented on PR #12857: URL: https://github.com/apache/lucene/pull/12857#issuecomment-2104850564 I don't fully understand this change, but it looks like it is stalled on proving it shows lower CPU and/or heap/GC load? Could we benchmark this change using luceneutil? It's a

Re: [PR] Avoid SegmentTermsEnumFrame reload block. [lucene]

2024-05-10 Thread via GitHub
mikemccand commented on PR #13253: URL: https://github.com/apache/lucene/pull/13253#issuecomment-2104801856 Thanks @vsop-479 I will try to re-engage here soon! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL ab

Re: [PR] Advoid the use of ImpactsDISI when no minimum competitive score has been set [lucene]

2024-05-10 Thread via GitHub
zhongshanhao commented on PR #13343: URL: https://github.com/apache/lucene/pull/13343#issuecomment-2104657380 @jpountz Yes. I ran the benchmark again with the latest version of the change. The benchmark on wikimediumall is as follows: ``` TaskQPS base

Re: [PR] Advoid the use of ImpactsDISI when no minimum competitive score has been set [lucene]

2024-05-10 Thread via GitHub
zhongshanhao commented on PR #13343: URL: https://github.com/apache/lucene/pull/13343#issuecomment-2104645203 @jpountz Yes. I ran the benchmark again with the latest version of the change. The benchmark on `wikimediumall` is as follows: -- This is an automated message from the Apache Git S

Re: [PR] Advoid the use of ImpactsDISI when no minimum competitive score has been set [lucene]

2024-05-10 Thread via GitHub
jpountz commented on PR #13343: URL: https://github.com/apache/lucene/pull/13343#issuecomment-2104599812 @zhongshanhao Are you still observing a speedup with the latest version of the change? I was planning on merging once you confirmed this. -- This is an automated message from the Apach

Re: [PR] Advoid the use of ImpactsDISI when no minimum competitive score has been set [lucene]

2024-05-10 Thread via GitHub
zhongshanhao commented on PR #13343: URL: https://github.com/apache/lucene/pull/13343#issuecomment-2104580780 @jpountz Can you help me merge the PR? I can't merge this PR because I don't have write access to this repository. :) -- This is an automated message from the Apache Git Serv

Re: [PR] Add IndexInput#prefetch. [lucene]

2024-05-10 Thread via GitHub
jpountz merged PR #13337: URL: https://github.com/apache/lucene/pull/13337 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apa

Re: [PR] Performance improvements to use RWLock to access LRUQueryCache [lucene]

2024-05-10 Thread via GitHub
benwtrent merged PR #13306: URL: https://github.com/apache/lucene/pull/13306 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.a

Re: [PR] Performance improvements to use RWLock to access LRUQueryCache [lucene]

2024-05-10 Thread via GitHub
boicehuang commented on PR #13306: URL: https://github.com/apache/lucene/pull/13306#issuecomment-2104395723 > I think this is ready for merging. I can do the merging, but won't back port to 9x until we see nightlies. They might catch something we missed. > > @boicehuang could you add