Re: [PR] Only check liveDocs is null one time in FreqProxTermsWriter.applyDeletes [lucene]

2024-07-05 Thread via GitHub
vsop-479 commented on PR #13506: URL: https://github.com/apache/lucene/pull/13506#issuecomment-2210318818 @mikemccand Please take a look when you get a chance. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the UR

Re: [PR] Use a confined Arena for IOContext.READONCE [lucene]

2024-07-05 Thread via GitHub
uschindler commented on PR #13535: URL: https://github.com/apache/lucene/pull/13535#issuecomment-2210600762 I am not really sure how we should proceed. The `new SegmentReader()` which is closed a few lines below looks like a valid use case for READ_ONLY, but SegmentReader creates clones. -

Re: [PR] WIP: draft of intra segment concurrency [lucene]

2024-07-05 Thread via GitHub
shubhamvishu commented on PR #13542: URL: https://github.com/apache/lucene/pull/13542#issuecomment-2210786696 Nice to see this PR @javanna! I know it might be too early to ask (as changes are not yet consolidated), but curious if we have any early benchmarking numbers when intra concurrency

Re: [PR] WIP: draft of intra segment concurrency [lucene]

2024-07-05 Thread via GitHub
mikemccand commented on PR #13542: URL: https://github.com/apache/lucene/pull/13542#issuecomment-2210815304 Thank you for tackling this @javanna! This is long overdue ... it's crazy that a fully optimized (`forceMerge(1)`) index removes all intra-query concurrency. > IndexSearcher#c

Re: [PR] WIP: draft of intra segment concurrency [lucene]

2024-07-05 Thread via GitHub
mikemccand commented on code in PR #13542: URL: https://github.com/apache/lucene/pull/13542#discussion_r1666772840 ## lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java: ## @@ -328,42 +336,65 @@ protected LeafSlice[] slices(List leaves) { /** Static method to s

Re: [PR] fixing a wrong check of RollingCharBuffer [lucene]

2024-07-05 Thread via GitHub
github-actions[bot] commented on PR #13512: URL: https://github.com/apache/lucene/pull/13512#issuecomment-2211531640 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] Multi-Vector support for HNSW search [lucene]

2024-07-05 Thread via GitHub
vigyasharma commented on PR #13525: URL: https://github.com/apache/lucene/pull/13525#issuecomment-2211598044 > As for adding new information to the FieldInfo, another valid option is making it configurable directly on the format and not update fieldinfo. We create multi-vector FieldWr

Re: [PR] Multi-Vector support for HNSW search [lucene]

2024-07-05 Thread via GitHub
vigyasharma commented on code in PR #13525: URL: https://github.com/apache/lucene/pull/13525#discussion_r1667235770 ## lucene/core/src/java/org/apache/lucene/index/FieldInfo.java: ## @@ -92,6 +97,8 @@ public FieldInfo( int vectorDimension, VectorEncoding vectorEnco

[PR] Replace AtomicLong with LongAdder in HitsThresholdChecker [lucene]

2024-07-05 Thread via GitHub
original-brownbear opened a new pull request, #13546: URL: https://github.com/apache/lucene/pull/13546 The value for the global count is incremented a lot more than it is read, the space overhead of LongAdder seems irrelevant => let's use LongAdder. The performance gain from using it is the
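For readers unfamiliar with the trade-off mentioned in this PR description, here is a minimal sketch of a write-heavy, read-rarely counter using both `java.util.concurrent.atomic` classes. The class `HitCounterSketch` and its methods are hypothetical illustrations, not code from `HitsThresholdChecker` or the PR; only `AtomicLong` and `LongAdder` themselves are real JDK classes.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical illustration only; not taken from Lucene.
public class HitCounterSketch {

    // All threads update one shared value, so they contend on the same memory location.
    static long countWithAtomicLong(int threads, int perThread) {
        AtomicLong counter = new AtomicLong();
        runAll(threads, () -> {
            for (int i = 0; i < perThread; i++) {
                counter.incrementAndGet();
            }
        });
        return counter.get();
    }

    // LongAdder may spread updates over several internal cells under contention;
    // sum() adds the cells up when the total is actually needed.
    static long countWithLongAdder(int threads, int perThread) {
        LongAdder counter = new LongAdder();
        runAll(threads, () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        });
        return counter.sum();
    }

    // Starts the given number of threads on the same task and waits for all of them.
    private static void runAll(int threads, Runnable task) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(task);
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("AtomicLong total: " + countWithAtomicLong(4, 100_000));
        System.out.println("LongAdder total:  " + countWithLongAdder(4, 100_000));
    }
}
```

Both variants produce the same total once all writers have finished. The difference is in how increments behave under contention and in the extra memory `LongAdder` may allocate, which is the overhead the description calls irrelevant here.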

Re: [PR] Replace AtomicLong with LongAdder in HitsThresholdChecker [lucene]

2024-07-05 Thread via GitHub
original-brownbear commented on PR #13546: URL: https://github.com/apache/lucene/pull/13546#issuecomment-2211625398 The results are easily explained by the code change, the top methods in profiling go from: ``` 10.79%129797 org.apache.lucene.facet.sortedset.SortedSe