Re: [PR] Multireader Support in Searcher Manager [lucene]

2024-11-07 Thread via GitHub
vigyasharma commented on PR #13976: URL: https://github.com/apache/lucene/pull/13976#issuecomment-2463936596 > any `IndexReader` should work as long as it can `openIfChanged` on itself. Does `MultiReader` implement `openIfChanged()`? I see a check in `SearcherManager#refreshIfNeeded(

Re: [I] remove refs to people.apache.org/home.apache.org in build [lucene]

2024-11-07 Thread via GitHub
iamsanjay commented on issue #13647: URL: https://github.com/apache/lucene/issues/13647#issuecomment-2463893766 I was trying to set up [luceneutil](https://github.com/mikemccand/luceneutil) and ran the script: ``` python3 src/python/setup.py -download ``` It failed on one URL where

Re: [PR] Add a Better Binary Quantizer (RaBitQ) format for dense vectors [lucene]

2024-11-07 Thread via GitHub
ShashwatShivam commented on PR #13651: URL: https://github.com/apache/lucene/pull/13651#issuecomment-2463415182 @benwtrent makes sense, I wasn't accounting for the fact that the floating vectors are being stored too. I guess I should have instead asked how to reproduce the 'memory required'

Re: [PR] Remove vector values copy() methods, moving IndexInput.clone() and temp storage into lower-level interfaces [lucene]

2024-11-07 Thread via GitHub
msokolov commented on PR #13872: URL: https://github.com/apache/lucene/pull/13872#issuecomment-2463077893 @ChrisHegarty I think it's expected since we would previously do the allocation once per KnnVectorValues, but now we are doing it once per RandomVectorScorer. I'm working on adding Clos

Re: [PR] DocValuesSkipper implementation in IndexSortSorted [lucene]

2024-11-07 Thread via GitHub
iverase commented on code in PR #13886: URL: https://github.com/apache/lucene/pull/13886#discussion_r1833247096 ## lucene/core/src/java/org/apache/lucene/search/IndexSortSortedNumericDocValuesRangeQuery.java: ## @@ -397,106 +413,80 @@ private boolean matchAll(PointValues points,

Re: [PR] DocValuesSkipper implementation in IndexSortSorted [lucene]

2024-11-07 Thread via GitHub
gsmiller commented on code in PR #13886: URL: https://github.com/apache/lucene/pull/13886#discussion_r1833153027 ## lucene/core/src/java/org/apache/lucene/search/IndexSortSortedNumericDocValuesRangeQuery.java: ## @@ -397,106 +413,80 @@ private boolean matchAll(PointValues points

Re: [PR] Remove vector values copy() methods, moving IndexInput.clone() and temp storage into lower-level interfaces [lucene]

2024-11-07 Thread via GitHub
msokolov commented on PR #13872: URL: https://github.com/apache/lucene/pull/13872#issuecomment-2462152791 Finally got back to this and fixed the aliasing that was happening. I ran some perf tests and don't see significant variance. Still, it's clear we must be doing a lot more allocations h

Re: [PR] Add a Better Binary Quantizer (RaBitQ) format for dense vectors [lucene]

2024-11-07 Thread via GitHub
benwtrent commented on PR #13651: URL: https://github.com/apache/lucene/pull/13651#issuecomment-2462723469 @ShashwatShivam why do you think the index size (total size of all the files) should be smaller? We store the binary quantized vectors and the floating point vectors. So, I woul

Re: [PR] Remove vector values copy() methods, moving IndexInput.clone() and temp storage into lower-level interfaces [lucene]

2024-11-07 Thread via GitHub
msokolov commented on PR #13872: URL: https://github.com/apache/lucene/pull/13872#issuecomment-2462158029 hmm that commit is kind of messed up. Maybe I missed a rebase on main somewhere? I will try to clean up here, but it might require a force-push or a new PR, egad -- This is an autom

Re: [PR] Remove vector values copy() methods, moving IndexInput.clone() and temp storage into lower-level interfaces [lucene]

2024-11-07 Thread via GitHub
ChrisHegarty commented on PR #13872: URL: https://github.com/apache/lucene/pull/13872#issuecomment-2462645675 thanks @msokolov I'll take another look at why the off-heap scorer is allocating so much.

Re: [PR] Add a Better Binary Quantizer (RaBitQ) format for dense vectors [lucene]

2024-11-07 Thread via GitHub
ShashwatShivam commented on PR #13651: URL: https://github.com/apache/lucene/pull/13651#issuecomment-2462601593 @benwtrent thanks for giving the link to the testing script, it works! One question - the index size it reports is larger than the HNSW index size. For example, I was working with a C

Re: [PR] Remove vector values copy() methods, moving IndexInput.clone() and temp storage into lower-level interfaces [lucene]

2024-11-07 Thread via GitHub
msokolov commented on PR #13872: URL: https://github.com/apache/lucene/pull/13872#issuecomment-2462587122 ## heap comparison Here's the output from luceneutil's JFR heap usage summarizer. Clearly a huge amount more allocations for this change. ### float32 mainline ``` PERCENT

Re: [PR] Tessellator: Improve logic when two holes share the same vertex with the polygon [lucene]

2024-11-07 Thread via GitHub
iverase commented on code in PR #13980: URL: https://github.com/apache/lucene/pull/13980#discussion_r1832514627 ## lucene/core/src/test/org/apache/lucene/geo/TestTessellator.java: ## @@ -928,12 +848,42 @@ public void testComplexPolygon55() throws Exception { public void testC

Re: [PR] Tessellator: Improve logic when two holes share the same vertex with the polygon [lucene]

2024-11-07 Thread via GitHub
iverase commented on code in PR #13980: URL: https://github.com/apache/lucene/pull/13980#discussion_r1832514220 ## lucene/core/src/java/org/apache/lucene/geo/Tessellator.java: ## @@ -390,12 +377,108 @@ private static final void eliminateHole( } } + /** Choose a common

Re: [PR] PR 13757 follow-up: add missing with-discountOverlaps Similarity constructor variants, CHANGES.txt entries (#13845) [lucene]

2024-11-07 Thread via GitHub
cpoerschke commented on code in PR #13891: URL: https://github.com/apache/lucene/pull/13891#discussion_r1832502452 ## lucene/CHANGES.txt: ## @@ -47,6 +52,9 @@ API Changes the entire segment should be scored. Subclasses that override the method should instead override its rep

Re: [PR] Tessellator: Improve logic when two holes share the same vertex with the polygon [lucene]

2024-11-07 Thread via GitHub
craigtaverner commented on code in PR #13980: URL: https://github.com/apache/lucene/pull/13980#discussion_r1832435160 ## lucene/core/src/java/org/apache/lucene/geo/Tessellator.java: ## @@ -390,12 +377,108 @@ private static final void eliminateHole( } } + /** Choose a

Re: [PR] DocValuesSkipper implementation in IndexSortSorted [lucene]

2024-11-07 Thread via GitHub
iverase commented on code in PR #13886: URL: https://github.com/apache/lucene/pull/13886#discussion_r1832348844 ## lucene/core/src/java/org/apache/lucene/search/IndexSortSortedNumericDocValuesRangeQuery.java: ## @@ -397,106 +413,80 @@ private boolean matchAll(PointValues points,

[PR] Tessellator: Improve logic when two holes share the same vertex with the polygon [lucene]

2024-11-07 Thread via GitHub
iverase opened a new pull request, #13980: URL: https://github.com/apache/lucene/pull/13980 One of the situations in which David Eberly's algorithm for finding a bridge between a hole and the outer polygon was failing was the case of a polygon sharing a vertex with the outer polygon. To fix that