Re: [PR] Add IndexInput isLoaded [lucene]

2024-11-18 Thread via GitHub
ChrisHegarty commented on PR #13998: URL: https://github.com/apache/lucene/pull/13998#issuecomment-2482415401 > @ChrisHegarty this will be a very useful thing. Indeed. > Can we also figure out how much data is loaded with this API? So let's say an IndexInput is 30GB and only 10G

Re: [PR] Introduces IndexInput#updateReadAdvice to change the ReadAdvice while merging vectors [lucene]

2024-11-18 Thread via GitHub
shatejas commented on PR #13985: URL: https://github.com/apache/lucene/pull/13985#issuecomment-2484035773 ### Benchmarks Setup 1 - Opensearch cluster Ran with [opensearch benchmarks](https://github.com/opensearch-project/opensearch-benchmark) Total data nodes - 3

Re: [I] KnnFloatVectorQuery#toString should show the filter [lucene]

2024-11-18 Thread via GitHub
benwtrent commented on issue #13983: URL: https://github.com/apache/lucene/issues/13983#issuecomment-2484000904 This is now fixed: https://github.com/apache/lucene/pull/13990 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [I] KnnFloatVectorQuery#toString should show the filter [lucene]

2024-11-18 Thread via GitHub
benwtrent closed issue #13983: KnnFloatVectorQuery#toString should show the filter URL: https://github.com/apache/lucene/issues/13983

Re: [PR] Adding filter to the toString() method of KnnFloatVectorQuery [lucene]

2024-11-18 Thread via GitHub
benwtrent merged PR #13990: URL: https://github.com/apache/lucene/pull/13990

Re: [PR] Avoid allocating liveDocs for no soft-deletes (#13895) (#13903) [lucene]

2024-11-18 Thread via GitHub
dnhatn merged PR #14001: URL: https://github.com/apache/lucene/pull/14001

Re: [PR] Parse escaped brackets and spaces in range queries [lucene]

2024-11-18 Thread via GitHub
benchaplin commented on PR #13887: URL: https://github.com/apache/lucene/pull/13887#issuecomment-2483785234 @dweiss you mentioned in my previous PR that I should do some randomized testing. I did, which helped me find the "Addition of "\\" in the negation set" requirement. However I just tr

Re: [PR] Only consider clauses whose cost is less than the lead cost to compute block boundaries in WANDScorer. [lucene]

2024-11-18 Thread via GitHub
jpountz commented on PR #14000: URL: https://github.com/apache/lucene/pull/14000#issuecomment-2483629696 Will do, thanks @benwtrent!

Re: [PR] Update lastDoc in ScoreCachingWrappingScorer [lucene]

2024-11-18 Thread via GitHub
jpountz commented on code in PR #13987: URL: https://github.com/apache/lucene/pull/13987#discussion_r1846948010 ## lucene/core/src/test/org/apache/lucene/search/TestScoreCachingWrappingScorer.java: ## @@ -157,4 +157,40 @@ public void testGetScores() throws Exception { ir.cl

Re: [PR] Adding filter to the toString() method of KnnFloatVectorQuery [lucene]

2024-11-18 Thread via GitHub
viswanathk commented on PR #13990: URL: https://github.com/apache/lucene/pull/13990#issuecomment-2483545431 > I think a `CHANGES` entry is in order. This seems like a nice little bug fix to aid folks in debugging issues. > > @viswanathk once you add the changes entry, I can merge and

Re: [PR] Add IndexInput isLoaded [lucene]

2024-11-18 Thread via GitHub
rmuir commented on PR #13998: URL: https://github.com/apache/lucene/pull/13998#issuecomment-2483336503 Also, for debugging these issues, you can get this information outside of Java using `fincore` from util-linux, which is probably on any machine: ``` myindexdir$ fincore --output

Re: [PR] Add IndexInput isLoaded [lucene]

2024-11-18 Thread via GitHub
rmuir commented on PR #13998: URL: https://github.com/apache/lucene/pull/13998#issuecomment-2483230262 You would need to call `mincore` or something similar yourself. I can't remember, but the native access may already be plumbed. For non-mmapped I/O you can do something similar with syscalls such as `

Re: [PR] Add IndexInput isLoaded [lucene]

2024-11-18 Thread via GitHub
rmuir commented on PR #13998: URL: https://github.com/apache/lucene/pull/13998#issuecomment-2483324513 > Yeah, we can look at how to call `mincore`, and it might not be that much of a lift with the existing plumbing. Maybe something we can look at as a follow-up? I'm really trying to get to a

Re: [PR] Add IndexInput isLoaded [lucene]

2024-11-18 Thread via GitHub
ChrisHegarty commented on code in PR #13998: URL: https://github.com/apache/lucene/pull/13998#discussion_r1846722542 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -406,6 +406,14 @@ void advise(long offset, long length, IOConsumer advice)

Re: [PR] Add IndexInput isLoaded [lucene]

2024-11-18 Thread via GitHub
ChrisHegarty commented on PR #13998: URL: https://github.com/apache/lucene/pull/13998#issuecomment-2483288393 Yeah, we can look at how to call `mincore`, and it might not be that much of a lift with the existing plumbing. Maybe something we can look at as a follow-up? I'm really trying to ge

Re: [PR] Add IndexInput isLoaded [lucene]

2024-11-18 Thread via GitHub
rmuir commented on code in PR #13998: URL: https://github.com/apache/lucene/pull/13998#discussion_r1846706623 ## lucene/core/src/java21/org/apache/lucene/store/MemorySegmentIndexInput.java: ## @@ -406,6 +406,14 @@ void advise(long offset, long length, IOConsumer advice) throws

Re: [I] Unable to Tessellate shape for a valid Polygon according to GDAL/OGR and PostGIS [lucene]

2024-11-18 Thread via GitHub
garaud commented on issue #13841: URL: https://github.com/apache/lucene/issues/13841#issuecomment-2482280736 Thank you very much @iverase for your work!

Re: [PR] Only consider clauses whose cost is less than the lead cost to compute block boundaries in WANDScorer. [lucene]

2024-11-18 Thread via GitHub
jpountz commented on PR #14000: URL: https://github.com/apache/lucene/pull/14000#issuecomment-2482613789 Here are the luceneutil results for filtering tasks: ``` Task QPS baseline StdDev QPS my_modified_version StdDev Pct diff p-va

[PR] Only consider clauses whose cost is less than the lead cost to compute block boundaries in WANDScorer. [lucene]

2024-11-18 Thread via GitHub
jpountz opened a new pull request, #14000: URL: https://github.com/apache/lucene/pull/14000 WANDScorer implements block-max WAND and needs to recompute score upper bounds whenever it moves to a different block. Thus it's important for these blocks to be large enough to avoid re-computing sc

Re: [PR] Speed up top-k retrieval of filtered disjunctions a bit. [lucene]

2024-11-18 Thread via GitHub
jpountz commented on PR #13996: URL: https://github.com/apache/lucene/pull/13996#issuecomment-2482516359 I ran with more tasks to confirm it's generally helpful: ``` Task QPS baseline StdDev QPS my_modified_version StdDev Pct diff

Re: [PR] Speed up top-k retrieval on filtered conjunctions. [lucene]

2024-11-18 Thread via GitHub
jpountz commented on PR #13994: URL: https://github.com/apache/lucene/pull/13994#issuecomment-2482217169 Thanks @benwtrent !