[GitHub] [lucene] jpountz commented on pull request #11840: GITHUB-11838 Add api to allow concurrent query rewrite

2022-10-18 Thread GitBox
jpountz commented on PR #11840: URL: https://github.com/apache/lucene/pull/11840#issuecomment-1282004123 > Of course in the case that somebody writes a subclass that only implements the IndexReader variant of rewrite it won't use the searcher. If this user subclass then rewrites subqueries

[GitHub] [lucene] jpountz commented on a diff in pull request #11840: GITHUB-11838 Add api to allow concurrent query rewrite

2022-10-18 Thread GitBox
jpountz commented on code in PR #11840: URL: https://github.com/apache/lucene/pull/11840#discussion_r997895966 ## lucene/highlighter/src/test/org/apache/lucene/search/vectorhighlight/TestFieldQuery.java: ## @@ -40,12 +41,23 @@ public class TestFieldQuery extends AbstractTestC

[GitHub] [lucene] jpountz commented on pull request #11832: Added static factory method for loading VectorValues

2022-10-18 Thread GitBox
jpountz commented on PR #11832: URL: https://github.com/apache/lucene/pull/11832#issuecomment-1282020859 @shubhamvishu I'm sorry that your work didn't result in a merged commit, but it feels like it would be better not to merge this change. Thank you again for looking into this! -- This

[GitHub] [lucene] jpountz closed pull request #11832: Added static factory method for loading VectorValues

2022-10-18 Thread GitBox
jpountz closed pull request #11832: Added static factory method for loading VectorValues URL: https://github.com/apache/lucene/pull/11832 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [lucene] jpountz commented on pull request #11722: Binary search the entries when all suffixes have the same length in a leaf block.

2022-10-18 Thread GitBox
jpountz commented on PR #11722: URL: https://github.com/apache/lucene/pull/11722#issuecomment-1282060420 @mikemccand You might want to have a look at this change since (I think) you are one of the people most familiar with the original code.

[GitHub] [lucene] jpountz merged pull request #11722: Binary search the entries when all suffixes have the same length in a leaf block.

2022-10-18 Thread GitBox
jpountz merged PR #11722: URL: https://github.com/apache/lucene/pull/11722

[GitHub] [lucene] donnerpeter opened a new pull request, #11859: hunspell: speed up GeneratingSuggester by not deserializing non-suggestible roots

2022-10-18 Thread GitBox
donnerpeter opened a new pull request, #11859: URL: https://github.com/apache/lucene/pull/11859 We discard entries with NOSUGGEST (and some other) flags anyway, so let's bail out of processing them at an earlier stage. This speeds up suggestions for relatively short German words by about

[GitHub] [lucene] jpountz commented on pull request #11722: Binary search the entries when all suffixes have the same length in a leaf block.

2022-10-18 Thread GitBox
jpountz commented on PR #11722: URL: https://github.com/apache/lucene/pull/11722#issuecomment-1282305919 I had to revert this change because of test failures, e.g. this seed reproduces on the main branch: ``` gradlew test --tests TestNumericDocValuesUpdates.testSortedIndex -Dtests

[GitHub] [lucene] chatman commented on issue #10342: Integer overflow in total count in grouping results [LUCENE-9302]

2022-10-18 Thread GitBox
chatman commented on issue #10342: URL: https://github.com/apache/lucene/issues/10342#issuecomment-1282339116 I plan to update this PR to merge against the main branch shortly.

[GitHub] [lucene] stefanvodita commented on pull request #11780: GH#11601: Add ability to compute reader states after refresh

2022-10-18 Thread GitBox
stefanvodita commented on PR #11780: URL: https://github.com/apache/lucene/pull/11780#issuecomment-1282371150 That makes sense. Maybe I'm not addressing the right problem. @gsmiller - as the issue's author, what do you think?

[GitHub] [lucene] harishankar-gopalan commented on issue #11354: Reuse HNSW graphs when merging segments? [LUCENE-10318]

2022-10-18 Thread GitBox
harishankar-gopalan commented on issue #11354: URL: https://github.com/apache/lucene/issues/11354#issuecomment-1282390590 Hi @jmazanec15, I have a quick question. How do segment merges currently work for the HNSW graph in Lucene? Is the graph reconstructed from scratch?

[GitHub] [lucene] msokolov commented on a diff in pull request #11852: Luke Webapp

2022-10-18 Thread GitBox
msokolov commented on code in PR #11852: URL: https://github.com/apache/lucene/pull/11852#discussion_r998344033 ## lucene/luke/src/java/org/apache/lucene/luke/app/web/LukeWebMain.java: ## @@ -17,31 +17,78 @@ package org.apache.lucene.luke.app.web; +import java.net.InetSocke

[GitHub] [lucene] benwtrent opened a new pull request, #11860: GITHUB-11830 Better optimize storage for vector connections

2022-10-18 Thread GitBox
benwtrent opened a new pull request, #11860: URL: https://github.com/apache/lucene/pull/11860 Vector search is much faster when the graph can fit in memory. Consequently, improvements in vector storage can translate to faster searches on larger graphs. One area of size reduction is n

[GitHub] [lucene] shahrs87 commented on pull request #907: LUCENE-10357 Ghost fields and postings/points

2022-10-18 Thread GitBox
shahrs87 commented on PR #907: URL: https://github.com/apache/lucene/pull/907#issuecomment-1282791420 @jpountz Can you please review this patch again? Thank you

[GitHub] [lucene] jtibshirani opened a new pull request, #11861: Fix Lucene94HnswVectorsFormat validation on large segments

2022-10-18 Thread GitBox
jtibshirani opened a new pull request, #11861: URL: https://github.com/apache/lucene/pull/11861 When reading large segments, the vectors format can fail with a validation error: ``` java.lang.IllegalStateException: Vector data length 3070061568 not matching size=999369 * dim=7

[GitHub] [lucene] jtibshirani commented on pull request #11861: Fix Lucene94HnswVectorsFormat validation on large segments

2022-10-18 Thread GitBox
jtibshirani commented on PR #11861: URL: https://github.com/apache/lucene/pull/11861#issuecomment-1282872442 As a note, this only touches the read codepath and has no effect on data format, so it's safe to fix the current codec directly. I tried to add a test but didn't see a good way

[GitHub] [lucene] iverase commented on pull request #11861: Fix Lucene94HnswVectorsFormat validation on large segments

2022-10-18 Thread GitBox
iverase commented on PR #11861: URL: https://github.com/apache/lucene/pull/11861#issuecomment-1282966111 I had to change this test not too long ago to index 4B points instead of 2B to trigger a bug as well. Maybe something like that as a Monster test might work for you? : https://git

[GitHub] [lucene] jtibshirani commented on pull request #11861: Fix Lucene94HnswVectorsFormat validation on large segments

2022-10-18 Thread GitBox
jtibshirani commented on PR #11861: URL: https://github.com/apache/lucene/pull/11861#issuecomment-1282994720 Thanks @iverase! Do the monster tests get run regularly (perhaps during nightly builds)?

[GitHub] [lucene] jtibshirani commented on a diff in pull request #11860: GITHUB-11830 Better optimize storage for vector connections

2022-10-18 Thread GitBox
jtibshirani commented on code in PR #11860: URL: https://github.com/apache/lucene/pull/11860#discussion_r998757995 ## lucene/core/src/java/org/apache/lucene/codecs/lucene95/Lucene95HnswVectorsReader.java: ## @@ -0,0 +1,505 @@ +/* + * Licensed to the Apache Software Foundation (A

[GitHub] [lucene] stevenschlansker commented on pull request #11822: PrimaryNode: add configurable timeout to waitForAllRemotesToClose

2022-10-18 Thread GitBox
stevenschlansker commented on PR #11822: URL: https://github.com/apache/lucene/pull/11822#issuecomment-1283082251 I updated this PR to rename the field to include `Ms`. I added test cases for both no timeout (0) and 1000ms. I verified the test fails (doesn't terminate) without the new c

[GitHub] [lucene] zhaih commented on a diff in pull request #11822: PrimaryNode: add configurable timeout to waitForAllRemotesToClose

2022-10-18 Thread GitBox
zhaih commented on code in PR #11822: URL: https://github.com/apache/lucene/pull/11822#discussion_r998806414 ## lucene/CHANGES.txt: ## @@ -44,6 +44,8 @@ New Features * LUCENE-10626 Hunspell: add tools to aid dictionary editing: analysis introspection, stem expansion and stem

[GitHub] [lucene] stevenschlansker commented on a diff in pull request #11822: PrimaryNode: add configurable timeout to waitForAllRemotesToClose

2022-10-18 Thread GitBox
stevenschlansker commented on code in PR #11822: URL: https://github.com/apache/lucene/pull/11822#discussion_r998813121 ## lucene/CHANGES.txt: ## @@ -44,6 +44,8 @@ New Features * LUCENE-10626 Hunspell: add tools to aid dictionary editing: analysis introspection, stem expansi

[GitHub] [lucene] zhaih merged pull request #11822: PrimaryNode: add configurable timeout to waitForAllRemotesToClose

2022-10-18 Thread GitBox
zhaih merged PR #11822: URL: https://github.com/apache/lucene/pull/11822

[GitHub] [lucene] zhaih closed issue #11674: PrimaryNode close waits for replicas to close, but there is no guarantee they ever will [LUCENE-10638]

2022-10-18 Thread GitBox
zhaih closed issue #11674: PrimaryNode close waits for replicas to close, but there is no guarantee they ever will [LUCENE-10638] URL: https://github.com/apache/lucene/issues/11674

[GitHub] [lucene] zhaih commented on a diff in pull request #11840: GITHUB-11838 Add api to allow concurrent query rewrite

2022-10-18 Thread GitBox
zhaih commented on code in PR #11840: URL: https://github.com/apache/lucene/pull/11840#discussion_r998828034 ## lucene/highlighter/src/test/org/apache/lucene/search/vectorhighlight/TestFieldQuery.java: ## @@ -40,12 +41,23 @@ public class TestFieldQuery extends AbstractTestCas

[GitHub] [lucene] zhaih commented on pull request #11822: PrimaryNode: add configurable timeout to waitForAllRemotesToClose

2022-10-18 Thread GitBox
zhaih commented on PR #11822: URL: https://github.com/apache/lucene/pull/11822#issuecomment-1283191722 I merged it, but it seems there are test failures: ``` org.apache.lucene.index.TestIndexFileDeleter > test suite's output saved to /home/runner/work/lucene/lucene/lucene/core/build/test-resu