[GitHub] [lucene] shubhamvishu commented on pull request #11832: Added static factory method for loading VectorValues

2022-10-19 Thread GitBox
shubhamvishu commented on PR #11832: URL: https://github.com/apache/lucene/pull/11832#issuecomment-1283725512 No problem at all. I get it; it makes sense not to do this right now. Thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on t

[GitHub] [lucene] iverase commented on pull request #11861: Fix Lucene94HnswVectorsFormat validation on large segments

2022-10-19 Thread GitBox
iverase commented on PR #11861: URL: https://github.com/apache/lucene/pull/11861#issuecomment-1283787674 > Do the monster tests get run regularly (perhaps during nightly builds)? I thought they were running weekly or monthly but checked Apache and Policeman CI and they don't seem to b

[GitHub] [lucene] rmuir commented on pull request #11847: Add a method allowing canonical strings to be returned from DataInput

2022-10-19 Thread GitBox
rmuir commented on PR #11847: URL: https://github.com/apache/lucene/pull/11847#issuecomment-1283793590 Sorry dsmiley, clearly you want this change, but I don't have to justify my hate for memory leaks.

[GitHub] [lucene] rmuir commented on issue #11853: Make CJKAnalyzer that use Trigram instead of Bigram

2022-10-19 Thread GitBox
rmuir commented on issue #11853: URL: https://github.com/apache/lucene/issues/11853#issuecomment-1283800342 If you want to do things like trigrams, just use n-gram tokenizer instead...
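
As a rough illustration of what the suggestion above amounts to, the sketch below produces overlapping character trigrams in plain Java. It does not use Lucene: the class `TrigramSketch` and the method `trigrams` are hypothetical names chosen for this example, and the assumption is that an n-gram tokenizer configured with a minimum and maximum gram size of 3 would emit comparable tokens.

```java
import java.util.ArrayList;
import java.util.List;

public class TrigramSketch {
  /**
   * Returns every overlapping run of three code points in the input.
   * Hypothetical helper for illustration only; not a Lucene API.
   */
  public static List<String> trigrams(String text) {
    // Work on code points so supplementary characters are not split.
    int[] codePoints = text.codePoints().toArray();
    List<String> grams = new ArrayList<>();
    for (int i = 0; i + 3 <= codePoints.length; i++) {
      grams.add(new String(codePoints, i, 3));
    }
    return grams;
  }

  public static void main(String[] args) {
    // Five characters yield three overlapping trigrams.
    System.out.println(trigrams("日本語の文"));
  }
}
```

Inputs shorter than three characters produce no tokens in this sketch; a bigram-based analyzer would still emit something for two-character input, which may matter when comparing the two approaches.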

[GitHub] [lucene] rmuir commented on a diff in pull request #11856: Fix nanos to millis conversion for tests

2022-10-19 Thread GitBox
rmuir commented on code in PR #11856: URL: https://github.com/apache/lucene/pull/11856#discussion_r999273020 ## lucene/core/src/java/org/apache/lucene/index/CheckIndex.java: ## @@ -4190,7 +4190,7 @@ private static Status.SoftDeletsStatus checkSoftDeletes( } private stati
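
The diff above is truncated, so the actual change is not visible here. As a generic illustration of the conversion named in the PR title, and not the code from the PR, the sketch below converts a `System.nanoTime()` delta to milliseconds with `TimeUnit`. The class and method names are hypothetical.

```java
import java.util.concurrent.TimeUnit;

public class NanosToMillisSketch {
  /**
   * Converts the difference between two System.nanoTime() readings to
   * whole milliseconds. The result is truncated, not rounded.
   */
  public static long elapsedMillis(long startNanos, long endNanos) {
    return TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos);
  }

  public static void main(String[] args) throws InterruptedException {
    long start = System.nanoTime();
    Thread.sleep(5);
    long end = System.nanoTime();
    System.out.println("elapsed ms: " + elapsedMillis(start, end));
  }
}
```

Using `TimeUnit` rather than a hand-written divisor avoids the usual off-by-a-factor-of-1000 mistake when the wrong constant is chosen.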

[GitHub] [lucene] rmuir commented on a diff in pull request #11861: Fix Lucene94HnswVectorsFormat validation on large segments

2022-10-19 Thread GitBox
rmuir commented on code in PR #11861: URL: https://github.com/apache/lucene/pull/11861#discussion_r999278417 ## lucene/core/src/java/org/apache/lucene/codecs/lucene94/Lucene94HnswVectorsReader.java: ## @@ -175,7 +175,7 @@ private void validateFieldEntry(FieldInfo info, FieldEntr

[GitHub] [lucene] rmuir commented on pull request #11861: Fix Lucene94HnswVectorsFormat validation on large segments

2022-10-19 Thread GitBox
rmuir commented on PR #11861: URL: https://github.com/apache/lucene/pull/11861#issuecomment-1283842393 Without a test, I can't tell that the PR fixes the issue. There might be more problems lurking in other vectors code. We shouldn't play whack-a-mole with releases. I think we should add a

[GitHub] [lucene] donnerpeter merged pull request #11859: hunspell: speed up GeneratingSuggester by not deserializing non-suggestible roots

2022-10-19 Thread GitBox
donnerpeter merged PR #11859: URL: https://github.com/apache/lucene/pull/11859

[GitHub] [lucene-solr] cuckooM closed pull request #639: Solve the problem of highlighting Chinese inaccurately.

2022-10-19 Thread GitBox
cuckooM closed pull request #639: Solve the problem of highlighting Chinese inaccurately. URL: https://github.com/apache/lucene-solr/pull/639

[GitHub] [lucene] uschindler commented on a diff in pull request #11861: Fix Lucene94HnswVectorsFormat validation on large segments

2022-10-19 Thread GitBox
uschindler commented on code in PR #11861: URL: https://github.com/apache/lucene/pull/11861#discussion_r999447512 ## lucene/core/src/java/org/apache/lucene/codecs/lucene94/Lucene94HnswVectorsReader.java: ## @@ -175,7 +175,7 @@ private void validateFieldEntry(FieldInfo info, Fiel

[GitHub] [lucene] jtibshirani commented on a diff in pull request #11861: Fix Lucene94HnswVectorsFormat validation on large segments

2022-10-19 Thread GitBox
jtibshirani commented on code in PR #11861: URL: https://github.com/apache/lucene/pull/11861#discussion_r999637833 ## lucene/core/src/java/org/apache/lucene/codecs/lucene94/Lucene94HnswVectorsReader.java: ## @@ -175,7 +175,7 @@ private void validateFieldEntry(FieldInfo info, Fie

[GitHub] [lucene] jtibshirani commented on a diff in pull request #11861: Fix Lucene94HnswVectorsFormat validation on large segments

2022-10-19 Thread GitBox
jtibshirani commented on code in PR #11861: URL: https://github.com/apache/lucene/pull/11861#discussion_r999709822 ## lucene/core/src/java/org/apache/lucene/codecs/lucene94/Lucene94HnswVectorsReader.java: ## @@ -175,7 +175,7 @@ private void validateFieldEntry(FieldInfo info, Fie

[GitHub] [lucene] zhaih commented on pull request #11840: GITHUB-11838 Add api to allow concurrent query rewrite

2022-10-19 Thread GitBox
zhaih commented on PR #11840: URL: https://github.com/apache/lucene/pull/11840#issuecomment-1284300487 > About the backport to 9.x I will help soon, at moment I am not well due to COVID after a conference last week. Thanks Uwe! No hurries and take care!

[GitHub] [lucene] zhaih merged pull request #11840: GITHUB-11838 Add api to allow concurrent query rewrite

2022-10-19 Thread GitBox
zhaih merged PR #11840: URL: https://github.com/apache/lucene/pull/11840

[GitHub] [lucene] zhaih commented on issue #11838: Adding concurrency to query rewrite?

2022-10-19 Thread GitBox
zhaih commented on issue #11838: URL: https://github.com/apache/lucene/issues/11838#issuecomment-1284303658 Change to main branch merged, keep this open until backport finishes

[GitHub] [lucene] rmuir commented on a diff in pull request #11861: Fix Lucene94HnswVectorsFormat validation on large segments

2022-10-19 Thread GitBox
rmuir commented on code in PR #11861: URL: https://github.com/apache/lucene/pull/11861#discussion_r999716253 ## lucene/core/src/java/org/apache/lucene/codecs/lucene94/Lucene94HnswVectorsReader.java: ## @@ -175,7 +175,7 @@ private void validateFieldEntry(FieldInfo info, FieldEntr

[GitHub] [lucene-solr] risdenk commented on pull request #1676: SOLR-13973: Depricate Tika support in 8.7

2022-10-19 Thread GitBox
risdenk commented on PR #1676: URL: https://github.com/apache/lucene-solr/pull/1676#issuecomment-1284317351 https://issues.apache.org/jira/browse/SOLR-13973 decided not to move forward with this. Also would need to apply to apache/solr repo instead.

[GitHub] [lucene-solr] risdenk closed pull request #1676: SOLR-13973: Depricate Tika support in 8.7

2022-10-19 Thread GitBox
risdenk closed pull request #1676: SOLR-13973: Depricate Tika support in 8.7 URL: https://github.com/apache/lucene-solr/pull/1676

[GitHub] [lucene-solr] risdenk commented on pull request #1383: SOLR-14367: Updated Tika version to 1.24

2022-10-19 Thread GitBox
risdenk commented on PR #1383: URL: https://github.com/apache/lucene-solr/pull/1383#issuecomment-1284319127 This has been upgraded elsewhere along the way.

[GitHub] [lucene-solr] risdenk closed pull request #1383: SOLR-14367: Updated Tika version to 1.24

2022-10-19 Thread GitBox
risdenk closed pull request #1383: SOLR-14367: Updated Tika version to 1.24 URL: https://github.com/apache/lucene-solr/pull/1383

[GitHub] [lucene-solr] risdenk closed pull request #1622: SOLR-14603: update Restlet version

2022-10-19 Thread GitBox
risdenk closed pull request #1622: SOLR-14603: update Restlet version URL: https://github.com/apache/lucene-solr/pull/1622

[GitHub] [lucene-solr] risdenk commented on pull request #1622: SOLR-14603: update Restlet version

2022-10-19 Thread GitBox
risdenk commented on PR #1622: URL: https://github.com/apache/lucene-solr/pull/1622#issuecomment-1284321402 Closing since says this was merged - also restlet was removed down the line anyway - https://issues.apache.org/jira/browse/SOLR-14659

[GitHub] [lucene-solr] risdenk commented on pull request #1858: LUCENE-6744: equals methods should compare classes directly, not use instanceof

2022-10-19 Thread GitBox
risdenk commented on PR #1858: URL: https://github.com/apache/lucene-solr/pull/1858#issuecomment-1284330471 So interestingly errorprone for Solr ends up going the other direction - https://errorprone.info/bugpattern/EqualsGetClass

[GitHub] [lucene-solr] risdenk closed pull request #1858: LUCENE-6744: equals methods should compare classes directly, not use instanceof

2022-10-19 Thread GitBox
risdenk closed pull request #1858: LUCENE-6744: equals methods should compare classes directly, not use instanceof URL: https://github.com/apache/lucene-solr/pull/1858

[GitHub] [lucene] risdenk commented on issue #7802: equals methods should compare classes directly, not use instanceof [LUCENE-6744]

2022-10-19 Thread GitBox
risdenk commented on issue #7802: URL: https://github.com/apache/lucene/issues/7802#issuecomment-1284331456 So at least for Solr - errorprone suggests going the other direction - https://errorprone.info/bugpattern/EqualsGetClass
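
For context, a minimal sketch of the two `equals` styles being contrasted: LUCENE-6744 proposes comparing classes directly, while the linked Error Prone check (EqualsGetClass) prefers `instanceof`. All class names below are hypothetical and are not taken from Lucene or Solr.

```java
import java.util.Objects;

public class EqualsStyleSketch {
  /** Compares classes directly, the style LUCENE-6744 proposes. */
  public static class ByGetClass {
    final String value;

    public ByGetClass(String value) {
      this.value = value;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) return true;
      // A subclass instance is never equal to a base-class instance.
      if (o == null || getClass() != o.getClass()) return false;
      return Objects.equals(value, ((ByGetClass) o).value);
    }

    @Override
    public int hashCode() {
      return Objects.hashCode(value);
    }
  }

  /** Uses instanceof, the style the linked Error Prone check prefers. */
  public static class ByInstanceof {
    final String value;

    public ByInstanceof(String value) {
      this.value = value;
    }

    // Declared final so subclasses cannot break symmetry by overriding it.
    @Override
    public final boolean equals(Object o) {
      if (this == o) return true;
      if (!(o instanceof ByInstanceof)) return false;
      return Objects.equals(value, ((ByInstanceof) o).value);
    }

    @Override
    public final int hashCode() {
      return Objects.hashCode(value);
    }
  }

  public static void main(String[] args) {
    // Anonymous subclasses that add no state of their own.
    System.out.println(new ByGetClass("a").equals(new ByGetClass("a") {}));
    System.out.println(new ByInstanceof("a").equals(new ByInstanceof("a") {}));
  }
}
```

The practical difference shows up with subclasses that add no state: the `getClass()` version treats them as unequal to the base class, while the `instanceof` version treats them as equal.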

[GitHub] [lucene] jtibshirani commented on a diff in pull request #11861: Fix Lucene94HnswVectorsFormat validation on large segments

2022-10-19 Thread GitBox
jtibshirani commented on code in PR #11861: URL: https://github.com/apache/lucene/pull/11861#discussion_r999743022 ## lucene/core/src/java/org/apache/lucene/codecs/lucene94/Lucene94HnswVectorsReader.java: ## @@ -175,7 +175,7 @@ private void validateFieldEntry(FieldInfo info, Fie

[GitHub] [lucene] rmuir commented on a diff in pull request #11861: Fix Lucene94HnswVectorsFormat validation on large segments

2022-10-19 Thread GitBox
rmuir commented on code in PR #11861: URL: https://github.com/apache/lucene/pull/11861#discussion_r999752419 ## lucene/core/src/java/org/apache/lucene/codecs/lucene94/Lucene94HnswVectorsReader.java: ## @@ -175,7 +175,7 @@ private void validateFieldEntry(FieldInfo info, FieldEntr

[GitHub] [lucene] jtibshirani commented on pull request #11861: Fix Lucene94HnswVectorsFormat validation on large segments

2022-10-19 Thread GitBox
jtibshirani commented on PR #11861: URL: https://github.com/apache/lucene/pull/11861#issuecomment-1284348388 I'm working on a monster test `TestManyKnnVectors` that indexes a bunch of vectors and force merges. I didn't mention this explicitly, but I also did extensive testing using the Elas

[GitHub] [lucene] rmuir commented on pull request #11861: Fix Lucene94HnswVectorsFormat validation on large segments

2022-10-19 Thread GitBox
rmuir commented on PR #11861: URL: https://github.com/apache/lucene/pull/11861#issuecomment-1284358987 Thank you. There are still many benefits to testing (e.g. -ea flag) vs a benchmark which is generally less picky. Also if there are different supported modes (e.g. 4-byte vs 1-byte) for ve

[GitHub] [lucene] zhaih opened a new issue, #11862: Concurrent rewrite for KnnVectorQuery

2022-10-19 Thread GitBox
zhaih opened a new issue, #11862: URL: https://github.com/apache/lucene/issues/11862 ### Description #11840 allows query rewrite to be parallelized. Should we try to have the KNN query implementation make use of that?

[GitHub] [lucene] jtibshirani opened a new issue, #11863: Add large-scale test for kNN vectors

2022-10-19 Thread GitBox
jtibshirani opened a new issue, #11863: URL: https://github.com/apache/lucene/issues/11863 We recently had a regression where the kNN vectors format validation could fail on large segments. We didn't catch this in testing or nightly performance benchmarks because they didn't produce large e

[GitHub] [lucene] jtibshirani commented on pull request #11861: Fix Lucene94HnswVectorsFormat validation on large segments

2022-10-19 Thread GitBox
jtibshirani commented on PR #11861: URL: https://github.com/apache/lucene/pull/11861#issuecomment-1284401741 Thanks for the reviews! I started to write a monster test, but it will take some time since the iteration cycle is long (each run can take 2+ hours). I'd like to merge this now and g

[GitHub] [lucene] matriv commented on a diff in pull request #11856: Fix nanos to millis conversion for tests

2022-10-19 Thread GitBox
matriv commented on code in PR #11856: URL: https://github.com/apache/lucene/pull/11856#discussion_r999887367 ## lucene/core/src/java/org/apache/lucene/index/CheckIndex.java: ## @@ -4190,7 +4190,7 @@ private static Status.SoftDeletsStatus checkSoftDeletes( } private stat

[GitHub] [lucene] jtibshirani closed issue #11858: Lucene94HnswVectorsFormat validation fails with large datasets

2022-10-19 Thread GitBox
jtibshirani closed issue #11858: Lucene94HnswVectorsFormat validation fails with large datasets URL: https://github.com/apache/lucene/issues/11858

[GitHub] [lucene] jtibshirani merged pull request #11861: Fix Lucene94HnswVectorsFormat validation on large segments

2022-10-19 Thread GitBox
jtibshirani merged PR #11861: URL: https://github.com/apache/lucene/pull/11861

[GitHub] [lucene] stevenschlansker commented on pull request #11822: PrimaryNode: add configurable timeout to waitForAllRemotesToClose

2022-10-19 Thread GitBox
stevenschlansker commented on PR #11822: URL: https://github.com/apache/lucene/pull/11822#issuecomment-1284708755 OK - I did run `./gradlew check` so I don't think I broke anything, but please let me know if it does end up being related!

[GitHub] [lucene] zhaih commented on pull request #11822: PrimaryNode: add configurable timeout to waitForAllRemotesToClose

2022-10-19 Thread GitBox
zhaih commented on PR #11822: URL: https://github.com/apache/lucene/pull/11822#issuecomment-1284755158 I tried to reproduce the issue but couldn't. So likely a transient or extremely rare test failure, and should not be related to the PR On Wed, Oct 19, 2022, 16:44 Steven Schlansker