[GitHub] [lucene] gf2121 commented on pull request #12465: Potential bug in IndexedDISI90#SPARSE->advanceExactWithinBlock

2023-07-28 Thread via GitHub
gf2121 commented on PR #12465: URL: https://github.com/apache/lucene/pull/12465#issuecomment-1655202239 > For example, what if we assigned disi.exists to false here? Sorry but I do not think we should do this change indeed : 1. If a field exists in doc 1, 3. Calling `advanceExac

[GitHub] [lucene] dweiss commented on a diff in pull request #12464: hunspell: make the hash table load factor customizable

2023-07-28 Thread via GitHub
dweiss commented on code in PR #12464: URL: https://github.com/apache/lucene/pull/12464#discussion_r1277381583 ## lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/WordStorage.java: ## @@ -390,7 +391,7 @@ private int flushGroup() throws IOException { i

[GitHub] [lucene] mayya-sharipova merged pull request #12466: Make KnnVectorsFormat#getMaxDimensions abstract

2023-07-28 Thread via GitHub
mayya-sharipova merged PR #12466: URL: https://github.com/apache/lucene/pull/12466 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lu

[GitHub] [lucene] mayya-sharipova opened a new pull request, #12467: Fix failure in BaseKnnVectorsFormatTestCase#testIllegalDimensionTooLarge

2023-07-28 Thread via GitHub
mayya-sharipova opened a new pull request, #12467: URL: https://github.com/apache/lucene/pull/12467 Depending on whether a document with dimensions > maxDims is created on a new segment or an already existing segment, we may get different error messages. This fix adds another possible error message

[GitHub] [lucene] mayya-sharipova merged pull request #12467: Fix failure in BaseKnnVectorsFormatTestCase#testIllegalDimensionTooLarge

2023-07-28 Thread via GitHub
mayya-sharipova merged PR #12467: URL: https://github.com/apache/lucene/pull/12467

[GitHub] [lucene] jpountz commented on pull request #12436: Move max vector dims limit to Codec

2023-07-28 Thread via GitHub
jpountz commented on PR #12436: URL: https://github.com/apache/lucene/pull/12436#issuecomment-1655817219 > One question I have: What happens if you open an index with a higher limit in field infos and you use default codec? I think this is unsupported, but in that case the implementor of th

[GitHub] [lucene] uschindler commented on pull request #12436: Move max vector dims limit to Codec

2023-07-28 Thread via GitHub
uschindler commented on PR #12436: URL: https://github.com/apache/lucene/pull/12436#issuecomment-1655944398 > > One question I have: What happens if you open an index with a higher limit in field infos and you use default codec? I think this is unsupported, but in that case the implementor

[GitHub] [lucene] donnerpeter merged pull request #12464: hunspell: make the hash table load factor customizable

2023-07-28 Thread via GitHub
donnerpeter merged PR #12464: URL: https://github.com/apache/lucene/pull/12464

[GitHub] [lucene] donnerpeter opened a new pull request, #12468: hunspell: check for aff file wellformedness more strictly

2023-07-28 Thread via GitHub
donnerpeter opened a new pull request, #12468: URL: https://github.com/apache/lucene/pull/12468 ### Description

[GitHub] [lucene] donnerpeter commented on a diff in pull request #12468: hunspell: check for aff file wellformedness more strictly

2023-07-28 Thread via GitHub
donnerpeter commented on code in PR #12468: URL: https://github.com/apache/lucene/pull/12468#discussion_r128738 ## lucene/analysis/common/src/test/org/apache/lucene/analysis/hunspell/TestAllDictionaries.java: ## @@ -74,7 +74,12 @@ static Dictionary loadDictionary(Path aff) t

[GitHub] [lucene] benwtrent commented on issue #12342: Prevent VectorSimilarity.DOT_PRODUCT from returning negative scores

2023-07-28 Thread via GitHub
benwtrent commented on issue #12342: URL: https://github.com/apache/lucene/issues/12342#issuecomment-1656112434 OK ran on two more datasets, I only ran over the first 100k documents in my data sets. I think this exhausts our testing for Cohere, we need to find additional data if this

[GitHub] [lucene] original-brownbear opened a new pull request, #12469: Cleanup duplication in BKDWriter

2023-07-28 Thread via GitHub
original-brownbear opened a new pull request, #12469: URL: https://github.com/apache/lucene/pull/12469 The logic for creating the writer runnable could be deduplicated. Also, a couple of anonymous classes could be turned into lambdas (avoiding a redundant `ByteBuf` instantiation in the pro

[GitHub] [lucene] gsmiller commented on pull request #12454: Clean up ordinal map in default SSDV reader state

2023-07-28 Thread via GitHub
gsmiller commented on PR #12454: URL: https://github.com/apache/lucene/pull/12454#issuecomment-1656481929 > Overall, I like the previous code a bit better, it's cleaner and more concise, although cachedOrdMaps looks confusing at first. Maybe it just needs a comment explaining why using a ma

[GitHub] [lucene] gsmiller commented on pull request #12428: Replace consecutive close() calls and close() calls with null checks with IOUtils.close()

2023-07-28 Thread via GitHub
gsmiller commented on PR #12428: URL: https://github.com/apache/lucene/pull/12428#issuecomment-1656506806 LGTM as well. Thanks @shubhamvishu!