Re: [I] Potential resource leakage in WordDictionary#loadMainDataFromFile [lucene]

2025-05-27 Thread via GitHub
xcx1r3 commented on issue #14719: URL: https://github.com/apache/lucene/issues/14719#issuecomment-2911474550 If an exception occurs, the close() statement will not be executed, leading to a potential resource leak. ``` private int loadMainDataFromFile(String dctFilePath) throws IOExcept

Re: [PR] Use read advice consistently in the knn vector formats [lucene]

2025-05-27 Thread via GitHub
jimczi closed pull request #14076: Use read advice consistently in the knn vector formats URL: https://github.com/apache/lucene/pull/14076 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

Re: [PR] Update ruff rule PATH103 to enforce modern os.makedirs usage [lucene]

2025-05-27 Thread via GitHub
rmuir merged PR #14710: URL: https://github.com/apache/lucene/pull/14710

Re: [PR] Cache high-order bits of hashcode to speed up BytesRefHash [lucene]

2025-05-27 Thread via GitHub
github-actions[bot] commented on PR #14720: URL: https://github.com/apache/lucene/pull/14720#issuecomment-2912485390 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you wil

[PR] Cache high-order bits of hashcode to speed up BytesRefHash [lucene]

2025-05-27 Thread via GitHub
bugmakerr opened a new pull request, #14720: URL: https://github.com/apache/lucene/pull/14720 ### Description This PR tries to utilize the unused part of the id to cache the high-order bits of the hashcode to speed up `BytesRefHash`. I used 1 million 16-byte UUIDs to [ben
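To make the idea in the description concrete, here is a minimal sketch of caching hash bits in the spare high bits of a slot. It is an illustration under stated assumptions, not the PR's implementation: the class `HashTagSketch`, the 20-bit id width, the `mix` function, and the use of `String` keys are all hypothetical and do not come from `BytesRefHash`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical open-addressing table; each int slot packs [hash high bits | id]. */
public class HashTagSketch {
  private static final int ID_BITS = 20;                 // assumption: ids fit in 20 bits
  private static final int ID_MASK = (1 << ID_BITS) - 1; // low bits hold the id
  private static final int TAG_MASK = ~ID_MASK;          // high bits hold cached hash bits
  private static final int EMPTY = -1;

  private final int[] slots;
  private final List<String> keys = new ArrayList<>();

  HashTagSketch(int capacity) {
    if (capacity < 2 || Integer.bitCount(capacity) != 1) {
      throw new IllegalArgumentException("capacity must be a power of two >= 2");
    }
    slots = new int[capacity];
    Arrays.fill(slots, EMPTY);
  }

  // Simple bit mixer so that the high bits differ even for short keys.
  private static int mix(int h) {
    h *= 0x9E3779B9;
    return h ^ (h >>> 16);
  }

  /** Returns the id of the key, adding it if absent. */
  int add(String key) {
    int hash = mix(key.hashCode());
    int mask = slots.length - 1;
    int tag = hash & TAG_MASK;
    int pos = hash & mask;
    while (slots[pos] != EMPTY) {
      int slot = slots[pos];
      // Compare the cached hash bits first; only matching tags pay for equals().
      if ((slot & TAG_MASK) == tag && keys.get(slot & ID_MASK).equals(key)) {
        return slot & ID_MASK;
      }
      pos = (pos + 1) & mask; // linear probing
    }
    int id = keys.size();
    // Keep one slot free so probing terminates, and keep ids below ID_MASK
    // so that a packed slot can never equal EMPTY.
    if (id >= slots.length - 1 || id >= ID_MASK) {
      throw new IllegalStateException("table full (this sketch does not resize)");
    }
    keys.add(key);
    slots[pos] = tag | id;
    return id;
  }

  int size() {
    return keys.size();
  }
}
```

In this sketch, a probe that lands on an occupied slot with a different tag skips the key comparison entirely, which is where a speedup could come from when key comparisons are expensive.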

Re: [PR] Add a Faiss codec for KNN searches [lucene]

2025-05-27 Thread via GitHub
kaivalnp commented on code in PR #14178: URL: https://github.com/apache/lucene/pull/14178#discussion_r2109429583 ## gradle/testing/defaults-tests.gradle: ## @@ -145,6 +145,7 @@ allprojects { ':lucene:core', ':lucene:codecs', ":lucene:

Re: [PR] Reduce NeighborArray heap memory [lucene]

2025-05-27 Thread via GitHub
benwtrent merged PR #14527: URL: https://github.com/apache/lucene/pull/14527

Re: [I] Nightly benchmark regression on 2025.05.01 [lucene]

2025-05-27 Thread via GitHub
jpountz commented on issue #14630: URL: https://github.com/apache/lucene/issues/14630#issuecomment-2913813621 It looks like nightly benchmarks only run every 2 days since May 13th, vs. every day before that. Is this because it now takes longer to run the benchmark?

Re: [I] Potential resource leakage in WordDictionary#loadMainDataFromFile [lucene]

2025-05-27 Thread via GitHub
jpountz commented on issue #14719: URL: https://github.com/apache/lucene/issues/14719#issuecomment-2913817299 Good catch, would you like to submit a PR?

Re: [PR] Only run the labeller on the main branch of the lucene repository [lucene]

2025-05-27 Thread via GitHub
github-actions[bot] commented on PR #14721: URL: https://github.com/apache/lucene/pull/14721#issuecomment-2913824556 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you wil

Re: [PR] Fix comment above OnHeapHnswGraph#getNeighbors. [lucene]

2025-05-27 Thread via GitHub
msokolov commented on PR #14713: URL: https://github.com/apache/lucene/pull/14713#issuecomment-2913825952 Thanks @vsop-479 !

Re: [PR] Fix comment above OnHeapHnswGraph#getNeighbors. [lucene]

2025-05-27 Thread via GitHub
msokolov merged PR #14713: URL: https://github.com/apache/lucene/pull/14713

[PR] Only run the labeller on the main branch of the lucene repository [lucene]

2025-05-27 Thread via GitHub
dweiss opened a new pull request, #14721: URL: https://github.com/apache/lucene/pull/14721 This prevents this action from running on PRs against forks, which I couldn't get to work (missing permissions for some reason).

Re: [PR] Reduce NeighborArray heap memory [lucene]

2025-05-27 Thread via GitHub
benwtrent commented on code in PR #14527: URL: https://github.com/apache/lucene/pull/14527#discussion_r2109471013 ## .gitignore: ## @@ -32,3 +32,10 @@ __pycache__ # SDKMAN .sdkmanrc + +# Java class files +*.class + +# Ignore bin directories +bin/ +**/bin/ Review Comment:

Re: [PR] Add a Faiss codec for KNN searches [lucene]

2025-05-27 Thread via GitHub
kaivalnp commented on code in PR #14178: URL: https://github.com/apache/lucene/pull/14178#discussion_r2109488907 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/faiss/FaissKnnVectorsReader.java: ## @@ -0,0 +1,195 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Fix resource leak in loadMainDataFromFile [lucene]

2025-05-27 Thread via GitHub
github-actions[bot] commented on PR #14726: URL: https://github.com/apache/lucene/pull/14726#issuecomment-2914833524 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you wil

[PR] Fix resource leak in loadMainDataFromFile [lucene]

2025-05-27 Thread via GitHub
xcx1r3 opened a new pull request, #14726: URL: https://github.com/apache/lucene/pull/14726 Use try-with-resources to auto-close DataInputStream ``` try (DataInputStream dctFile = new DataInputStream(Files.newInputStream(Paths.get(dctFilePath)))) { ... } ```
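As a minimal, self-contained sketch of the try-with-resources pattern named in this PR: the class `DctReaderSketch`, the helper `readFirstInt`, and the assumption that the file starts with a 4-byte big-endian int are hypothetical and not taken from Lucene's `WordDictionary`.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DctReaderSketch {
  // Hypothetical helper: reads one big-endian int from the start of a file.
  // The stream is closed automatically, even if readInt() throws.
  static int readFirstInt(Path path) throws IOException {
    try (DataInputStream in = new DataInputStream(Files.newInputStream(path))) {
      return in.readInt();
    }
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempFile("dct-sketch", ".bin");
    try {
      Files.write(tmp, new byte[] {0, 0, 0, 42});
      System.out.println(readFirstInt(tmp));
    } finally {
      Files.deleteIfExists(tmp);
    }
  }
}
```

Compared with an explicit `close()` call at the end of the method, the resource declared in the `try (...)` header is closed on every exit path, which is the leak the linked issue describes.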

Re: [I] Potential resource leakage in WordDictionary#loadMainDataFromFile [lucene]

2025-05-27 Thread via GitHub
xcx1r3 commented on issue #14719: URL: https://github.com/apache/lucene/issues/14719#issuecomment-2914834339 #14726 sure

Re: [PR] Fix resource leak in loadMainDataFromFile [lucene]

2025-05-27 Thread via GitHub
xcx1r3 closed pull request #14726: Fix resource leak in loadMainDataFromFile URL: https://github.com/apache/lucene/pull/14726

Re: [PR] Add a Faiss codec for KNN searches [lucene]

2025-05-27 Thread via GitHub
kaivalnp commented on code in PR #14178: URL: https://github.com/apache/lucene/pull/14178#discussion_r2109735507 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/faiss/FaissKnnVectorsReader.java: ## @@ -0,0 +1,195 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Add a Faiss codec for KNN searches [lucene]

2025-05-27 Thread via GitHub
kaivalnp commented on code in PR #14178: URL: https://github.com/apache/lucene/pull/14178#discussion_r2109760361 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/faiss/FaissKnnVectorsWriter.java: ## @@ -0,0 +1,240 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Add a Faiss codec for KNN searches [lucene]

2025-05-27 Thread via GitHub
kaivalnp commented on code in PR #14178: URL: https://github.com/apache/lucene/pull/14178#discussion_r2109774033 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/faiss/FaissKnnVectorsWriter.java: ## @@ -0,0 +1,240 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Add a Faiss codec for KNN searches [lucene]

2025-05-27 Thread via GitHub
kaivalnp commented on code in PR #14178: URL: https://github.com/apache/lucene/pull/14178#discussion_r2109779193 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/faiss/FaissKnnVectorsWriter.java: ## @@ -0,0 +1,240 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Cache high-order bits of hashcode to speed up BytesRefHash [lucene]

2025-05-27 Thread via GitHub
jpountz commented on code in PR #14720: URL: https://github.com/apache/lucene/pull/14720#discussion_r2110084706 ## lucene/core/src/java/org/apache/lucene/util/BytesRefHash.java: ## @@ -71,9 +72,13 @@ public BytesRefHash(ByteBlockPool pool) { /** Creates a new {@link BytesRe

Re: [PR] Move HitQueue in TopScoreDocCollector to a LongHeap [lucene]

2025-05-27 Thread via GitHub
jpountz commented on PR #14714: URL: https://github.com/apache/lucene/pull/14714#issuecomment-2913896479 I wasn't aware of this indeed. OK for passing null then, I agree that there may be sub classes that rely on this API in the wild.

Re: [PR] Arg001 - no violations found [lucene]

2025-05-27 Thread via GitHub
github-actions[bot] commented on PR #14724: URL: https://github.com/apache/lucene/pull/14724#issuecomment-2914476284 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you wil

Re: [PR] Arg001 - no violations found [lucene]

2025-05-27 Thread via GitHub
Mariah33 commented on PR #14724: URL: https://github.com/apache/lucene/pull/14724#issuecomment-2914477522 on wrong branch

Re: [PR] Clarify filter fields usage in javadocs [lucene]

2025-05-27 Thread via GitHub
github-actions[bot] commented on PR #14660: URL: https://github.com/apache/lucene/pull/14660#issuecomment-2914507503 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] No ruff violation [lucene]

2025-05-27 Thread via GitHub
github-actions[bot] commented on PR #14725: URL: https://github.com/apache/lucene/pull/14725#issuecomment-2914529256 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you wil

[PR] No ruff violation [lucene]

2025-05-27 Thread via GitHub
Mariah33 opened a new pull request, #14725: URL: https://github.com/apache/lucene/pull/14725 ### Description Didn't find these ruff rules in the code

[PR] Arg001 - no violations found [lucene]

2025-05-27 Thread via GitHub
Mariah33 opened a new pull request, #14724: URL: https://github.com/apache/lucene/pull/14724 ### Description This rule was not found in the code.

Re: [PR] Arg001 - no violations found [lucene]

2025-05-27 Thread via GitHub
Mariah33 closed pull request #14724: Arg001 - no violations found URL: https://github.com/apache/lucene/pull/14724

Re: [PR] Apply minimal fix for ruff rule PATH103 using Path.resolve [lucene]

2025-05-27 Thread via GitHub
rmuir merged PR #14711: URL: https://github.com/apache/lucene/pull/14711

[PR] deps(java): bump org.apache.groovy:groovy-all from 4.0.26 to 4.0.27 [lucene]

2025-05-27 Thread via GitHub
dependabot[bot] opened a new pull request, #14722: URL: https://github.com/apache/lucene/pull/14722 Bumps [org.apache.groovy:groovy-all](https://github.com/apache/groovy) from 4.0.26 to 4.0.27. Commits: see the full diff in the [compare view](https://github.com/apache/groovy/commits)

[PR] deps(java): bump com.diffplug.spotless from 7.0.3 to 7.0.4 [lucene]

2025-05-27 Thread via GitHub
dependabot[bot] opened a new pull request, #14723: URL: https://github.com/apache/lucene/pull/14723 Bumps com.diffplug.spotless from 7.0.3 to 7.0.4.

Re: [PR] deps(java): bump org.apache.groovy:groovy-all from 4.0.26 to 4.0.27 [lucene]

2025-05-27 Thread via GitHub
github-actions[bot] commented on PR #14722: URL: https://github.com/apache/lucene/pull/14722#issuecomment-2914414329 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you wil

Re: [PR] deps(java): bump com.diffplug.spotless from 7.0.3 to 7.0.4 [lucene]

2025-05-27 Thread via GitHub
github-actions[bot] commented on PR #14723: URL: https://github.com/apache/lucene/pull/14723#issuecomment-2914414463 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you wil

Re: [PR] Apply minimal fix for ruff rule PATH103 using Path.resolve [lucene]

2025-05-27 Thread via GitHub
github-actions[bot] commented on PR #14711: URL: https://github.com/apache/lucene/pull/14711#issuecomment-2914446223 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you wil

Re: [PR] Add a Faiss codec for KNN searches [lucene]

2025-05-27 Thread via GitHub
kaivalnp commented on code in PR #14178: URL: https://github.com/apache/lucene/pull/14178#discussion_r2109479695 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/faiss/FaissKnnVectorsFormat.java: ## @@ -0,0 +1,93 @@ +/* + * Licensed to the Apache Software Foundation (

Re: [PR] Move HitQueue in TopScoreDocCollector to a LongHeap [lucene]

2025-05-27 Thread via GitHub
gf2121 commented on PR #14714: URL: https://github.com/apache/lucene/pull/14714#issuecomment-2913036055 Thanks for the suggestion! > It's a bit ugly to pass null as a HitQueue in the constructor of TopScoreDocCollector. Can we only keep method signatures on TopDocsCollector and move

Re: [PR] Adding profiling support for concurrent segment search [lucene]

2025-05-27 Thread via GitHub
jainankitk commented on PR #14413: URL: https://github.com/apache/lucene/pull/14413#issuecomment-2913552519 I submitted a talk on this topic (`Profiling Concurrent Search in Lucene: A Deep Dive into Parallel Execution`) for the ASF conference (https://communityovercode.org/schedule/) and it was s

Re: [PR] Reduce NeighborArray heap memory [lucene]

2025-05-27 Thread via GitHub
weizijun commented on code in PR #14527: URL: https://github.com/apache/lucene/pull/14527#discussion_r2109476565 ## .gitignore: ## @@ -32,3 +32,10 @@ __pycache__ # SDKMAN .sdkmanrc + +# Java class files +*.class + +# Ignore bin directories +bin/ +**/bin/ Review Comment:

Re: [PR] Add a Faiss codec for KNN searches [lucene]

2025-05-27 Thread via GitHub
kaivalnp commented on code in PR #14178: URL: https://github.com/apache/lucene/pull/14178#discussion_r2109729290 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/faiss/FaissKnnVectorsReader.java: ## @@ -0,0 +1,195 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] Add a Faiss codec for KNN searches [lucene]

2025-05-27 Thread via GitHub
kaivalnp commented on code in PR #14178: URL: https://github.com/apache/lucene/pull/14178#discussion_r2109499282 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/faiss/FaissKnnVectorsFormat.java: ## @@ -0,0 +1,93 @@ +/* + * Licensed to the Apache Software Foundation (

Re: [I] Create a bot to add milestones to new PRs [lucene]

2025-05-27 Thread via GitHub
stefanvodita commented on issue #14190: URL: https://github.com/apache/lucene/issues/14190#issuecomment-2913105749 #14697 is a nice example of the bot modifying the milestone after we moved the CHANGES entry to a different section!

Re: [PR] Reduce NeighborArray heap memory [lucene]

2025-05-27 Thread via GitHub
weizijun commented on PR #14527: URL: https://github.com/apache/lucene/pull/14527#issuecomment-2914720940 Here are the statistics of HNSW graphs with 1 million (100w) nodes, with m = 16 and ef = 100: Level count = 5: ``` level: 0, node count: 100 level: 1, node count: 62835 level: 2, node count: