Re: [PR] Remove all security manager and java security references [lucene]

2025-06-17 Thread via GitHub
dweiss merged PR #14801: URL: https://github.com/apache/lucene/pull/14801 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apac

Re: [PR] Convert more PriorityQueues to use Comparator [lucene]

2025-06-17 Thread via GitHub
dweiss commented on PR #14761: URL: https://github.com/apache/lucene/pull/14761#issuecomment-2982845136 I've merged this into main. Perhaps we should add a marker to benchmarks, @mikemccand?

Re: [PR] Convert more PriorityQueues to use Comparator [lucene]

2025-06-17 Thread via GitHub
dweiss merged PR #14761: URL: https://github.com/apache/lucene/pull/14761

Re: [PR] Add a linter flag to suppress warning about incubating vector module. [lucene]

2025-06-17 Thread via GitHub
dweiss merged PR #14802: URL: https://github.com/apache/lucene/pull/14802

Re: [PR] deps(java): bump org.owasp.dependencycheck from 12.1.2 to 12.1.3 [lucene]

2025-06-17 Thread via GitHub
dweiss merged PR #14805: URL: https://github.com/apache/lucene/pull/14805

Re: [I] Compression cache of numeric docvalues [lucene]

2025-06-17 Thread via GitHub
gf2121 commented on issue #14803: URL: https://github.com/apache/lucene/issues/14803#issuecomment-2982794733 Thanks for the feedback! I agree that a transparent compression filesystem is pretty straightforward and helpful. But I suspect it is hard for users to know when Lucene can take c

Re: [PR] Make `pack` methods public for `BigIntegerPoint` and `HalfFloatPoint` [lucene]

2025-06-17 Thread via GitHub
prudhvigodithi commented on PR #14784: URL: https://github.com/apache/lucene/pull/14784#issuecomment-2982201768 Just pushed a commit to fix the conflicts. @jpountz a gentle follow up to see if we are ok to merge this change. Thanks

[PR] deps(java): bump org.owasp.dependencycheck from 12.1.2 to 12.1.3 [lucene]

2025-06-17 Thread via GitHub
dependabot[bot] opened a new pull request, #14805: URL: https://github.com/apache/lucene/pull/14805 Bumps org.owasp.dependencycheck from 12.1.2 to 12.1.3. [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=org.owas

Re: [I] A multi-tenant ConcurrentMergeScheduler [lucene]

2025-06-17 Thread via GitHub
vigyasharma commented on issue #13883: URL: https://github.com/apache/lucene/issues/13883#issuecomment-2982081708 Thanks @yaser-aj , happy to see progress on this project. You're on the right track with understanding the problem. We want CMS to be aware of merge demands across IndexWr

Re: [PR] Add a DoubleValuesSource for scoring full precision vector similarity [lucene]

2025-06-17 Thread via GitHub
vigyasharma merged PR #14708: URL: https://github.com/apache/lucene/pull/14708

Re: [PR] Add a Faiss codec for KNN searches [lucene]

2025-06-17 Thread via GitHub
kaivalnp commented on PR #14178: URL: https://github.com/apache/lucene/pull/14178#issuecomment-2981931299 Thanks @mikemccand, I've made the suggested changes + rebased + improved some documentation! > One could maybe use ulimit so the kernel will return null if the process tries to a

Re: [PR] Add a Faiss codec for KNN searches [lucene]

2025-06-17 Thread via GitHub
kaivalnp commented on code in PR #14178: URL: https://github.com/apache/lucene/pull/14178#discussion_r2153208743 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/faiss/package-info.java: ## @@ -0,0 +1,48 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-17 Thread via GitHub
rmuir commented on PR #14804: URL: https://github.com/apache/lucene/pull/14804#issuecomment-2981726602 Can we consider ast-grep for this? it is really fast and doesn't require regular expressions, has plugins for editors. I wrote a rule for this in less than a minute: ```yaml id: wild

Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-17 Thread via GitHub
github-actions[bot] commented on PR #14804: URL: https://github.com/apache/lucene/pull/14804#issuecomment-2981718232 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop

[PR] detect and ban wildcard imports in Java [lucene]

2025-06-17 Thread via GitHub
dweiss opened a new pull request, #14804: URL: https://github.com/apache/lucene/pull/14804 Fixes #14553. I'm not completely happy with this. For some reason, the custom formatting step always triggers full spotless run - incremental mode doesn't work. ``` > ./gradlew -p luce

Re: [I] Compression cache of numeric docvalues [lucene]

2025-06-17 Thread via GitHub
rmuir commented on issue #14803: URL: https://github.com/apache/lucene/issues/14803#issuecomment-2981632167 IMO: just use a filesystem with this feature such as zfs.

Re: [I] UnsupportedOperation when merging `Lucene90BlockTreeTermsWriter` [lucene]

2025-06-17 Thread via GitHub
benwtrent commented on issue #14429: URL: https://github.com/apache/lucene/issues/14429#issuecomment-2981454239 Working more on this, we have run multiple diagnostics on the machines; no hardware issues seem to arise. This issue arises not only on merge; I have also seen it on flush.

Re: [PR] Introduce getQuantizedVectorValues method in LeafReader to access QuantizedByteVectorValues [lucene]

2025-06-17 Thread via GitHub
msokolov commented on PR #14792: URL: https://github.com/apache/lucene/pull/14792#issuecomment-2981195460 For the return values use case, another choice is to disable it in the case the original vectors were not "stored" in the searchable index. Otherwise, I agree with Ben that we could sup

[I] Compression cache of numeric docvalues [lucene]

2025-06-17 Thread via GitHub
gf2121 opened a new issue, #14803: URL: https://github.com/apache/lucene/issues/14803 ### Description When benchmarking recently with some OLAP engines (no indexes, no stored fields, only column data), the results showed that they only occupy 50-70% of the storage of `NumericDocvalue

Re: [I] Make HNSW merges cheaper on heap [lucene]

2025-06-17 Thread via GitHub
ChrisHegarty commented on issue #14208: URL: https://github.com/apache/lucene/issues/14208#issuecomment-2981080469 The on-heap memory used for the per-node neighbour array during building the HNSW graph has been significantly reduced, by approximately 3-4x, see https://github.com/apache/luc

Re: [I] Expand TieredMergePolicy deletePctAllowed limits [lucene]

2025-06-17 Thread via GitHub
jpountz commented on issue #11761: URL: https://github.com/apache/lucene/issues/11761#issuecomment-2980907072 I think I'd be ok with any lower bound that is strictly greater than 0. However, I am curious if the improvement that you are seeing is actually due to reducing the number of deleti

Re: [I] Support for DocIdSetBuilder with (min,max) docId [lucene]

2025-06-17 Thread via GitHub
prudhvigodithi commented on issue #14485: URL: https://github.com/apache/lucene/issues/14485#issuecomment-2980630462 Filtering out out-of-range docs prevents giant bit-sets, so we can add to the `DocIdSetBuilder` only the docs that are within the range of `LeafReaderContextPartition` this p

Re: [PR] Add a Faiss codec for KNN searches [lucene]

2025-06-17 Thread via GitHub
mikemccand commented on code in PR #14178: URL: https://github.com/apache/lucene/pull/14178#discussion_r2152136427 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/faiss/package-info.java: ## @@ -0,0 +1,48 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

Re: [PR] Add a Faiss codec for KNN searches [lucene]

2025-06-17 Thread via GitHub
mikemccand commented on PR #14178: URL: https://github.com/apache/lucene/pull/14178#issuecomment-2980130792 > As a follow up, could you allow the [`mamba-org/setup-micromamba`](https://github.com/mamba-org/setup-micromamba) GH action to run on the Lucene repository -- so that the Faiss code

Re: [PR] Adjust base knn format assert assertOffHeapByteSize [lucene]

2025-06-17 Thread via GitHub
benwtrent merged PR #14797: URL: https://github.com/apache/lucene/pull/14797

Re: [PR] .editorconfig [lucene]

2025-06-17 Thread via GitHub
dsmiley commented on PR #14740: URL: https://github.com/apache/lucene/pull/14740#issuecomment-2979904591 OMG that's ironic! @rmuir, you added it (in March), and it only configures Python :-) LOL Okay... well I think that file should be removed and its Python section integrated in

Re: [PR] Add a linter flag to suppress warning about incubating vector module. [lucene]

2025-06-17 Thread via GitHub
github-actions[bot] commented on PR #14802: URL: https://github.com/apache/lucene/pull/14802#issuecomment-2979902324 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop

[PR] Add a linter flag to suppress warning about incubating vector module. [lucene]

2025-06-17 Thread via GitHub
dweiss opened a new pull request, #14802: URL: https://github.com/apache/lucene/pull/14802 (no comment)

[PR] Remove all security manager and java security references [lucene]

2025-06-17 Thread via GitHub
dweiss opened a new pull request, #14801: URL: https://github.com/apache/lucene/pull/14801 these are no-ops in JDK24+.

Re: [PR] Remove all security manager and java security references [lucene]

2025-06-17 Thread via GitHub
dweiss commented on code in PR #14801: URL: https://github.com/apache/lucene/pull/14801#discussion_r2151947884 ## build-tools/build-infra/src/main/groovy/lucene.validation.ecj-lint.gradle: ## @@ -74,6 +76,8 @@ def lintTasks = sourceSets.collect { SourceSet sourceSet -> depe

Re: [I] build and push release regression [lucene]

2025-06-17 Thread via GitHub
dweiss closed issue #14786: build and push release regression URL: https://github.com/apache/lucene/issues/14786

Re: [PR] Correct python release scripts for the new location of base version [lucene]

2025-06-17 Thread via GitHub
dweiss merged PR #14798: URL: https://github.com/apache/lucene/pull/14798

Re: [PR] Fix assemble source release [lucene]

2025-06-17 Thread via GitHub
dweiss merged PR #14800: URL: https://github.com/apache/lucene/pull/14800

Re: [I] Fix regression in assembleSourceTgz [lucene]

2025-06-17 Thread via GitHub
dweiss closed issue #14796: Fix regression in assembleSourceTgz URL: https://github.com/apache/lucene/issues/14796

Re: [I] org.apache.lucene.search.TestPatienceFloatVectorQuery.testFindAll failed [lucene]

2025-06-17 Thread via GitHub
tteofili commented on issue #14694: URL: https://github.com/apache/lucene/issues/14694#issuecomment-2979300604 @benwtrent yeah, exactly, I think that's what we're seeing here.

Re: [I] Revert back to jgit for collecting git status [lucene]

2025-06-17 Thread via GitHub
dweiss closed issue #14785: Revert back to jgit for collecting git status URL: https://github.com/apache/lucene/issues/14785

Re: [I] Revert back to jgit for collecting git status [lucene]

2025-06-17 Thread via GitHub
dweiss commented on issue #14785: URL: https://github.com/apache/lucene/issues/14785#issuecomment-2979190779 Thanks, Uwe. > The "working copy clean" check was faster and better implemented with jgit It's not that bad, really - the format of the git tool's status may be a bit od

Re: [PR] Fix assemble source release [lucene]

2025-06-17 Thread via GitHub
github-actions[bot] commented on PR #14800: URL: https://github.com/apache/lucene/pull/14800#issuecomment-2979172131 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop