Re: [PR] Remove duplicate -Xlint:options flags. [lucene]

2025-06-16 Thread via GitHub


uschindler commented on PR #14788:
URL: https://github.com/apache/lucene/pull/14788#issuecomment-2975442200

   Can you backport this change?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Remove duplicate -Xlint:options flags. [lucene]

2025-06-16 Thread via GitHub


dweiss commented on PR #14788:
URL: https://github.com/apache/lucene/pull/14788#issuecomment-2975466304

   I will apply and backport, no worries, Uwe. Enjoy a beer in Berlin!





[PR] Introduce getQuantizedVectorValues method in LeafReader to access QuantizedByteVectorValues [lucene]

2025-06-16 Thread via GitHub


Pulkitg64 opened a new pull request, #14792:
URL: https://github.com/apache/lucene/pull/14792

   ### Description
   
   Introduce `getQuantizedVectorValues` method in `LeafReader` to access 
`QuantizedVectorValues`.
   
   In a search architecture where searchers and the writer run on separate 
machines, it is wasteful to keep raw float vectors on a machine when vector 
quantization is enabled. This PR adds `getQuantizedVectorValues` to `LeafReader`, 
which allows reading QuantizedByteVectors directly without reading the 
raw float vectors.
   
   Partially solving #13158 





Re: [PR] Introduce getQuantizedVectorValues method in LeafReader to access QuantizedByteVectorValues [lucene]

2025-06-16 Thread via GitHub


github-actions[bot] commented on PR #14792:
URL: https://github.com/apache/lucene/pull/14792#issuecomment-2976186238

   This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. 
If the PR doesn't need a changelog entry, then add the skip-changelog label to 
it and you will stop receiving this reminder on future updates to the PR.





Re: [I] compileMain24Java fails with JDK25+ [lucene]

2025-06-16 Thread via GitHub


uschindler commented on issue #14782:
URL: https://github.com/apache/lucene/issues/14782#issuecomment-2975525126

   I updated the JDK bug report.





Re: [PR] Override ValueSource.FromDoubleValuesSource.getSortField [lucene]

2025-06-16 Thread via GitHub


dsmiley commented on PR #14654:
URL: https://github.com/apache/lucene/pull/14654#issuecomment-2976307246

   Done.





Re: [PR] Use IntArrayList/IntHashSet to replace usages of List/Set of Integer [lucene]

2025-06-16 Thread via GitHub


easyice merged PR #14774:
URL: https://github.com/apache/lucene/pull/14774





Re: [PR] [Build] Fix more gradle deprecation warnings scheduled to be removed in 9.0 [lucene]

2025-06-16 Thread via GitHub


dweiss merged PR #14783:
URL: https://github.com/apache/lucene/pull/14783





[I] Add an option to turn off gradle/groovy spotless checks/validations [lucene]

2025-06-16 Thread via GitHub


dweiss opened a new issue, #14789:
URL: https://github.com/apache/lucene/issues/14789

   ### Description
   
   These download a huge amount of data and are slow to apply. It makes sense to 
have the ability to turn them off locally. I would like to keep them on in CI.





Re: [PR] Override ValueSource.FromDoubleValuesSource.getSortField [lucene]

2025-06-16 Thread via GitHub


ChrisHegarty commented on PR #14654:
URL: https://github.com/apache/lucene/pull/14654#issuecomment-2976032407

   @dsmiley Given that this change has been backported to 9.12.2, which will 
release in tandem with 10.2.2, do you want to port this change to `branch_10_2` 
also? 





Re: [I] Fix jacoco/ coverage plugin not found [lucene]

2025-06-16 Thread via GitHub


dweiss closed issue #14790: Fix jacoco/ coverage plugin not found
URL: https://github.com/apache/lucene/issues/14790





Re: [PR] .editorconfig [lucene]

2025-06-16 Thread via GitHub


dsmiley commented on PR #14740:
URL: https://github.com/apache/lucene/pull/14740#issuecomment-2976114609

   I plan to merge this tonight (~10pm EST) if I don't hear advice to the 
contrary.





Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]

2025-06-16 Thread via GitHub


dweiss commented on PR #14764:
URL: https://github.com/apache/lucene/pull/14764#issuecomment-2975342904

   > Please keep this for a longer time, as the Jenkins build randomization can 
only work on environment variables. There may be tweaks to pass this on the command 
line in Jenkins's gradle options, but I don't want to touch that now.
   
   No worries at all. These build options all work with env variables already 
but their naming is identical to the build option name (so lowercased, with 
dots, etc.). Maybe this could be improved/ changed in the plugin itself, I'll 
think about it.
   
   No problem with keeping whatever aliases we like though.
   





Re: [I] compileMain24Java fails with JDK25+ [lucene]

2025-06-16 Thread via GitHub


dweiss closed issue #14782: compileMain24Java fails with JDK25+
URL: https://github.com/apache/lucene/issues/14782





Re: [PR] Remove duplicate -Xlint:options flags. [lucene]

2025-06-16 Thread via GitHub


dweiss merged PR #14788:
URL: https://github.com/apache/lucene/pull/14788





Re: [PR] Remove duplicate -Xlint:options flags. [lucene]

2025-06-16 Thread via GitHub


uschindler commented on PR #14788:
URL: https://github.com/apache/lucene/pull/14788#issuecomment-2975564900

   If I remember correctly, the order of configuration is still predictable from 
the order in which they are included in the main build script. I always felt bad 
about that, but it worked. I just want to confirm it also applies to the changes in main.





Re: [PR] Build refactoring and cleanups (moving from build scripts to convention plugins) [lucene]

2025-06-16 Thread via GitHub


dweiss commented on PR #14764:
URL: https://github.com/apache/lucene/pull/14764#issuecomment-2975350683

   > I'd rather disable this Groovy/Gradle code formatting - sorry for this, 
but this took me ages on battery power today on the train while debugging the other 
Windows build failures on Jenkins. Whenever I changed a single line of Groovy 
code, it printed the code above and sat for 5 minutes waiting, eating my battery 
power.
   
   Ok, I'll add an option to turn it off, locally. I agree - it's slow and 
huge. But it's the only way to verify/apply formatting automatically that I 
know about.  [#14789](https://github.com/apache/lucene/issues/14789)
   
   





[I] Fix jacoco/ coverage plugin not found [lucene]

2025-06-16 Thread via GitHub


dweiss opened a new issue, #14790:
URL: https://github.com/apache/lucene/issues/14790

   ### Description
   
   https://ci-builds.apache.org/job/Lucene/job/Lucene-Coverage-main/1468/
   org.gradle.api.plugins.UnknownPluginException: Plugin with id 
'org.barfuin.gradle.jacocolog' not found.
   
   ### Version and environment details
   
   _No response_





Re: [PR] Make `pack` methods public for `BigIntegerPoint` and `HalfFloatPoint` [lucene]

2025-06-16 Thread via GitHub


jpountz commented on PR #14784:
URL: https://github.com/apache/lucene/pull/14784#issuecomment-2976386308

   I'm good with the change, but I'd put the change in 10.3 instead of 10.2.2 
since it's a new feature rather than a bug fix.





Re: [PR] Introduce getQuantizedVectorValues method in LeafReader to access QuantizedByteVectorValues [lucene]

2025-06-16 Thread via GitHub


benwtrent commented on PR #14792:
URL: https://github.com/apache/lucene/pull/14792#issuecomment-2976430071

   @Pulkitg64 I don't understand how this is part of 
https://github.com/apache/lucene/issues/13158
   
   I would have thought the APIs stay the same. Quantization should be able to 
"rehydrate" the quantized vectors into floating point (or whatever the original 
values were).
   
   So, the segment, depending on what data it has access to, will:
   
- Return the original doc value floating point vectors
- Rehydrate the quantized values.
   
   Either way, users should still be able to call `float[] vectorValue(int 
ord)`. 
   
   I would think there is a sub-class called `QuantizedFloatVectorValues`, that 
satisfies the `FloatVectorValues` interface. 
   
   But maybe we add an `isApproximate()` or an `extractQuantizedValues()` that 
returns null, or the `QuantizedFloatVectorValues` interface.
   
   But it is likely useless for the user to have access to the quantized bytes 
directly as they don't provide much value without knowing how to use them.
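   
   One way to picture the shape described above, as a minimal sketch and not 
Lucene's actual API: the interface `SimpleFloatVectorValues`, the class 
`QuantizedFloatVectorValuesSketch`, and the linear min/max dequantization are 
all hypothetical stand-ins. They only illustrate a caller using 
`vectorValue(int ord)` without knowing whether the floats are raw or rehydrated.

```java
// Hypothetical stand-in for a float-vector view; NOT Lucene's FloatVectorValues.
interface SimpleFloatVectorValues {
  int size();

  float[] vectorValue(int ord);

  // Hypothetical hint: true when floats are reconstructed from quantized bytes.
  default boolean isApproximate() {
    return false;
  }
}

// Serves raw floats exactly as stored.
final class RawFloatVectorValuesSketch implements SimpleFloatVectorValues {
  private final float[][] vectors;

  RawFloatVectorValuesSketch(float[][] vectors) {
    this.vectors = vectors;
  }

  @Override
  public int size() {
    return vectors.length;
  }

  @Override
  public float[] vectorValue(int ord) {
    return vectors[ord].clone();
  }
}

// Rehydrates floats from 7-bit quantized bytes.
// Assumption: a simple linear mapping over [min, max] with max > min.
final class QuantizedFloatVectorValuesSketch implements SimpleFloatVectorValues {
  private static final int MAX_LEVEL = 127;
  private final byte[][] quantized;
  private final float min;
  private final float max;

  QuantizedFloatVectorValuesSketch(byte[][] quantized, float min, float max) {
    this.quantized = quantized;
    this.min = min;
    this.max = max;
  }

  // Maps each component to an integer level in [0, 127], clamping to [min, max].
  static byte[] quantize(float[] vector, float min, float max) {
    byte[] out = new byte[vector.length];
    float scale = MAX_LEVEL / (max - min);
    for (int i = 0; i < vector.length; i++) {
      float clamped = Math.max(min, Math.min(max, vector[i]));
      out[i] = (byte) Math.round((clamped - min) * scale);
    }
    return out;
  }

  @Override
  public int size() {
    return quantized.length;
  }

  // Reverses the linear mapping; the result is an approximation of the original.
  @Override
  public float[] vectorValue(int ord) {
    byte[] q = quantized[ord];
    float[] out = new float[q.length];
    float step = (max - min) / MAX_LEVEL;
    for (int i = 0; i < q.length; i++) {
      out[i] = min + q[i] * step;
    }
    return out;
  }

  @Override
  public boolean isApproximate() {
    return true;
  }
}

final class RehydrateDemo {
  public static void main(String[] args) {
    float[] original = {0.0f, 1.0f, 0.5f};
    byte[] q = QuantizedFloatVectorValuesSketch.quantize(original, 0.0f, 1.0f);
    SimpleFloatVectorValues[] views = {
      new RawFloatVectorValuesSketch(new float[][] {original}),
      new QuantizedFloatVectorValuesSketch(new byte[][] {q}, 0.0f, 1.0f)
    };
    // The calling code is identical for both views.
    for (SimpleFloatVectorValues view : views) {
      System.out.println(
          java.util.Arrays.toString(view.vectorValue(0))
              + " approximate=" + view.isApproximate());
    }
  }
}
```

   In this sketch the caller never branches on the storage format; the 
hypothetical `isApproximate()` flag is there only for callers that care.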





Re: [PR] Add the ability to inverse a Sort [lucene]

2025-06-16 Thread via GitHub


jpountz commented on PR #14775:
URL: https://github.com/apache/lucene/pull/14775#issuecomment-2976435614

   Thinking a bit more about it, adding `searchBefore` sounds like it could 
work. Would you like to give it a try?





Re: [PR] Implement `ConstantScoreScorer#nextDocsAndScores` [lucene]

2025-06-16 Thread via GitHub


gf2121 commented on PR #14772:
URL: https://github.com/apache/lucene/pull/14772#issuecomment-2977389388

   Thanks for the contribution, it's a nice speedup!
   
   > I don't believe that being able to use Arrays#fill helps much, but maybe 
the fact that this change helps reduce polymorphism does?
   
   I wonder if we can benchmark the following implementation to confirm the 
source of speedup?
   ```
   @Override
   public void nextDocsAndScores(int upTo, Bits liveDocs, DocAndFloatFeatureBuffer buffer)
       throws IOException {
     int batchSize = 64;
     buffer.growNoCopy(batchSize);
     int size = 0;
     DocIdSetIterator iterator = iterator();
     for (int doc = iterator.docID(); doc < upTo && size < batchSize; doc = iterator.nextDoc()) {
       if (liveDocs == null || liveDocs.get(doc)) {
         buffer.docs[size] = doc;
         buffer.scores[size] = score;
         ++size;
       }
     }
     buffer.size = size;
   }
   ```
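   
   For context on the `Arrays#fill` remark quoted above, here is a 
self-contained sketch of the two ways a constant score can be written into a 
batch. `ScoreBatch` and both fill methods are hypothetical and do not use 
Lucene's `DocAndFloatFeatureBuffer`. The sketch only shows that the two 
strategies produce the same buffer contents, so any speed difference between 
them would still have to be measured with a real benchmark.

```java
import java.util.Arrays;

// Hypothetical stand-in for a doc/score batch buffer; not a Lucene class.
final class ScoreBatch {
  int[] docs = new int[0];
  float[] scores = new float[0];
  int size;

  // Ensures capacity without preserving old contents.
  void growNoCopy(int minSize) {
    if (docs.length < minSize) {
      docs = new int[minSize];
      scores = new float[minSize];
    }
  }
}

final class ConstantScoreFillSketch {
  // Strategy 1: write the constant score next to each doc ID as it is collected.
  static void fillPerDoc(int[] matchingDocs, float score, ScoreBatch buffer) {
    buffer.growNoCopy(matchingDocs.length);
    int size = 0;
    for (int doc : matchingDocs) {
      buffer.docs[size] = doc;
      buffer.scores[size] = score;
      ++size;
    }
    buffer.size = size;
  }

  // Strategy 2: copy the doc IDs first, then set all scores with one Arrays.fill call.
  static void fillBulk(int[] matchingDocs, float score, ScoreBatch buffer) {
    buffer.growNoCopy(matchingDocs.length);
    System.arraycopy(matchingDocs, 0, buffer.docs, 0, matchingDocs.length);
    Arrays.fill(buffer.scores, 0, matchingDocs.length, score);
    buffer.size = matchingDocs.length;
  }

  public static void main(String[] args) {
    int[] docs = {3, 7, 8, 42};
    ScoreBatch perDoc = new ScoreBatch();
    ScoreBatch bulk = new ScoreBatch();
    fillPerDoc(docs, 1.5f, perDoc);
    fillBulk(docs, 1.5f, bulk);
    System.out.println("same docs:   " + Arrays.equals(perDoc.docs, bulk.docs));
    System.out.println("same scores: " + Arrays.equals(perDoc.scores, bulk.scores));
  }
}
```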
   





Re: [PR] Introduce getQuantizedVectorValues method in LeafReader to access QuantizedByteVectorValues [lucene]

2025-06-16 Thread via GitHub


benwtrent commented on PR #14792:
URL: https://github.com/apache/lucene/pull/14792#issuecomment-2977941605

   @Pulkitg64 
   
   Basically, I don't think callers should "know" directly if they are hitting 
quantized vectors or raw. 
   
   Requiring the user to pick the right thing seems unnecessary when we have 
the appropriate interfaces already. It's all about determining how the 
format itself knows that it's missing the `vec` file.





Re: [I] Integrate a JVector codec for KNN searches [lucene]

2025-06-16 Thread via GitHub


RKSPD commented on issue #14681:
URL: https://github.com/apache/lucene/issues/14681#issuecomment-2977958739

   I have a working JVector implementation for Lucene from a while ago (no 
leaford handling, RandomAccessVectorsWriter, etc.), and I have benchmarks for 
that version. There are issues, such as pre-cached exact vector mismatch between 
runs, that have been addressed in the new update. I'm working on 
incorporating the new changes, but the codebase is moving very quickly. Do you 
have a schedule of what issues are present/JVector updates and whether it's 
safe for me to port over your work from OpenSearch/JVector to Lucene?





Re: [I] Revert back to jgit for collecting git status [lucene]

2025-06-16 Thread via GitHub


rmuir commented on issue #14785:
URL: https://github.com/apache/lucene/issues/14785#issuecomment-2977983045

   My biggest complaint about jgit is that it doesn't truly match git behavior. 
But I already set special options (e.g. `pull.twohead=recursive`) to work 
around bugs in jgit, so I don't care that strongly either way. From a technical 
perspective, native git is superior.





Re: [I] Add an option to turn off gradle/groovy spotless checks/validations [lucene]

2025-06-16 Thread via GitHub


dweiss closed issue #14789: Add an option to turn off gradle/groovy spotless 
checks/validations
URL: https://github.com/apache/lucene/issues/14789





Re: [PR] Allow turning off gradle/groovy script spotless pass using 'lucene.spotlessGradleScripts' build option [lucene]

2025-06-16 Thread via GitHub


dweiss merged PR #14791:
URL: https://github.com/apache/lucene/pull/14791





Re: [PR] .editorconfig [lucene]

2025-06-16 Thread via GitHub


dweiss commented on PR #14740:
URL: https://github.com/apache/lucene/pull/14740#issuecomment-2979065537

   Just FYI - I noticed we already have an editorconfig file in scripts - 
   https://github.com/apache/lucene/blob/main/dev-tools/scripts/.editorconfig
   
   Is this complementary? Could it be moved to the top level as part of this 
patch?





Re: [I] Revert back to jgit for collecting git status [lucene]

2025-06-16 Thread via GitHub


uschindler commented on issue #14785:
URL: https://github.com/apache/lucene/issues/14785#issuecomment-2979121217

   Hi,
   We can keep it as is, all fine.
   
   The "working copy clean" check was faster and better implemented with jgit, 
but for normal stuff like checking out/exporting, using native git is better.
   
   I think the problems appeared on the main branch because, after removing the 
working copy status groovy code, the jgit dependency was no longer resolved and 
therefore other parts of the Gradle build system were no longer able to 
fall back to jgit.
   
   So you can close this.
   
   P.S.: for checking out, Jenkins still uses jgit internally. But I had to 
install native git on the Windows VM.





[PR] Fix assemble source release [lucene]

2025-06-16 Thread via GitHub


dweiss opened a new pull request, #14800:
URL: https://github.com/apache/lucene/pull/14800

   Fixes #14796





Re: [PR] Fix assemble source release [lucene]

2025-06-16 Thread via GitHub


github-actions[bot] commented on PR #14800:
URL: https://github.com/apache/lucene/pull/14800#issuecomment-2979154910

   This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. 
If the PR doesn't need a changelog entry, then add the skip-changelog label to 
it and you will stop receiving this reminder on future updates to the PR.





Re: [PR] Fix assemble source release [lucene]

2025-06-16 Thread via GitHub


dweiss commented on PR #14800:
URL: https://github.com/apache/lucene/pull/14800#issuecomment-2979163597

   I'll go ahead and merge this so I can proceed with other things. If anybody would 
like to change anything here, please comment on the patch and I'll follow up.





Re: [PR] Remove duplicate -Xlint:options flags. [lucene]

2025-06-16 Thread via GitHub


dweiss commented on PR #14788:
URL: https://github.com/apache/lucene/pull/14788#issuecomment-2975931132

   Yes, I think this can be relied upon. I don't think it's mentioned anywhere, 
officially, but I think it's the case.





[PR] Allow turning off gradle/groovy script spotless pass using 'lucene.spotlessGradleScripts' build option [lucene]

2025-06-16 Thread via GitHub


dweiss opened a new pull request, #14791:
URL: https://github.com/apache/lucene/pull/14791

   Fixes #14789. Allows locally skipping the heavy greclipse download and 
costly formatting/validation step. 





Re: [PR] Allow turning off gradle/groovy script spotless pass using 'lucene.spotlessGradleScripts' build option [lucene]

2025-06-16 Thread via GitHub


github-actions[bot] commented on PR #14791:
URL: https://github.com/apache/lucene/pull/14791#issuecomment-2976005719

   This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. 
If the PR doesn't need a changelog entry, then add the skip-changelog label to 
it and you will stop receiving this reminder on future updates to the PR.





Re: [PR] .editorconfig [lucene]

2025-06-16 Thread via GitHub


dsmiley commented on PR #14740:
URL: https://github.com/apache/lucene/pull/14740#issuecomment-2976920502

   > please remove the python completely as we already have a python 
autoformatter and verifier
   
   I think you're missing a point of the value of this.  It's _complementary_ 
with Spotless (or similar).  It helps us write code formatted according to the 
project's standards during the writing/editing process, especially when 
utilizing IDE features that manipulate code (e.g. refactorings, perhaps code 
suggestion completions too).  Consequently, forgetting to run tidy/Spotless can 
be less of an annoyance... pushing more formatting earlier.





Re: [I] org.apache.lucene.search.TestPatienceFloatVectorQuery.testFindAll failed [lucene]

2025-06-16 Thread via GitHub


tteofili commented on issue #14694:
URL: https://github.com/apache/lucene/issues/14694#issuecomment-2976969898

   this is a bit weird, basically we have the correct resulting docs in the 
array, but their sorting order is wrong.
   
   ```
   scoreDocs[2] = {ScoreDoc@4520} "doc=1 score=0.5 shardIndex=-1" <-- <0, 1>
   scoreDocs[1] = {ScoreDoc@4519} "doc=2 score=1.0 shardIndex=-1" <-- <1, 2>
   scoreDocs[0] = {ScoreDoc@4518} "doc=0 score=1.0 shardIndex=-1" <-- <0, 0>
   ```
   
   in fact the query `<0, 0>` should return `<0, 0>` as its nearest neighbor, 
whereas `<0, 0>` and `<1, 2>` tie with a score equal to 1 (and this sounds 
wrong).
   this doesn't seem to depend on the `PatienceKnnVectorQuery`, in fact the 
test doesn't run HNSW search, but exact search (because k >= maxOrd).
   
   by debugging it looks like indexing creates two slices, one with <0, 0> and 
<0, 1> (scored as 1 and 0.5, respectively) and one with <1, 2>, supposedly. 
   this particular seed uses `Lucene99ScalarQuantizedVectorScorer`.
   again, while debugging, it seems that <1, 2>, being in a different slice, is 
quantized as <64, 64> and the query is also quantized as <64, 64>, which seems 
wrong.
   I'll keep digging into this.
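
For reference, the scores one would expect from exact float search can be sketched as follows. This is a minimal sketch that assumes Euclidean similarity scored as 1 / (1 + squared distance), which is consistent with the 1.0 and 0.5 reported above; the class and method names are illustrative and not part of Lucene.

```java
public class EuclideanScoreSketch {
  /** Squared Euclidean distance between two equal-length vectors. */
  public static float squareDistance(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      float d = a[i] - b[i];
      sum += d * d;
    }
    return sum;
  }

  /** Assumed scoring: 1 / (1 + d^2), so identical vectors score exactly 1. */
  public static float score(float[] query, float[] doc) {
    return 1f / (1f + squareDistance(query, doc));
  }

  public static void main(String[] args) {
    float[] query = {0f, 0f};
    float[][] docs = {{0f, 0f}, {0f, 1f}, {1f, 2f}};
    for (float[] doc : docs) {
      System.out.println(java.util.Arrays.toString(doc) + " -> " + score(query, doc));
    }
  }
}
```

Under that assumption `<1, 2>` is at squared distance 5 from the query, so its unquantized score would be 1/6 rather than 1, which is why the tie looks like a quantization artifact.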
   
   





Re: [PR] .editorconfig [lucene]

2025-06-16 Thread via GitHub


rmuir commented on PR #14740:
URL: https://github.com/apache/lucene/pull/14740#issuecomment-2976483811

   please remove the python completely as we already have a python 
autoformatter and verifier (like spotless, except not slow as hell). We don't 
need conflicting editor configuration around it. 





Re: [PR] Fix intermittent failure of TestDoubleValuesSourceRescorer [lucene]

2025-06-16 Thread via GitHub


github-actions[bot] commented on PR #14793:
URL: https://github.com/apache/lucene/pull/14793#issuecomment-2976551678

   This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. 
If the PR doesn't need a changelog entry, then add the skip-changelog label to 
it and you will stop receiving this reminder on future updates to the PR.





Re: [PR] Introduce getQuantizedVectorValues method in LeafReader to access QuantizedByteVectorValues [lucene]

2025-06-16 Thread via GitHub


Pulkitg64 commented on PR #14792:
URL: https://github.com/apache/lucene/pull/14792#issuecomment-2977859146

   We are experimenting with large vector indexes, and since (raw unquantized) 
vectors consume significant disk space (4x more than quantized vectors), we 
want to drop the raw vectors from searcher machines. We are currently using 
vector values for the use cases below:
   1. Calculating the dot-product scores and returning them in search results
   2. Returning the vectors in search results
   3. Vector counting for metrics
   
   For use case 1 we have started to use 
[vectorScorer](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99ScalarQuantizedVectorsReader.java#L444)
 which uses quantized vectors for computing the score, so we are good there. For use 
cases 2 and 3, we currently use floatVectorValues using 
[getFloatVectorValues](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99ScalarQuantizedVectorsReader.java#L189)
 but need to switch to quantizedVectorValues since searchers won't have float 
vectors anymore, and we are okay with accepting the accuracy loss from 
float-to-byte quantization.
   
   To address these use cases, we have two options:
   * Introduce a new API: getQuantizedVectorValues to access 
quantizedByteVector OR
   * Use our local workaround: Make the 
[QuantizedVectorValues](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99ScalarQuantizedVectorsReader.java#L408)
 class and its members public to directly access quantized vectors
   
   I would like to know your thoughts on whether we should create such an API, 
and if you think the above use cases don't justify a new API, what are your 
thoughts on implementing the workaround solution and pushing it upstream?





Re: [PR] Remove duplicate -Xlint:options flags. [lucene]

2025-06-16 Thread via GitHub


dweiss commented on code in PR #14788:
URL: https://github.com/apache/lucene/pull/14788#discussion_r2150729480


##
build-tools/build-infra/src/main/groovy/lucene.java.core.mrjar.gradle:
##
@@ -32,9 +32,14 @@ configure(project(":lucene:core")) {
   tasks.named("compileMain${jdkVersion}Java").configure {
 def apijar = apijars.file("jdk${jdkVersion}.apijar")
 
+// TODO: this depends on the order of argument configuration...
 int releaseIndex = options.compilerArgs.indexOf("--release")
 options.compilerArgs.removeAt(releaseIndex)
 options.compilerArgs.removeAt(releaseIndex)
+
+// Remove conflicting options for the linter. #14782
+options.compilerArgs.removeAll("-Xlint:options")

Review Comment:
   Thank you!






[I] Fix regression in assembleSourceTgz [lucene]

2025-06-16 Thread via GitHub


dweiss opened a new issue, #14796:
URL: https://github.com/apache/lucene/issues/14796

   ### Description
   
   https://ci-builds.apache.org/job/Lucene/job/Lucene-Artifacts-main/1663/
   
   ### Version and environment details
   
   _No response_





[PR] Fix intermittent failure of TestDoubleValuesSourceRescorer [lucene]

2025-06-16 Thread via GitHub


ChrisHegarty opened a new pull request, #14793:
URL: https://github.com/apache/lucene/pull/14793

   This commit fixes an intermittent failure of TestDoubleValuesSourceRescorer, 
where the test sometimes randomly selects too few documents to index leading to 
no matches in the search. Failure example:
   
   ```
   org.apache.lucene.search.TestDoubleValuesSourceRescorer > test suite's 
output saved to 
/../lucene/core/build/test-results/test/outputs/OUTPUT-org.apache.lucene.search.TestDoubleValuesSourceRescorer.txt,
 copied below:
  > java.lang.ArrayIndexOutOfBoundsException: Index 0 out of bounds for 
length 0
  > at 
__randomizedtesting.SeedInfo.seed([86E2E915B8727B0:A3943384845BA19E]:0)
  > at 
org.apache.lucene.search.TestDoubleValuesSourceRescorer.testBasic(TestDoubleValuesSourceRescorer.java:99)
  > at 
java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
  > at java.base/java.lang.reflect.Method.invoke(Method.java:580)
   ...
   ```
   
   I think that bumping the number of docs to index should be sufficient, as I 
see no failures even after several hundred thousand runs.
   
   relates #14776





Re: [PR] Inline the gitinfo plugin and remove dependency on carrotsearch's buildinfra [lucene]

2025-06-16 Thread via GitHub


dweiss merged PR #14794:
URL: https://github.com/apache/lucene/pull/14794





Re: [PR] Add logical build option groups for the allOptions task [lucene]

2025-06-16 Thread via GitHub


dweiss merged PR #14795:
URL: https://github.com/apache/lucene/pull/14795





Re: [PR] Fix intermittent failure of TestDoubleValuesSourceRescorer [lucene]

2025-06-16 Thread via GitHub


vigyasharma merged PR #14793:
URL: https://github.com/apache/lucene/pull/14793





Re: [PR] Allow turning off gradle/groovy script spotless pass using 'lucene.spotlessGradleScripts' build option [lucene]

2025-06-16 Thread via GitHub


github-actions[bot] commented on PR #14791:
URL: https://github.com/apache/lucene/pull/14791#issuecomment-2977323935

   This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. 
If the PR doesn't need a changelog entry, then add the skip-changelog label to 
it and you will stop receiving this reminder on future updates to the PR.





Re: [PR] Remove duplicate -Xlint:options flags. [lucene]

2025-06-16 Thread via GitHub


breskeby commented on code in PR #14788:
URL: https://github.com/apache/lucene/pull/14788#discussion_r2150602765


##
build-tools/build-infra/src/main/groovy/lucene.java.core.mrjar.gradle:
##
@@ -32,9 +32,14 @@ configure(project(":lucene:core")) {
   tasks.named("compileMain${jdkVersion}Java").configure {
 def apijar = apijars.file("jdk${jdkVersion}.apijar")
 
+// TODO: this depends on the order of argument configuration...
 int releaseIndex = options.compilerArgs.indexOf("--release")
 options.compilerArgs.removeAt(releaseIndex)
 options.compilerArgs.removeAt(releaseIndex)
+
+// Remove conflicting options for the linter. #14782
+options.compilerArgs.removeAll("-Xlint:options")

Review Comment:
   I don't have a better idea than you. I can think of a smarter 
ArgumentProvider, but no easy fix.
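
To make the order dependence flagged in the TODO concrete, here is a minimal Java sketch of the same list manipulation with a guard for a missing `--release` flag. The class and method names are hypothetical and this is not the build's actual code, only an analogue of the Groovy snippet in the diff.

```java
import java.util.ArrayList;
import java.util.List;

public class CompilerArgsSketch {
  /** Returns a copy of args without "--release <n>" and without any "-Xlint:options". */
  public static List<String> strip(List<String> args) {
    List<String> out = new ArrayList<>(args);
    int releaseIndex = out.indexOf("--release");
    // Without this guard, a missing flag gives index -1 and the removal throws.
    if (releaseIndex >= 0 && releaseIndex + 1 < out.size()) {
      out.remove(releaseIndex); // the flag itself
      out.remove(releaseIndex); // its value, which has shifted into the same slot
    }
    // Drop every occurrence of the linter flag, wherever it was added.
    out.removeIf("-Xlint:options"::equals);
    return out;
  }

  public static void main(String[] args) {
    List<String> in = List.of("-Xlint:options", "--release", "21", "-Xlint:options", "-Werror");
    System.out.println(strip(in));
  }
}
```

The guard only avoids the exception; it does not help if the flag is appended after this code has run, which is the ordering problem the TODO points at.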






Re: [I] Revert back to jgit for collecting git status [lucene]

2025-06-16 Thread via GitHub


dweiss commented on issue #14785:
URL: https://github.com/apache/lucene/issues/14785#issuecomment-2977896283

   I'd like to note that we use native git for other things within the build - 
assembling a source tgz, fetching repos for hunspell... I can add jgit support 
for this status fetching but maybe we should just add an expectation that 
native git is available and simplify the build this way?
   
   One thing less to worry about. Let me know how strong you feel we need this 
jgit support, Uwe. Not a big problem - I'm just wondering if it's worth it.





[PR] Adjust base knn format assert assertOffHeapByteSize [lucene]

2025-06-16 Thread via GitHub


benwtrent opened a new pull request, #14797:
URL: https://github.com/apache/lucene/pull/14797

   assertOffHeapByteSize makes a ton of assumptions and these don't really work 
well for custom formats. 
   
   Changing to `protected` instead of `static` to allow simple overriding by 
custom format testers.
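
As a rough illustration of why the modifier matters: a static method is bound to the declaring class and cannot be overridden, while a protected instance method can be replaced by a subclass. The sketch below uses hypothetical class and method names, not the actual Lucene test classes.

```java
public class OverrideSketch {
  /** Hypothetical base test case; names are illustrative only. */
  public static class BaseFormatTestCase {
    // Protected instance method: subclasses may replace the check.
    protected String describeOffHeapCheck() {
      return "default assumptions";
    }

    public String run() {
      // Dispatches dynamically to the subclass override, if there is one.
      return describeOffHeapCheck();
    }
  }

  /** Hypothetical tester for a custom format with its own rules. */
  public static class CustomFormatTestCase extends BaseFormatTestCase {
    @Override
    protected String describeOffHeapCheck() {
      return "custom format rules";
    }
  }

  public static void main(String[] args) {
    System.out.println(new BaseFormatTestCase().run());
    System.out.println(new CustomFormatTestCase().run());
  }
}
```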





Re: [PR] Adjust base knn format assert assertOffHeapByteSize [lucene]

2025-06-16 Thread via GitHub


github-actions[bot] commented on PR #14797:
URL: https://github.com/apache/lucene/pull/14797#issuecomment-2977901473

   This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. 
If the PR doesn't need a changelog entry, then add the skip-changelog label to 
it and you will stop receiving this reminder on future updates to the PR.





Re: [PR] Introduce getQuantizedVectorValues method in LeafReader to access QuantizedByteVectorValues [lucene]

2025-06-16 Thread via GitHub


benwtrent commented on PR #14792:
URL: https://github.com/apache/lucene/pull/14792#issuecomment-2977917618

   > Returning the vectors in search results
   
   Why would you need to do this?
   
   Generally, I would assume that any access to the vector would be "Give me 
what I gave you", and the best we can do with quantized vectors is the 
dequantized vector. 
   
   I don't fully understand how serializing a read-only segment that is missing 
files (e.g. missing the "vec" file) would work, but the format should do the 
right thing: see that the file isn't there and provide an approximate view of 
the floating point vectors.
   
   > Vector counting for metrics
   
   I don't understand what this means really. Just counting how many vectors 
there are? This should be doable via the `FloatVectorValues` interface.
   
   > but need to switch to quantizedVectorValues since searchers won't have 
float vectors anymore and we are okay in accepting the accuracy loss from 
float-to-byte quantization.
   
   Again, I think we should do the nice thing, de-quantize the vectors as the 
user asks for them. 
   
   It should fully satisfy the `FloatVectorValues` API, de-quantizing the 
vectors and indicate that the vector returned is an approximation.
   
   Getting access to the raw quantized bytes is basically useless without all 
the other parameters that were used to quantize the vector.
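
To illustrate the point about parameters, here is a minimal sketch of generic min/max scalar quantization. It is not Lucene's actual quantizer; the interval bounds `lo` and `hi`, the bit width and all names are assumptions made for illustration.

```java
public class ScalarQuantizationSketch {
  /** Maps each float in [lo, hi] to an integer level in [0, 2^bits - 1]; bits <= 8. */
  public static byte[] quantize(float[] v, float lo, float hi, int bits) {
    int levels = (1 << bits) - 1;
    float scale = levels / (hi - lo);
    byte[] out = new byte[v.length];
    for (int i = 0; i < v.length; i++) {
      float clamped = Math.max(lo, Math.min(hi, v[i])); // values outside the interval saturate
      out[i] = (byte) Math.round((clamped - lo) * scale);
    }
    return out;
  }

  /** Inverse mapping; only meaningful with the same lo, hi and bits used to quantize. */
  public static float[] dequantize(byte[] q, float lo, float hi, int bits) {
    int levels = (1 << bits) - 1;
    float step = (hi - lo) / levels;
    float[] out = new float[q.length];
    for (int i = 0; i < q.length; i++) {
      out[i] = lo + (q[i] & 0xFF) * step; // mask so 8-bit levels above 127 stay positive
    }
    return out;
  }

  public static void main(String[] args) {
    byte[] q = quantize(new float[] {0f, 0.5f, 1f}, 0f, 1f, 7);
    System.out.println(java.util.Arrays.toString(dequantize(q, 0f, 1f, 7)));
    // The same bytes decoded with a different interval give different floats.
    System.out.println(java.util.Arrays.toString(dequantize(q, -1f, 1f, 7)));
  }
}
```

The same bytes decode to different floats under a different interval, so a caller holding only the bytes cannot recover an approximation of the original vector.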





Re: [PR] Make `pack` methods public for `BigIntegerPoint` and `HalfFloatPoint` [lucene]

2025-06-16 Thread via GitHub


prudhvigodithi commented on PR #14784:
URL: https://github.com/apache/lucene/pull/14784#issuecomment-2977045873

   Thanks Adrien, updated the CHANGES.txt moving to 10.3.0.
   @getsaurabh02





Re: [PR] .editorconfig [lucene]

2025-06-16 Thread via GitHub


rmuir commented on code in PR #14740:
URL: https://github.com/apache/lucene/pull/14740#discussion_r2149895516


##
.editorconfig:
##
@@ -0,0 +1,918 @@
+# EditorConfig: https://editorconfig.org
+# for consistent code style configuration across editors/IDEs
+
+# top-most EditorConfig file
+root = true
+
+[*]
+charset = utf-8
+end_of_line = lf # matches spotless.gradle insistence on this
+indent_size = 2
+indent_style = space
+insert_final_newline = true
+max_line_length = 100
+trim_trailing_whitespace = true
+ij_continuation_indent_size = 4
+ij_formatter_off_tag = @formatter:off
+ij_formatter_on_tag = @formatter:on
+ij_formatter_tags_enabled = true
+ij_smart_tabs = false
+ij_visual_guides =
+ij_wrap_on_typing = false
+
+[Makefile]
+indent_size = 4
+indent_style = tab
+
+[*.java]
+ij_java_align_consecutive_assignments = false
+ij_java_align_consecutive_variable_declarations = false
+ij_java_align_group_field_declarations = false
+ij_java_align_multiline_annotation_parameters = false
+ij_java_align_multiline_array_initializer_expression = false
+ij_java_align_multiline_assignment = false
+ij_java_align_multiline_binary_operation = false
+ij_java_align_multiline_chained_methods = false
+ij_java_align_multiline_deconstruction_list_components = true
+ij_java_align_multiline_extends_list = false
+ij_java_align_multiline_for = false
+ij_java_align_multiline_method_parentheses = false
+ij_java_align_multiline_parameters = false
+ij_java_align_multiline_parameters_in_calls = false
+ij_java_align_multiline_parenthesized_expression = false
+ij_java_align_multiline_records = false
+ij_java_align_multiline_resources = false
+ij_java_align_multiline_ternary_operation = false
+ij_java_align_multiline_text_blocks = false
+ij_java_align_multiline_throws_list = false
+ij_java_align_subsequent_simple_methods = false
+ij_java_align_throws_keyword = false
+ij_java_align_types_in_multi_catch = true
+ij_java_annotation_new_line_in_record_component = false
+ij_java_annotation_parameter_wrap = off
+ij_java_array_initializer_new_line_after_left_brace = true
+ij_java_array_initializer_right_brace_on_new_line = false
+ij_java_array_initializer_wrap = on_every_item
+ij_java_assert_statement_colon_on_next_line = true
+ij_java_assert_statement_wrap = normal
+ij_java_assignment_wrap = normal
+ij_java_binary_operation_sign_on_next_line = true
+ij_java_binary_operation_wrap = normal
+ij_java_blank_lines_after_anonymous_class_header = 0
+ij_java_blank_lines_after_class_header = 0
+ij_java_blank_lines_after_imports = 1
+ij_java_blank_lines_after_package = 1
+ij_java_blank_lines_around_class = 1
+ij_java_blank_lines_around_field = 0
+ij_java_blank_lines_around_field_in_interface = 0
+ij_java_blank_lines_around_field_with_annotations = 0
+ij_java_blank_lines_around_initializer = 1
+ij_java_blank_lines_around_method = 1
+ij_java_blank_lines_around_method_in_interface = 1
+ij_java_blank_lines_before_class_end = 0
+ij_java_blank_lines_before_imports = 1
+ij_java_blank_lines_before_method_body = 0
+ij_java_blank_lines_before_package = 0
+ij_java_blank_lines_between_record_components = 0
+ij_java_block_brace_style = end_of_line
+ij_java_block_comment_add_space = false
+ij_java_block_comment_at_first_column = true
+ij_java_builder_methods =
+ij_java_call_parameters_new_line_after_left_paren = true
+ij_java_call_parameters_right_paren_on_new_line = false
+ij_java_call_parameters_wrap = on_every_item
+ij_java_case_statement_on_separate_line = true
+ij_java_catch_on_new_line = false
+ij_java_class_annotation_wrap = split_into_lines
+ij_java_class_brace_style = end_of_line
+ij_java_class_count_to_use_import_on_demand = 999
+ij_java_class_names_in_javadoc = 1
+ij_java_deconstruction_list_wrap = normal
+ij_java_do_not_indent_top_level_class_members = false
+ij_java_do_not_wrap_after_single_annotation = false
+ij_java_do_not_wrap_after_single_annotation_in_parameter = false
+ij_java_do_while_brace_force = never
+ij_java_doc_add_blank_line_after_description = true
+ij_java_doc_add_blank_line_after_param_comments = false
+ij_java_doc_add_blank_line_after_return = false
+ij_java_doc_add_p_tag_on_empty_lines = true
+ij_java_doc_align_exception_comments = false
+ij_java_doc_align_param_comments = false
+ij_java_doc_do_not_wrap_if_one_line = true
+ij_java_doc_enable_formatting = true
+ij_java_doc_enable_leading_asterisks = true
+ij_java_doc_indent_on_continuation = true
+ij_java_doc_keep_empty_lines = true
+ij_java_doc_keep_empty_parameter_tag = true
+ij_java_doc_keep_empty_return_tag = true
+ij_java_doc_keep_empty_throws_tag = true
+ij_java_doc_keep_invalid_tags = true
+ij_java_doc_param_description_on_new_line = false
+ij_java_doc_preserve_line_breaks = false
+ij_java_doc_use_throws_not_exception_tag = true
+ij_java_else_on_new_line = false
+ij_java_entity_dd_prefix =
+ij_java_entity_dd_suffix = EJB
+ij_java_entity_eb_prefix =
+ij_java_entity_eb_suffix = Bean
+ij_java_entity_hi_prefix =
+ij_java_entity_hi_suffix = Home
+ij_java_entity_lhi_prefi

[PR] Inline the gitinfo plugin and remove dependency on carrotsearch's buildinfra [lucene]

2025-06-16 Thread via GitHub


dweiss opened a new pull request, #14794:
URL: https://github.com/apache/lucene/pull/14794

   Use direct dependencies on dependencychecks plugins and build options. 
Remove some leftover files.





Re: [PR] Inline the gitinfo plugin and remove dependency on carrotsearch's buildinfra [lucene]

2025-06-16 Thread via GitHub


github-actions[bot] commented on PR #14794:
URL: https://github.com/apache/lucene/pull/14794#issuecomment-2976818462

   This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. 
If the PR doesn't need a changelog entry, then add the skip-changelog label to 
it and you will stop receiving this reminder on future updates to the PR.





Re: [PR] Add logical build option groups for the allOptions task [lucene]

2025-06-16 Thread via GitHub


github-actions[bot] commented on PR #14795:
URL: https://github.com/apache/lucene/pull/14795#issuecomment-2977171162

   This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. 
If the PR doesn't need a changelog entry, then add the skip-changelog label to 
it and you will stop receiving this reminder on future updates to the PR.





[PR] Add logical build option groups for the allOptions task [lucene]

2025-06-16 Thread via GitHub


dweiss opened a new pull request, #14795:
URL: https://github.com/apache/lucene/pull/14795

   This adds logical (well) build option groups for the 'allOptions' task. 
Looks like this:
   
   
![image](https://github.com/user-attachments/assets/d32873be-ae04-430a-a00a-ac36bf54)
   





Re: [PR] Introduce getQuantizedVectorValues method in LeafReader to access QuantizedByteVectorValues [lucene]

2025-06-16 Thread via GitHub


benwtrent commented on PR #14792:
URL: https://github.com/apache/lucene/pull/14792#issuecomment-2977492907

   > As for the usefulness of accessing quantized bytes directly - we have 
specific use cases, such as returning the vectors themselves when requested in 
a query.
   
   I would assume the caller would want something akin to the `float` values. 
What would a caller be expected to do with the quantized bytes directly?
   
   
   I am saying that returning the quantized bytes, without knowing all the 
other information (quantization technique, the technique's parameters, etc.) is 
pretty useless. 





[PR] Correct python release scripts for the new location of base version [lucene]

2025-06-16 Thread via GitHub


dweiss opened a new pull request, #14798:
URL: https://github.com/apache/lucene/pull/14798

   #14786
   
   ### Description
   
   
   





Re: [PR] Correct python release scripts for the new location of base version [lucene]

2025-06-16 Thread via GitHub


dweiss commented on PR #14798:
URL: https://github.com/apache/lucene/pull/14798#issuecomment-2978159953

   Thanks. I think I'll go back to the regexp-scan instead of using the 
javaproperties module. I don't think we import the requirements in github 
workflow that runs the tests... Will return to it tomorrow, I'm done for today.





[PR] python: add autofix hint if linter fails [lucene]

2025-06-16 Thread via GitHub


rmuir opened a new pull request, #14799:
URL: https://github.com/apache/lucene/pull/14799

   if the linter fails, often many of the problems can be safely autofixed. 
That's because many of the rules are opinionated / conventional / cosmetic and 
can be annoying to deal with manually.
   
   





Re: [PR] Correct python release scripts for the new location of base version [lucene]

2025-06-16 Thread via GitHub


github-actions[bot] commented on PR #14798:
URL: https://github.com/apache/lucene/pull/14798#issuecomment-2978119425

   This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. 
If the PR doesn't need a changelog entry, then add the skip-changelog label to 
it and you will stop receiving this reminder on future updates to the PR.





Re: [PR] Correct python release scripts for the new location of base version [lucene]

2025-06-16 Thread via GitHub


rmuir commented on PR #14798:
URL: https://github.com/apache/lucene/pull/14798#issuecomment-2978151874

   Try a `make reformat`. Not sure what editor you use, but e.g. for VS Code, 
if you install the ruff and basedpyright extensions and enable format-on-save 
and organize-imports-on-save, you will pretty much never need to deal with 
`make`. Let me know; I can try to document this or add e.g. a VS Code settings 
file, or whatever helps.





Re: [PR] Correct python release scripts for the new location of base version [lucene]

2025-06-16 Thread via GitHub


rmuir commented on PR #14798:
URL: https://github.com/apache/lucene/pull/14798#issuecomment-2978163695

   For now you can try `make autofix`, which is like `make reformat` but will 
also (safely) fix any linter issues it can.





Re: [PR] Correct python release scripts for the new location of base version [lucene]

2025-06-16 Thread via GitHub


rmuir commented on PR #14798:
URL: https://github.com/apache/lucene/pull/14798#issuecomment-2978184994

   I opened #14799 to improve the messaging when `make lint` fails.





Re: [I] Fix regression in assembleSourceTgz [lucene]

2025-06-16 Thread via GitHub


dweiss commented on issue #14796:
URL: https://github.com/apache/lucene/issues/14796#issuecomment-2978198994

   Smoke tester also fails with this - which is probably related:
   ```
   
   GIT rev: 07013dcf0e9d4f927888c1dd7bfee3753c0f8750
   
   
   2025-06-16 21:21:27.322717: RUN: ./gradlew --stacktrace --no-daemon assembleRelease -Dversion.release=11.0.0 -Pvalidation.git.failOnModified=false
   Downloading gradle-wrapper.jar from https://raw.githubusercontent.com/gradle/gradle/v8.14.0/gradle/wrapper/gradle-wrapper.jar
   Generating gradle.properties
   Downloading https://services.gradle.org/distributions/gradle-8.14-bin.zip
   
.10%.20%.30%.40%.50%.60%.70%.80%.90%..100%
   
   Welcome to Gradle 8.14!
   
   Here are the highlights of this release:
- Java 24 support
- GraalVM Native Image toolchain selection
- Enhancements to test reporting
- Build Authoring improvements
   
   For more details see https://docs.gradle.org/8.14/release-notes.html
   
   To honour the JVM settings for this build a single-use Daemon process will be forked. For more on this, please refer to https://docs.gradle.org/8.14/userguide/gradle_daemon.html#sec:disabling_the_daemon in the Gradle documentation.
   Daemon will be stopped at the end of the build 
   > Task :build-infra:extractPluginRequests
   > Task :build-infra:generatePluginAdapters
   
   > Task :build-infra:compileJava
   Note: /home/runner/work/lucene/lucene/build-tools/build-infra/src/main/java/org/apache/lucene/gradle/Checksum.java uses or overrides a deprecated API.
   Note: Recompile with -Xlint:deprecation for details.
   
   > Task :build-infra:compileGroovy NO-SOURCE
   > Task :build-infra:compileGroovyPlugins
   > Task :build-infra:pluginDescriptors
   > Task :build-infra:processResources
   > Task :build-infra:classes
   > Task :build-infra:jar
   
   [Incubating] Problems report is available at: file:///home/runner/work/lucene/lucene/build/reports/problems/problems-report.html
   
   FAILURE: Build failed with an exception.
   
   * Where:
   Build file '/home/runner/work/lucene/lucene/build.gradle' line: 68
   
   * What went wrong:
   An exception occurred applying plugin request [id: 'lucene.documentation']
   > Failed to apply plugin 'lucene.documentation'.
      > Failed to query the value of property 'value'.
         > Failed to query the value of property 'defaultValue'.
            > class org.codehaus.groovy.runtime.GStringImpl cannot be cast to class java.lang.String (org.codehaus.groovy.runtime.GStringImpl is in unnamed module of loader org.gradle.internal.classloader.VisitableURLClassLoader @19e1023e; java.lang.String is in module java.base of loader 'bootstrap')
   
   * Try:
   > Run with --info or --debug option to get more log output.
   > Get more help at https://help.gradle.org.
   
   * Exception is:
   org.gradle.api.plugins.InvalidPluginException: An exception occurred applying plugin request [id: 'lucene.documentation']
       at org.gradle.plugin.use.internal.DefaultPluginRequestApplicator.exceptionOccurred(DefaultPluginRequestApplicator.java:183)
       at org.gradle.plugin.use.internal.DefaultPluginRequestApplicator.access$400(DefaultPluginRequestApplicator.java:54)
       at org.gradle.plugin.use.internal.DefaultPluginRequestApplicator$ApplyAction.apply(DefaultPluginRequestApplicator.java:164)
       at org.gradle.plugin.use.internal.DefaultPluginRequestApplicator.lambda$applyPlugins$1(DefaultPluginRequestApplicator.java:134)
       at org.gradle.plugin.use.internal.DefaultPluginRequestApplicator.applyPlugins(DefaultPluginRequestApplicator.java:134)
       at org.gradle.configuration.DefaultScriptPluginFactory$ScriptPluginImpl.apply(DefaultScriptPluginFactory.java:123)
       at org.gradle.configuration.BuildOperationScriptPlugin$1.run(BuildOperationScriptPlugin.java:68)
       at org.gradle.internal.operations.DefaultBuildOperationRunner$1.execute(DefaultBuildOperationRunner.java:30)
       at org.gradle.internal.operations.DefaultBuildOperationRunner$1.execute(DefaultBuildOperationRunner.java:27)
       at org.gradle.internal.operations.DefaultBuildOperationRunner$2.execute(DefaultBuildOperationRunner.java:67)
       at org.gradle.internal.operations.DefaultBuildOperationRunner$2.execute(DefaultBuildOperationRunner.java:60)
       at org.gradle.internal.operations.DefaultBuildOperationRunner.execute(DefaultBuildOperationRunner.java:167)
       at org.gradle.internal.operations.DefaultBuildOperationRunner.execute(DefaultBuildOperationRunner.java:60)
       at org.gradle.internal.operations.DefaultBuildOperationRunner.run(DefaultBuildOperationRunner.java:48)
       at org.gradle.configuration.BuildOperationScriptPlugin.lambda$apply$0(BuildOperationScriptPlugin.java:65)
       at org.gradle.internal.code.DefaultUserCodeApplicationContext.apply(Defaul

Re: [I] org.apache.lucene.search.TestPatienceFloatVectorQuery.testFindAll failed [lucene]

2025-06-16 Thread via GitHub


benwtrent commented on issue #14694:
URL: https://github.com/apache/lucene/issues/14694#issuecomment-2977130359

   @tteofili it depends on how many vectors are in the segment. Given the 
`Lucene99ScalarQuantizedVectorScorer` scorer, I have seen issues where uniform 
vectors cause strange scoring behavior. This is because min/max end up being 
effectively equal, no matter the confidence interval.
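
To make the min/max remark concrete, here is a toy scalar quantizer, written as a sketch under stated assumptions. It is not Lucene's implementation: the function names, the 7-bit width, and the plain min/max range (no confidence-interval quantiles) are all chosen for illustration only.

```python
def min_max(vectors):
    """Global minimum and maximum over all components of all vectors."""
    flat = [x for vector in vectors for x in vector]
    return min(flat), max(flat)


def quantize(vector, lo, hi, bits=7):
    """Map each component linearly from [lo, hi] onto the integers 0..2**bits - 1."""
    levels = (1 << bits) - 1
    span = hi - lo
    if span == 0.0:
        # Degenerate range: every component collapses into the same bucket.
        return [0] * len(vector)
    scale = levels / span
    return [round((min(max(x, lo), hi) - lo) * scale) for x in vector]


# Perfectly uniform data: lo == hi, so all quantized values are identical.
lo, hi = min_max([[0.5, 0.5], [0.5, 0.5]])
uniform = quantize([0.5, 0.5], lo, hi)

# Almost uniform data: a difference of 1e-7 is stretched over the full range.
lo, hi = min_max([[0.5, 0.5], [0.5, 0.5000001]])
stretched = quantize([0.5, 0.5000001], lo, hi)
```

In this toy model, when the range collapses, either all information is lost or negligible noise is amplified to the full quantized scale. That is one plausible way for scores to look strange on uniform vectors; the comment above does not confirm that this is the exact mechanism in Lucene.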





Re: [PR] Introduce getQuantizedVectorValues method in LeafReader to access QuantizedByteVectorValues [lucene]

2025-06-16 Thread via GitHub


Pulkitg64 commented on PR #14792:
URL: https://github.com/apache/lucene/pull/14792#issuecomment-2977454253

   Thanks @benwtrent for the quick review and comments.
   
   Regarding your comment about how this relates to issue #13158 - I agree in a 
way this PR doesn't directly help create a "read-only" index as mentioned in 
the issue. Let me clarify the motivation:
   
   This PR addresses a scenario where:
   * Raw (unquantized) vectors are removed from the index since they aren't 
needed for searching
   * The architecture has searcher and writer running on separate machines
   
   Currently, there's no way to directly access quantized vectors - we can only 
access raw vectors. But if raw vectors are dropped, this causes errors. This PR 
adds methods to access ByteQuantizedVectors in such cases.
   
   As for the usefulness of accessing quantized bytes directly - we have 
specific use cases, such as returning the vectors themselves when requested in 
a query.
   
   Please let me know your thoughts.
   
   Regarding accessing quantized vectors directly - we could also consider 
using the 
[QuantizedVectorValues](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99ScalarQuantizedVectorsReader.java#L408)
 class, which is currently returned by the 
[getFloatVectorValues](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99ScalarQuantizedVectorsReader.java#L189)
 method. While this class wraps both raw and quantized vectors, its members are 
private, preventing direct access to the quantized vectors like we're doing in 
this PR.
   
   Would it make more sense to make the relevant members public in 
QuantizedVectorValues rather than adding getQuantizedVectorValues to LeafReader?





Re: [I] spotlessGradleScripts doesn't work with whitespace-paths on Windows [lucene]

2025-06-16 Thread via GitHub


uschindler commented on issue #14787:
URL: https://github.com/apache/lucene/issues/14787#issuecomment-2978315117

   It is working, but spams the log with the above stack traces. I tested it by 
modifying a Gradle file. It was reformatted successfully.





Re: [I] Revert back to jgit for collecting git status [lucene]

2025-06-16 Thread via GitHub


dsmiley commented on issue #14785:
URL: https://github.com/apache/lucene/issues/14785#issuecomment-2978673604

   Why revert back to jgit?





Re: [PR] python: add autofix hint if linter fails [lucene]

2025-06-16 Thread via GitHub


rmuir merged PR #14799:
URL: https://github.com/apache/lucene/pull/14799





Re: [I] A multi-tenant ConcurrentMergeScheduler [lucene]

2025-06-16 Thread via GitHub


yaser-aj commented on issue #13883:
URL: https://github.com/apache/lucene/issues/13883#issuecomment-2978379030

   @lukewilner, @atharvkashyap, @N624-debu, and I are students from Carnegie 
Mellon University, and we’ll be working on this issue as part of a mentored 
summer course focused on collaboration in open-source software. Our mentors are 
@mikemccand and @vigyasharma. We’ll be drafting a plan and submitting PRs over 
the next few weeks. Looking forward to collaborating!
   
   **Our understanding of the problem:**
   
   Every `IndexWriter` within a running JVM instantiates its own 
`ConcurrentMergeScheduler` object that, based on the selected `MergePolicy`, 
uses available resources to merge segments into a single `Merge` object. The 
problem is that when there are multiple `IndexWriter` objects, separate 
`ConcurrentMergeScheduler` objects are created, and all of them blindly use 
the compute resources available to the running JVM, without regard to each 
other. This causes excessive resource usage (RAM, CPU cores, and I/O), well 
beyond what the user has allocated for merging.
   
   There has to be one `MultiTenantConcurrentMergeScheduler` object that 
coordinates how all `ConcurrentMergeScheduler` objects operate and divides 
resources wisely across them. It should handle the addition and removal of 
`ConcurrentMergeScheduler` objects on the fly, ideally without needing to 
restart all of them every time their number changes.
   
   **Thinking out loud:**
   
   Maybe we can use 
[setMaxMergesAndThreads](https://javadoc.io/static/org.apache.lucene/lucene-core/10.2.1/org/apache/lucene/index/ConcurrentMergeScheduler.html#setMaxMergesAndThreads(int,int))
 inside the singleton `MultiTenantConcurrentMergeScheduler` object while merges 
are happening across all `ConcurrentMergeScheduler` objects. This update can 
happen whenever a `ConcurrentMergeScheduler` is added or removed. It should 
wisely divide the allocated resources across all active 
`ConcurrentMergeScheduler` objects, giving more merge threads to needy 
`ConcurrentMergeScheduler` objects and fewer or no threads to idle ones. We 
have to come up with an efficient way to decide how to distribute threads based 
on (1) the continuously changing needs of each `ConcurrentMergeScheduler` 
object and (2) the number of active `ConcurrentMergeScheduler` objects.
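
One possible shape for the thread-distribution step, as a language-agnostic sketch written in Python. Everything here is hypothetical: the function name, the use of pending merge counts as the "need" signal, and the highest-averages rule are assumptions for illustration, not part of Lucene or of the plan above.

```python
def divide_threads(total_threads, pending):
    """Split a global merge-thread budget across schedulers.

    pending maps a scheduler id to its number of pending merges (the assumed
    demand signal). Idle schedulers get zero threads, and no scheduler gets
    more threads than it has pending merges.
    """
    shares = {name: 0 for name in pending}
    for _ in range(max(total_threads, 0)):
        # Only schedulers with demand not yet covered by a thread are eligible.
        candidates = [name for name in pending if pending[name] > shares[name]]
        if not candidates:
            break
        # Highest-averages rule: the next thread goes where demand per
        # already-assigned thread is largest.
        best = max(candidates, key=lambda name: pending[name] / (shares[name] + 1))
        shares[best] += 1
    return shares


budget = divide_threads(4, {"writer-a": 3, "writer-b": 3, "writer-c": 0})
```

The result could be recomputed whenever a scheduler is added or removed or its backlog changes, and then applied per scheduler, for example through the `setMaxMergesAndThreads` call mentioned above.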





Re: [PR] Implement `ConstantScoreScorer#nextDocsAndScores` [lucene]

2025-06-16 Thread via GitHub


HUSTERGS commented on PR #14772:
URL: https://github.com/apache/lucene/pull/14772#issuecomment-2978687251

   @gf2121 Thanks for your suggestion!
   > I wonder if we can benchmark the following implementation to confirm the 
source of speedup?
   
   I've run the benchmark according to your suggestion; here is the result 
under an identical setup:
   ```
                         Task  QPS baseline  StdDev  QPS my_modified_version  StdDev              Pct diff  p-value
           CombinedOrHighHigh          6.43  (3.1%)                     6.33  (3.5%)   -1.6% ( -7% -   5%)    0.136
                    CountTerm       6286.39  (5.1%)                  6202.67  (5.0%)   -1.3% (-10% -   9%)    0.403
                 SloppyPhrase          1.48  (6.1%)                     1.46  (6.1%)   -1.3% (-12% -  11%)    0.504
                       IntSet        391.57  (4.3%)                   386.72  (3.6%)   -1.2% ( -8% -   6%)    0.324
                   TermDTSort        191.55  (4.3%)                   189.30  (4.1%)   -1.2% ( -9% -   7%)    0.374
                      Respell         43.91  (3.4%)                    43.40  (2.6%)   -1.2% ( -6% -   4%)    0.224
                 CombinedTerm         13.53  (2.9%)                    13.38  (4.6%)   -1.2% ( -8% -   6%)    0.341
            CombinedOrHighMed         24.60  (4.4%)                    24.37  (5.7%)   -0.9% (-10% -   9%)    0.572
                      Prefix3         96.37  (3.2%)                    95.64  (3.6%)   -0.8% ( -7% -   6%)    0.481
                       Fuzzy2         44.58  (3.3%)                    44.26  (3.9%)   -0.7% ( -7% -   6%)    0.533
              FilteredPrefix3         89.75  (3.5%)                    89.18  (3.5%)   -0.6% ( -7% -   6%)    0.567
                   DismaxTerm        597.24  (4.8%)                   593.46  (4.1%)   -0.6% ( -9% -   8%)    0.653
                       IntNRQ         48.68  (2.1%)                    48.38  (1.9%)   -0.6% ( -4% -   3%)    0.334
               FilteredIntNRQ         48.35  (1.9%)                    48.08  (1.9%)   -0.6% ( -4% -   3%)    0.358
                TermMonthSort       2333.01  (2.7%)                  2321.17  (3.3%)   -0.5% ( -6% -   5%)    0.597
                  AndHighHigh         25.77  (9.5%)                    25.64  (9.0%)   -0.5% (-17% -  19%)    0.868
                       Fuzzy1         49.14  (3.4%)                    48.95  (3.9%)   -0.4% ( -7% -   7%)    0.730
                         Term        553.88  (7.3%)                   551.72  (6.4%)   -0.4% (-13% -  14%)    0.857
                   OrHighRare        111.67  (7.0%)                   111.30  (6.9%)   -0.3% (-13% -  14%)    0.879
   FilteredOr2Terms2StopWords         63.72  (2.6%)                    63.51  (2.9%)   -0.3% ( -5% -   5%)    0.709
                  CountOrMany          5.91  (3.2%)                     5.89  (3.5%)   -0.3% ( -6% -   6%)    0.765
                      TermB1M        553.86  (7.2%)                   552.13  (6.4%)   -0.3% (-12% -  14%)    0.885
             DismaxOrHighHigh         43.53  (5.5%)                    43.40  (5.5%)   -0.3% (-10% -  11%)    0.861
                   AndHighMed         64.98  (7.8%)                    64.78  (8.1%)   -0.3% (-14% -  16%)    0.904
              DismaxOrHighMed         62.12  (4.6%)                    61.94  (5.5%)   -0.3% ( -9% -  10%)    0.850
                       OrMany          5.32  (6.5%)                     5.30  (8.1%)   -0.3% (-13% -  15%)    0.902
           FilteredOrHighHigh         17.39  (1.8%)                    17.35  (2.0%)   -0.2% ( -3% -   3%)    0.692
                TermTitleSort         62.78  (6.8%)                    62.63  (6.3%)   -0.2% (-12% -  13%)    0.908
            FilteredAnd3Terms        126.50  (2.5%)                   126.23  (3.0%)   -0.2% ( -5% -   5%)    0.813
          FilteredOrStopWords         10.63  (1.7%)                    10.61  (2.3%)   -0.2% ( -4% -   3%)    0.764
                    TermB1M1P        552.80  (7.2%)                   551.81  (6.4%)   -0.2% (-12% -  14%)    0.934
             CountAndHighHigh         60.70  (1.4%)                    60.60  (2.5%)   -0.2% ( -4% -   3%)    0.797
             FilteredOr3Terms         57.09  (1.9%)                    57.00  (2.2%)   -0.2% ( -4% -   3%)    0.809
                      Term10K        553.61  (7.4%)                   552.84  (6.5%)   -0.1% (-13% -  14%)    0.950
                   OrHighHigh         24.40  (9.2%)                    24.36  (8.5%)   -0.1% (-16% -  19%)    0.961
          CountFilteredPhrase         11.39  (3.0%)                    11.38  (3.7%)   -0.1% ( -6% -   6%)    0.942
                 FilteredTerm         83.08  (2.2%)                    83.02  (2.7%)   -0.1% ( -4% -   4%)    0.922
            FilteredOrHighMed         51.44  (2.4%)                    51.42  (2.7%)   -0.0% ( -5% -   5%)    0.957
            TermDayOfYearSort        340.99  (2.7%)                   340.87  (2.8%)   -0.0% ( -5% -   5%)    0.967
                     SpanNear          3.07  (4.1%

Re: [PR] python: add autofix hint if linter fails [lucene]

2025-06-16 Thread via GitHub


github-actions[bot] commented on PR #14799:
URL: https://github.com/apache/lucene/pull/14799#issuecomment-2978185513

   This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. 
If the PR doesn't need a changelog entry, then add the skip-changelog label to 
it and you will stop receiving this reminder on future updates to the PR.





Re: [PR] Merge PostingsEnum and ImpactsEnum. [lucene]

2025-06-16 Thread via GitHub


github-actions[bot] commented on PR #14716:
URL: https://github.com/apache/lucene/pull/14716#issuecomment-2978548505

   This PR has not had activity in the past 2 weeks, labeling it as stale. If 
the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you 
for your contribution!





Re: [I] Revert back to jgit for collecting git status [lucene]

2025-06-16 Thread via GitHub


dweiss commented on issue #14785:
URL: https://github.com/apache/lucene/issues/14785#issuecomment-2979061860

   @uschindler asked for it - simpler to manage on Windows VM boxes, I guess. I 
think we can make it an option to use native git or jgit... but I'd rather 
simplify than complicate, so I'll wait for Uwe to respond on whether he cares 
that much (since he already installed git on Policeman Jenkins).

