[PR] Utilize `docIdRunEnd` on `ReqExclBulkScorer` [lucene]

2025-06-18 Thread via GitHub


HUSTERGS opened a new pull request, #14806:
URL: https://github.com/apache/lucene/pull/14806

   ### Description
   This PR propose to utilize `docIdRunEnd` on `ReqExclBulkScorer`, so we can 
jump faster on `MUST_NOT` clause
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Utilize `docIdRunEnd` on `ReqExclBulkScorer` [lucene]

2025-06-18 Thread via GitHub


github-actions[bot] commented on PR #14806:
URL: https://github.com/apache/lucene/pull/14806#issuecomment-2982992815

   This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. 
If the PR doesn't need a changelog entry, then add the skip-changelog label to 
it and you will stop receiving this reminder on future updates to the PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


dweiss commented on PR #14804:
URL: https://github.com/apache/lucene/pull/14804#issuecomment-2983643568

   The reason it's never up to date with a custom formatting step:
   https://github.com/diffplug/spotless/issues/2516


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


dweiss commented on PR #14804:
URL: https://github.com/apache/lucene/pull/14804#issuecomment-2983964771

   It is wicked fast indeed (uses all cores).
   ```
   dweiss@dweiss-beast:~/work/apache/lucene$ time ast-grep scan  . -r 
/home/dweiss/tmp/no-wildcard-imports.sg.yaml
   ...
   real 0m0.738s
   user 0m6.995s
   sys  0m0.269s
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Utilize `docIdRunEnd` on `ReqExclBulkScorer` [lucene]

2025-06-18 Thread via GitHub


gf2121 commented on code in PR #14806:
URL: https://github.com/apache/lucene/pull/14806#discussion_r2154439718


##
lucene/core/src/java/org/apache/lucene/search/ReqExclBulkScorer.java:
##
@@ -57,7 +57,10 @@ public int score(LeafCollector collector, Bits acceptDocs, 
int min, int max) thr
 exclDoc = exclApproximation.advance(upTo);
   }
   if (exclDoc == upTo) {
-if (exclTwoPhase == null || exclTwoPhase.matches()) {
+if (exclTwoPhase == null) {
+  // from upTo to docIdRunEnd() are excluded, so we scored up to 
docIdRunEnd()
+  upTo = exclApproximation.docIDRunEnd();
+} else if (exclTwoPhase.matches()) {

Review Comment:
   Is `upTo = Math.max(upTo + 1, exclTwoPhase.docIdRunEnd())` correct here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Utilize `docIdRunEnd` on `ReqExclBulkScorer` [lucene]

2025-06-18 Thread via GitHub


gf2121 commented on code in PR #14806:
URL: https://github.com/apache/lucene/pull/14806#discussion_r2154454156


##
lucene/core/src/java/org/apache/lucene/search/ReqExclBulkScorer.java:
##
@@ -57,7 +57,10 @@ public int score(LeafCollector collector, Bits acceptDocs, 
int min, int max) thr
 exclDoc = exclApproximation.advance(upTo);
   }
   if (exclDoc == upTo) {
-if (exclTwoPhase == null || exclTwoPhase.matches()) {
+if (exclTwoPhase == null) {
+  // from upTo to docIdRunEnd() are excluded, so we scored up to 
docIdRunEnd()
+  upTo = exclApproximation.docIDRunEnd();

Review Comment:
   Do we need to check `exclApproximation.docID() != 
DocIdSetIterator#NO_MORE_DOCS`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Make `pack` methods public for `BigIntegerPoint` and `HalfFloatPoint` [lucene]

2025-06-18 Thread via GitHub


jpountz merged PR #14784:
URL: https://github.com/apache/lucene/pull/14784


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Add prefetching for terms dict in doc values [lucene]

2025-06-18 Thread via GitHub


jpountz commented on PR #14773:
URL: https://github.com/apache/lucene/pull/14773#issuecomment-2983993101

   Doing this is a bit less obvious in my mind since terms dictionaris are 
allowed to have a random access pattern, when doc-value iterators are required 
to be consumed in order?
   
   Somewhat separately, I've been wondering if we need to APIs, one for saying 
"I'm going to use this field" and pass the full range of data, and another one 
to say "I'm going to use this specific range of bytes", which would be what 
`IndexInput#prefetch` does today.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Add prefetching for terms dict in doc values [lucene]

2025-06-18 Thread via GitHub


easyice closed pull request #14773: Add prefetching for terms dict in doc values
URL: https://github.com/apache/lucene/pull/14773


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Add prefetching for terms dict in doc values [lucene]

2025-06-18 Thread via GitHub


easyice commented on PR #14773:
URL: https://github.com/apache/lucene/pull/14773#issuecomment-2984815277

   Thanks for the explanation, You're right, I'll close this PR. I feel an API 
to prefetch full range of data could be useful, e.g. for files like `.tip`, 
which are relatively small but have a significant impact on read performance?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Utilize `docIdRunEnd` on `ReqExclBulkScorer` [lucene]

2025-06-18 Thread via GitHub


github-actions[bot] commented on PR #14806:
URL: https://github.com/apache/lucene/pull/14806#issuecomment-2983425250

   This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. 
If the PR doesn't need a changelog entry, then add the skip-changelog label to 
it and you will stop receiving this reminder on future updates to the PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Utilize `docIdRunEnd` on `ReqExclBulkScorer` [lucene]

2025-06-18 Thread via GitHub


github-actions[bot] commented on PR #14806:
URL: https://github.com/apache/lucene/pull/14806#issuecomment-2983430347

   This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. 
If the PR doesn't need a changelog entry, then add the skip-changelog label to 
it and you will stop receiving this reminder on future updates to the PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Utilize `docIdRunEnd` on `ReqExclBulkScorer` [lucene]

2025-06-18 Thread via GitHub


HUSTERGS commented on code in PR #14806:
URL: https://github.com/apache/lucene/pull/14806#discussion_r2154727636


##
lucene/core/src/java/org/apache/lucene/search/ReqExclBulkScorer.java:
##
@@ -57,7 +57,10 @@ public int score(LeafCollector collector, Bits acceptDocs, 
int min, int max) thr
 exclDoc = exclApproximation.advance(upTo);
   }
   if (exclDoc == upTo) {
-if (exclTwoPhase == null || exclTwoPhase.matches()) {
+if (exclTwoPhase == null) {
+  // from upTo to docIdRunEnd() are excluded, so we scored up to 
docIdRunEnd()
+  upTo = exclApproximation.docIDRunEnd();
+} else if (exclTwoPhase.matches()) {

Review Comment:
   I think it's correct here, and I run lucene test locally, everything looks 
fine. What I'm concerned here is that only `DocValuesRangeIterator` implement 
it's own `docIdRunEnd` method for now, I'm not sure whether adding this will 
hurt the performance or not



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


dweiss commented on PR #14804:
URL: https://github.com/apache/lucene/pull/14804#issuecomment-2983757840

   Error prone has a rule for this, actually -
   ```
   diff --git 
a/build-tools/build-infra/src/main/groovy/lucene.validation.error-prone.gradle 
b/build-tools/build-infra/src/main/groovy/lucene.validation.error-prone.gradle
   index 49a4a067af6..1070346a19f 100644
   --- 
a/build-tools/build-infra/src/main/groovy/lucene.validation.error-prone.gradle
   +++ 
b/build-tools/build-infra/src/main/groovy/lucene.validation.error-prone.gradle
   @@ -331,6 +331,7 @@ allprojects { prj ->
  '-Xep:UnsafeWildcard:ERROR',
  '-Xep:UnusedAnonymousClass:ERROR',
  '-Xep:UnusedCollectionModifiedInPlace:ERROR',
   +  "-Xep:WildcardImport:ERROR",
  '-Xep:VarTypeName:ERROR',
  // '-Xep:WrongOneof:OFF', // we don't use protobuf
  '-Xep:XorPower:ERROR',
   ```
   I know we only run it on a CI but ast-grep requires an external tool, so 
maybe it's a simpler/ better alternative? This is what the output looks like:
   
   ```
   > Task :lucene:core:compileJava
   
/home/dweiss/mounts/ocean/work/apache/lucene/lucene/core/src/java/org/apache/lucene/internal/hppc/IntIntHashMap.java:20:
 error: [WildcardImport] Wildcard imports, static or otherwise, should not be 
used
   import static org.apache.lucene.internal.hppc.HashContainers.*;
   ^
   (see 
https://google.github.io/styleguide/javaguide.html?cl=head#s3.3.1-wildcard-imports)
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Introduce getQuantizedVectorValues method in LeafReader to access QuantizedByteVectorValues [lucene]

2025-06-18 Thread via GitHub


benwtrent commented on PR #14792:
URL: https://github.com/apache/lucene/pull/14792#issuecomment-2983852051

   @Pulkitg64 I think the format has access to the quantized vectors. I am 
saying we shouldn't add a new `getQuantizedVEctors` to the leaf or kNN APIs. 
   
   When the `vec` file isn't present, the format should still return a 
`FloatVectorValues`, and when folks call `float[] getVector(int ord)`, they get 
the de-quantized representation of the vector.
   
   This should require no new public facing APIs anywhere.
   
   Here is the idea.
   
   ```
   public FloatVectorValues getFloatVectorValues(String field) throws 
IOException {
   return OffHeapQuantizedFloatVectorValues.load(...);
   }
   
   ...
   
   static class OffHeapQuantizedFloatVectorValues extends FloatVectorValues {
 
 final float[] scratch;
 int curOrd = -1;
   
 ...
   
 public  float[] vectorValue(int ord) throws IOException {
   if (ord == curOrd) return scratch;
   dequantize(ord);
   curOrd = ord;
   return scratch;
 }
   
 private void dequantize(int ord) throws IOException {
   //read quantized values and parameters
   // dequantize into scratch
 }
 
   }
   
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


rmuir commented on PR #14804:
URL: https://github.com/apache/lucene/pull/14804#issuecomment-2983891154

   yeah, that's fine.
   
   we can discuss it separately maybe. Elsewhere I use the tool a lot, replace 
entire slow linters with it! It is wicked fast...
   
   there's a strange cultural thing at play here, with all the other 
programming languages, "external tools" written in rust are heavily adopted, 
makes everything so fast on the dev side. It isn't just python here but e.g. 
microsoft rewriting typescript in golang, bun, etc.
   
   with java, and only java, there's this strange NIH cultural thing where all 
tooling must itself also be written in java. I don't understand it, but it 
makes everything really slow :)
   
   The recent "git" is a great example, insanely fast tool designed by 
literally, the best, yet there's pressure to use a crappy java substitute.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


dweiss commented on PR #14804:
URL: https://github.com/apache/lucene/pull/14804#issuecomment-2983916752

   I have noticed this as well - in the js world most (fast) things use a rust 
backend.
   
   Anyway, I'll fiddle with this a bit, it's not a high priority thing. Maybe 
make it optionally available, if the tool is present. I'll see.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


dweiss commented on PR #14804:
URL: https://github.com/apache/lucene/pull/14804#issuecomment-2982832608

   I'll take a look, sure. I'll also try to figure out why the heck spotless 
isn't behaving the way it should.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Introduce getQuantizedVectorValues method in LeafReader to access QuantizedByteVectorValues [lucene]

2025-06-18 Thread via GitHub


Pulkitg64 commented on PR #14792:
URL: https://github.com/apache/lucene/pull/14792#issuecomment-2984235394

   Oh I understand now what you meant in your comments. This approach is much 
more cleaner and doesn't require any new API addition. Will raise a new 
revision in some time.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] .editorconfig [lucene]

2025-06-18 Thread via GitHub


dsmiley commented on PR #14740:
URL: https://github.com/apache/lucene/pull/14740#issuecomment-2984277753

   The existing editorconfig was redundant with information in this 
editorconfig, so I simply removed it.  I'll merge this tonight.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[PR] Remove unused delGen() method from FieldTermIterator [lucene]

2025-06-18 Thread via GitHub


romseygeek opened a new pull request, #14807:
URL: https://github.com/apache/lucene/pull/14807

   This was used in the past to correctly order terms in 
MergedPrefixCodedTermsIterator, 
   but that class was removed long ago so the method is no longer needed.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Remove unused delGen() method from FieldTermIterator [lucene]

2025-06-18 Thread via GitHub


github-actions[bot] commented on PR #14807:
URL: https://github.com/apache/lucene/pull/14807#issuecomment-2984306433

   This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. 
If the PR doesn't need a changelog entry, then add the skip-changelog label to 
it and you will stop receiving this reminder on future updates to the PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


dweiss commented on PR #14804:
URL: https://github.com/apache/lucene/pull/14804#issuecomment-2984368443

   I've added a task that runs ast-grep on the CI and locally, if you define 
```lucene.tool.ast-grep``` property (for example, 
```lucene.tool.ast-grep=ast-grep``` or permanently, in local build options or 
gradle properties).
   
   
![image](https://github.com/user-attachments/assets/5b212531-fa3c-4256-a45a-178542de00b4)
   
   Opens up the possibility to add new rules too (gradle/validation/ast-grep).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


dweiss commented on PR #14804:
URL: https://github.com/apache/lucene/pull/14804#issuecomment-2984370238

   I'll try to finalize this patch later today, have to leave now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Utilize `docIdRunEnd` on `ReqExclBulkScorer` [lucene]

2025-06-18 Thread via GitHub


HUSTERGS commented on code in PR #14806:
URL: https://github.com/apache/lucene/pull/14806#discussion_r2154711602


##
lucene/core/src/java/org/apache/lucene/search/ReqExclBulkScorer.java:
##
@@ -57,7 +57,10 @@ public int score(LeafCollector collector, Bits acceptDocs, 
int min, int max) thr
 exclDoc = exclApproximation.advance(upTo);
   }
   if (exclDoc == upTo) {
-if (exclTwoPhase == null || exclTwoPhase.matches()) {
+if (exclTwoPhase == null) {
+  // from upTo to docIdRunEnd() are excluded, so we scored up to 
docIdRunEnd()
+  upTo = exclApproximation.docIDRunEnd();

Review Comment:
   If I understand correctly, `exclApproximation.docID()` should equals 
`exclDoc`, which is equal to `upTo` (under the `if` clause ), and `upTo` is 
less than `max`, so `exclApproximation.docID()` should never be 
`DocIdSetIterator#NO_MORE_DOCS`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Introduce getQuantizedVectorValues method in LeafReader to access QuantizedByteVectorValues [lucene]

2025-06-18 Thread via GitHub


Pulkitg64 commented on PR #14792:
URL: https://github.com/apache/lucene/pull/14792#issuecomment-2983706927

   Thanks @benwtrent @msokolov 
   
   > Again, I think we should do the nice thing, de-quantize the vectors as the 
user asks for them.
   
   Sorry, I am surely missing something. But if we can't access the quantized 
vector then how can we do this? We still need byte vector to rehydrate float 
vectors, right?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


rmuir commented on PR #14804:
URL: https://github.com/apache/lucene/pull/14804#issuecomment-2984212372

   In my builds, I install such tools via `pip` or `npm` package managers. So 
it's no different than downloading jar and using that, just faster. Just food 
for thought.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


rmuir commented on PR #14804:
URL: https://github.com/apache/lucene/pull/14804#issuecomment-2984219215

   The python linting, formatting, type checking in dev tools is an example. 
None of those are written in the python programming language.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [I] Compression cache of numeric docvalues [lucene]

2025-06-18 Thread via GitHub


rmuir commented on issue #14803:
URL: https://github.com/apache/lucene/issues/14803#issuecomment-2985347971

   The advantage of letting a filesystem such as zfs (which was designed to do 
exactly this), is that it is integrated in the correct place and operating 
system caches work as expected.
   
   It is best to let the OS handle the caching, it will do a better job. Lucene 
caching the docvalues to me is just a step backwards to "fieldcache" which did 
not work well and caused a lot of operational pain.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [I] Add an optional bandwidth cap to `TieredMergePolicy`? [lucene]

2025-06-18 Thread via GitHub


nipunbatra8 commented on issue #14148:
URL: https://github.com/apache/lucene/issues/14148#issuecomment-2985361444

   Hello, I will be diving into this issue for my summer internship. Looking 
forward to discussing more in the near future and contributing to this project!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


rmuir commented on PR #14804:
URL: https://github.com/apache/lucene/pull/14804#issuecomment-2985443802

   Another related idea here is to prevent any IDE from automatically adding 
the *'s we don't want.
   
   Eclipse's default limit is 99 imports before it starts doing this... of 
course user can change, and some rare files might exceed this. there's a way to 
bump it to huge value (e.g. Integer.MAX_VALUE) in our eclipse settings, i 
think, but i don't have a concrete solution: 
https://github.com/eclipse-jdtls/eclipse.jdt.ls/blob/0b3651f030e2e68e43019268f35880a07aa7a8c5/org.eclipse.jdt.ls.core/src/org/eclipse/jdt/ls/core/internal/preferences/Preferences.java#L345-L367
   
   I don't know how intellij behaves here or if we can prevent similar 
behavior. All this stuff can be followups, really, I'm just brainstorming.
   
   Thanks for the ast-grep though!!! Gives me a way to help speed up the checks 
:) I will leave a few comments, not review ones intended to block the change. I 
can help out with followup PRs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


rmuir commented on code in PR #14804:
URL: https://github.com/apache/lucene/pull/14804#discussion_r2155360626


##
gradle/validation/ast-grep/sgconfig.yml:
##
@@ -0,0 +1,2 @@
+ruleDirs:
+  - ./rules

Review Comment:
   I've found the built-in self-test to be essential for having and keeping 
correct rules. It takes milliseconds and they are easy to write.
   
   ```suggestion
 - ./rules
   testConfigs:
   - testDir: ./tests
   ```
   
   I run these with `ast-grep test --skip-snapshot-tests` before the `ast-grep 
scan`. 
   
   For a rule like this one, I would add tests (untested!) like this: 
`tests/errorprone.yml` (which would test rules in `rules/errorprone.yml`. I 
organize them this way with `---` yaml doc separator between the rules in both 
files. For multiline tests, I use yaml literal style (`|`): 
https://yaml.org/spec/1.2-old/spec.html#id2795688
   
   ```yaml
   ---
   # test wildcard import detector
   id: wildcard-import-not-allowed
   valid:
 - import foo.bar.Baz;
 - import static foo.bar.Baz;
   invalid:
 - import foo.bar.*;
 - import static foo.bar.*;
   ---
   # test for another error prone rule
   id: some-other-errorprone-rule
   invalid: 
 - |
   if (x.equals(null) {
   }
 - |
   foo = x.equals(null) ? 1 : 2;
   # ...
   ```
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[PR] Add support for running ast-grep rules [lucene]

2025-06-18 Thread via GitHub


dweiss opened a new pull request, #14808:
URL: https://github.com/apache/lucene/pull/14808

   This is a spinoff from #14804. We should allow running ```ast-grep``` 
(https://ast-grep.github.io/) rules as validation checks. This is a powerful 
tool (and fast). 
   
   For now, running ast-grep is optional. You need to install ast-grep manually 
and then set this build option to the location (or name) of of the ast-grep 
tool:
   ```
   lucene.tool.ast-grep=ast-grep
   ```
   I recommend doing this in ```build-options.local.properties```. If this 
build option is _not set_, the entire task (ruleset) is ignored and a warning 
is printed.
   
   I've set up CI jobs on github to install and run ast-grep as part of pull 
requests and on-commit checks for now.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Add support for running ast-grep rules [lucene]

2025-06-18 Thread via GitHub


github-actions[bot] commented on PR #14808:
URL: https://github.com/apache/lucene/pull/14808#issuecomment-2985485459

   This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. 
If the PR doesn't need a changelog entry, then add the skip-changelog label to 
it and you will stop receiving this reminder on future updates to the PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


dweiss commented on PR #14804:
URL: https://github.com/apache/lucene/pull/14804#issuecomment-2985485780

   I've forked the ast-grep thing into a separate PR - 
https://github.com/apache/lucene/pull/14808. I think we can merge that one in, 
then I can apply a smaller patch for wildcards, using your rule.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


dweiss commented on code in PR #14804:
URL: https://github.com/apache/lucene/pull/14804#discussion_r2155370086


##
gradle/validation/ast-grep/sgconfig.yml:
##
@@ -0,0 +1,2 @@
+ruleDirs:
+  - ./rules

Review Comment:
   > I run these with ast-grep test --skip-snapshot-tests before the ast-grep 
scan.
   
   We can integrate this check into gradle as well. These are nice indeed - 
I've seen the docs on the tool. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] build: replace invalidJavaOnlyPatterns source-patterns with ast-grep [lucene]

2025-06-18 Thread via GitHub


rmuir commented on PR #14812:
URL: https://github.com/apache/lucene/pull/14812#issuecomment-2986074641

   This is just starting small. I really want to next clean up some of the rat 
interaction in this file, especially given the changing APIs and 
thread-unsafeness that makes the gradle code complex.
   
   A lot of the logic around license/headers can be done here instead, 
especially since it isn't a "legal" check (rat-sources does that). I already 
use queries to match these headers in my editor and hide, e.g.:
   ```scheme
   ;; fold and initially close license headers
   (program
 (block_comment) @fold @foldclose
 (#lua-match? @foldclose "^/[*%s]*Licensed to the Apache.*"))
   ```
   We can do similar stuff with yaml rules and ensure licenses aren't javadoc, 
come before package declaration, whatever all is going on here, and simplify 
the build.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


dweiss commented on code in PR #14804:
URL: https://github.com/apache/lucene/pull/14804#discussion_r2155429690


##
gradle/validation/ast-grep/sgconfig.yml:
##
@@ -0,0 +1,2 @@
+ruleDirs:
+  - ./rules

Review Comment:
   I've added it, nice.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Add support for running ast-grep rules [lucene]

2025-06-18 Thread via GitHub


rmuir commented on PR #14808:
URL: https://github.com/apache/lucene/pull/14808#issuecomment-2985574748

   Thank you @dweiss, I was just slow, was testing it out. I set the property 
locally, and added rule and test:
   ```yaml
   ---
   # yaml-language-server: 
$schema=https://raw.githubusercontent.com/ast-grep/ast-grep/refs/heads/main/schemas/rule.json
   id: java-syntax
   language: java
   rule:
 kind: ERROR
   message: Syntax Error
   severity: error
   ```
   
   ```yaml
   ---
   # test the parser is working
   id: java-syntax
   valid:
 - x = 1;
   ```
   
   I introduced bogus syntax to a file and the build then failed quickly on the 
problem (it is just an example and we probably should not do this):
   ```
   
   > Task :testAstGrepRules
   Running 1 tests
   
   --- Case Details ---
   PASS java-syntax  .
   
   test result: ok. 1 passed; 0 failed;
   
   > Task :applyAstGrepRules
   error[java-syntax]: Syntax Error
   ┌─ 
lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java:145:50
   │
   145 │   public List getLeafContexts([) {
   │  ^
   
   Error: 1 error(s) found in code.
   Help: Scan succeeded and found error level diagnostics in the codebase.
   
   
   > Task :applyAstGrepRules FAILED
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


rmuir commented on code in PR #14804:
URL: https://github.com/apache/lucene/pull/14804#discussion_r2155451618


##
gradle/validation/ast-grep/rules/no-wildcard-imports.sg.yaml:
##
@@ -0,0 +1,16 @@
+# yaml-language-server: 
$schema=https://raw.githubusercontent.com/ast-grep/ast-grep/refs/heads/main/schemas/java_rule.json
+id: wildcard-import-not-allowed
+language: java
+rule:
+  kind: asterisk
+  inside:
+kind: import_declaration
+severity: error
+message: don't use wildcard imports
+note: please use full imports instead
+url: 
https://google.github.io/styleguide/javaguide.html?cl=head#s3.3.1-wildcard-imports

Review Comment:
   I can see pointing at the style guide, but as far as being practical, i 
wonder if we can name the file errorprone.yml and name the checks accordingly 
(e.g. `WildcardImport` and link to their 
https://errorprone.info/bugpattern/WildcardImport url. 
   
   It is just how i've been organizing them, after seeing others use similar 
approach. I also think the errorprone URL is more helpful to end-user in fixing 
such issues (specs are more difficult to read), but that's just me.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


dweiss commented on PR #14804:
URL: https://github.com/apache/lucene/pull/14804#issuecomment-2985617584

   Ok. I think this one is ready.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] error-prone: replace ComparingThisWithNull with ast-grep rule [lucene]

2025-06-18 Thread via GitHub


github-actions[bot] commented on PR #14810:
URL: https://github.com/apache/lucene/pull/14810#issuecomment-2985745189

   This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. 
If the PR doesn't need a changelog entry, then add the skip-changelog label to 
it and you will stop receiving this reminder on future updates to the PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[PR] error-prone: replace ComparingThisWithNull with ast-grep rule [lucene]

2025-06-18 Thread via GitHub


rmuir opened a new pull request, #14810:
URL: https://github.com/apache/lucene/pull/14810

   This is just an example of a dead-simple check to cross off the list.
   
   organize the error-prone rules into an errorprone.yml file with matching 
linter IDs.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Correct rat source patterns to include build-infra sources [lucene]

2025-06-18 Thread via GitHub


dweiss merged PR #14809:
URL: https://github.com/apache/lucene/pull/14809


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


rmuir commented on PR #14804:
URL: https://github.com/apache/lucene/pull/14804#issuecomment-2985386981

   Thanks for playing with it. I would be happy to try to write ast-grep rules 
to try to replace some of the slower error-prone checks, or whatever else 
(nocommit checks etc).
   
   We could install the package via pip or npm in the GH actions to support the 
checker. At just a glance, many of the rules are basically trivial e.g. 
https://errorprone.info/bugpattern/EqualsNull. 
   
   From what I remember, each check has a high overhead for errorprone which I 
don't believe is the case for ast-grep. I know I'm oversimplifying it, but I 
assume each file is parsed only once and then all rules run similar to 
https://tree-sitter.github.io/tree-sitter/using-parsers/queries/1-syntax.html, 
but without the s-expressions :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[PR] error-prone: implement BanClassLoader with forbidden-apis instead [lucene]

2025-06-18 Thread via GitHub


rmuir opened a new pull request, #14811:
URL: https://github.com/apache/lucene/pull/14811

   This rule is attempting to ban dangerous usages of ClassLoader which could 
be security hazard (IMO a good idea). forbidden APIs is a good tool for 
banning, just like we use forbidden-apis to ban JNDI and other things.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] error-prone: implement BanClassLoader with forbidden-apis instead [lucene]

2025-06-18 Thread via GitHub


github-actions[bot] commented on PR #14811:
URL: https://github.com/apache/lucene/pull/14811#issuecomment-2985810339

   This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. 
If the PR doesn't need a changelog entry, then add the skip-changelog label to 
it and you will stop receiving this reminder on future updates to the PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] error-prone: implement BanClassLoader with forbidden-apis instead [lucene]

2025-06-18 Thread via GitHub


rmuir commented on code in PR #14811:
URL: https://github.com/apache/lucene/pull/14811#discussion_r2155576445


##
gradle/validation/forbidden-apis/defaults.all.txt:
##
@@ -76,3 +76,15 @@ java.lang.Math#fma(float,float,float)
 java.lang.Math#fma(double,double,double)
 
 java.lang.Thread#sleep(**) @ Thread.sleep makes inefficient use of resources, 
introduces weird race conditions and slows down the code/tests. Not a scalable 
and good practice so we should prevent it creeping into lucene code
+
+@defaultMessage Using dangerous ClassLoader APIs may deserialize untrusted 
user input into bytecode, leading to remote code execution vulnerabilities
+java.lang.ClassLoader#defineClass(**)
+jdk.internal.misc.Unsafe#defineClass(**)
+jdk.internal.access.JavaLangAccess#defineClass(**)
+# forbidden complains it can't find this enclosed class
+# java.lang.invoke.MethodHandles.Lookup#defineClass(**)

Review Comment:
   @uschindler if you have any ideas, i'd love to fix this. This is just using 
the same banned list that error-prone uses, except as forbidden signatures.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] error-prone: implement BanClassLoader with forbidden-apis instead [lucene]

2025-06-18 Thread via GitHub


github-actions[bot] commented on PR #14811:
URL: https://github.com/apache/lucene/pull/14811#issuecomment-2985834553

   This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. 
If the PR doesn't need a changelog entry, then add the skip-changelog label to 
it and you will stop receiving this reminder on future updates to the PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] error-prone: implement BanClassLoader with forbidden-apis instead [lucene]

2025-06-18 Thread via GitHub


rmuir commented on PR #14811:
URL: https://github.com/apache/lucene/pull/14811#issuecomment-2985845882

   Source of the check: 
https://github.com/google/error-prone/blob/master/core/src/main/java/com/google/errorprone/bugpatterns/BanClassLoader.java
   
   It also checks for explicit `extends URLClassLoader`, which isn't handled 
here. I'm not sure what special powers subclassing gives, maybe more methods 
need to be banned.
   
   Also maybe some of these don't need to be banned because they are impossible 
due to java module system? I think google is still on java 8 :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Add support for running ast-grep rules [lucene]

2025-06-18 Thread via GitHub


dweiss merged PR #14808:
URL: https://github.com/apache/lucene/pull/14808


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Add support for running ast-grep rules [lucene]

2025-06-18 Thread via GitHub


dweiss commented on PR #14808:
URL: https://github.com/apache/lucene/pull/14808#issuecomment-2985638437

   Sorry for not waiting, I wanted to proceed. Please open a new issue with 
those rules. This one looks very neat indeed - we can even force it to run 
prior to javac. It's a C based parser for java... remember jikes? :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [I] Fail spotless check for wildcard imports [lucene]

2025-06-18 Thread via GitHub


dweiss closed issue #14553: Fail spotless check for wildcard imports
URL: https://github.com/apache/lucene/issues/14553


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


dweiss merged PR #14804:
URL: https://github.com/apache/lucene/pull/14804


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


dweiss commented on code in PR #14804:
URL: https://github.com/apache/lucene/pull/14804#discussion_r2155467101


##
gradle/validation/ast-grep/rules/no-wildcard-imports.sg.yaml:
##
@@ -0,0 +1,16 @@
+# yaml-language-server: 
$schema=https://raw.githubusercontent.com/ast-grep/ast-grep/refs/heads/main/schemas/java_rule.json
+id: wildcard-import-not-allowed
+language: java
+rule:
+  kind: asterisk
+  inside:
+kind: import_declaration
+severity: error
+message: don't use wildcard imports
+note: please use full imports instead
+url: 
https://google.github.io/styleguide/javaguide.html?cl=head#s3.3.1-wildcard-imports

Review Comment:
   Thanks for coming up with this great idea - it seems like it opens up a lot 
of new possibilities. Definitely better than the custom formatter step I 
initially added.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


rmuir commented on code in PR #14804:
URL: https://github.com/apache/lucene/pull/14804#discussion_r2155464257


##
gradle/validation/ast-grep/rules/no-wildcard-imports.sg.yaml:
##
@@ -0,0 +1,16 @@
+# yaml-language-server: 
$schema=https://raw.githubusercontent.com/ast-grep/ast-grep/refs/heads/main/schemas/java_rule.json
+id: wildcard-import-not-allowed
+language: java
+rule:
+  kind: asterisk
+  inside:
+kind: import_declaration
+severity: error
+message: don't use wildcard imports
+note: please use full imports instead
+url: 
https://google.github.io/styleguide/javaguide.html?cl=head#s3.3.1-wildcard-imports

Review Comment:
   let's just commit yours, i'll try to see if i can find some easy-wins in the 
errorprone rule list and organize while I'm there.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[PR] Correct rat source patterns to include build-infra sources [lucene]

2025-06-18 Thread via GitHub


dweiss opened a new pull request, #14809:
URL: https://github.com/apache/lucene/pull/14809

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


dweiss commented on PR #14804:
URL: https://github.com/apache/lucene/pull/14804#issuecomment-2985545202

   > We should be able to make it easier for IntelliJ to do the right thing 
without too much trouble. One option is to use `gradle-idea-ext`, 
   
   I've never used this plugin - have it on my "to-check" list (!). It would be 
awesome to provide more reasonable IntelliJ defaults. My experience with using 
google-java-format plugin is that once you enable it, the import section 
formatting settings are completely ignored... 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Add support for running ast-grep rules [lucene]

2025-06-18 Thread via GitHub


dweiss commented on PR #14808:
URL: https://github.com/apache/lucene/pull/14808#issuecomment-2985547456

   I'll allow myself to merge this in without a review. I don't think anybody 
will object (but if anybody does, shout out).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


rmuir commented on code in PR #14804:
URL: https://github.com/apache/lucene/pull/14804#discussion_r2155375027


##
gradle/validation/ast-grep/rules/no-wildcard-imports.sg.yaml:
##
@@ -0,0 +1,15 @@
+id: wildcard-import-not-allowed

Review Comment:
   ```suggestion
   # yaml-language-server: 
$schema=https://raw.githubusercontent.com/ast-grep/ast-grep/refs/heads/main/schemas/java_rule.json
   id: wildcard-import-not-allowed
   ```
   
   I add the comment above each rule, users with 
https://github.com/redhat-developer/yaml-language-server get autocomplete / 
validation / hover documentation in editors.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


msfroh commented on PR #14804:
URL: https://github.com/apache/lucene/pull/14804#issuecomment-2985537756

   > I don't know how intellij behaves here or if we can prevent similar 
behavior. All this stuff can be followups, really, I'm just brainstorming.
   
   We should be able to make it easier for IntelliJ to do the right thing 
without too much trouble. One option is to use `gradle-idea-ext`, which is what 
we do on OpenSearch 
([ref](https://github.com/opensearch-project/OpenSearch/blob/6ff44d9b5e1f1f6f22b9935d21e408957383e940/gradle/ide.gradle#L67)).
 Another option is to use a `.editorconfig` file, where there are some 
IntelliJ-specific extensions (like 
`ij_java_names_count_to_use_import_on_demand`) that can crank up the threshold 
for wildcard imports.
   
   I can take a shot at a separate PR to help out there.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


dweiss commented on PR #14804:
URL: https://github.com/apache/lucene/pull/14804#issuecomment-2985418128

   Well, if this patch goes in, all it takes to add a rule is to create a new 
file in gradle/validation/ast-grep - it should be picked up automatically. We 
can also make ast-grep a requirement (much like python, perl and git) in the 
long term if it proves to be beneficial.
   
   I've taken a look at the tree sitter too, very interesting stuff. Gotta love 
how with all the cool languages out there, it's still C that prevails. :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[PR] build: replace invalidJavaOnlyPatterns source-patterns with ast-grep [lucene]

2025-06-18 Thread via GitHub


rmuir opened a new pull request, #14812:
URL: https://github.com/apache/lucene/pull/14812

   The validation.source-patterns.gradle is intimidating, there's a lot going 
on here. Many of the rules are doing regex matches, which is tricky to maintain 
and can't give good error messages, autofixes in IDE, relevant 
explanations/URLs, etc.
   
   Start with the 'invalidJavaOnlyPatterns' because it is one of many special 
cases, and better to just implement as ast-grep java rules.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] build: replace invalidJavaOnlyPatterns source-patterns with ast-grep [lucene]

2025-06-18 Thread via GitHub


github-actions[bot] commented on PR #14812:
URL: https://github.com/apache/lucene/pull/14812#issuecomment-2986028296

   This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. 
If the PR doesn't need a changelog entry, then add the skip-changelog label to 
it and you will stop receiving this reminder on future updates to the PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] build: replace invalidJavaOnlyPatterns source-patterns with ast-grep [lucene]

2025-06-18 Thread via GitHub


rmuir commented on code in PR #14812:
URL: https://github.com/apache/lucene/pull/14812#discussion_r2155783331


##
gradle/validation/ast-grep/rules/java-patterns.yml:
##
@@ -0,0 +1,24 @@
+# Banned Lucene source patterns
+# Historically implemented as regexes which are more difficult
+---
+# yaml-language-server: 
$schema=https://raw.githubusercontent.com/ast-grep/ast-grep/refs/heads/main/schemas/java_rule.json
+id: java-lang-import
+language: java
+rule:
+  pattern: import java.lang.$REF
+  kind: import_declaration
+fix: ""

Review Comment:
   example of rule with a "fix". If it is safe and easy, it can make things 
nice, especially if rule is on the nitpicky side.
   * in editor it will show up as light-bulb, user can choose fix as a code 
action.
   * `scan -U` could be integrated if we wanted for some reason. though i'd 
recommend  "tidy" after any such thing.
   * error messages improve, the problem+fix displays as diff, which helps 
clarify the issue:
   ```
   > Task :applyAstGrepRules
   lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
   error[java-lang-import]: unnecessary import of `String` from java.lang
   @@ -20,7 +20,7 @@
   21 21│ 
   22 22│ import java.io.Closeable;
   23 23│ import java.io.IOException;
   24   │-import java.lang.String;
  24│+
   25 25│ import java.time.Instant;
   26 26│ import java.util.ArrayDeque;
   27 27│ import java.util.ArrayList;
   Note:
   classes in java.lang are implicitly imported
   
   Error: 1 error(s) found in code.
   Help: Scan succeeded and found error level diagnostics in the codebase.
   
   > Task :applyAstGrepRules FAILED
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


dweiss commented on code in PR #14804:
URL: https://github.com/apache/lucene/pull/14804#discussion_r2155455149


##
gradle/validation/ast-grep/rules/no-wildcard-imports.sg.yaml:
##
@@ -0,0 +1,16 @@
+# yaml-language-server: 
$schema=https://raw.githubusercontent.com/ast-grep/ast-grep/refs/heads/main/schemas/java_rule.json
+id: wildcard-import-not-allowed
+language: java
+rule:
+  kind: asterisk
+  inside:
+kind: import_declaration
+severity: error
+message: don't use wildcard imports
+note: please use full imports instead
+url: 
https://google.github.io/styleguide/javaguide.html?cl=head#s3.3.1-wildcard-imports

Review Comment:
   Up to you, entirely. I just pasted... something. :)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


dweiss commented on code in PR #14804:
URL: https://github.com/apache/lucene/pull/14804#discussion_r2155456818


##
gradle/validation/ast-grep/rules/no-wildcard-imports.sg.yaml:
##
@@ -0,0 +1,16 @@
+# yaml-language-server: 
$schema=https://raw.githubusercontent.com/ast-grep/ast-grep/refs/heads/main/schemas/java_rule.json
+id: wildcard-import-not-allowed
+language: java
+rule:
+  kind: asterisk
+  inside:
+kind: import_declaration
+severity: error
+message: don't use wildcard imports
+note: please use full imports instead
+url: 
https://google.github.io/styleguide/javaguide.html?cl=head#s3.3.1-wildcard-imports

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


dweiss commented on code in PR #14804:
URL: https://github.com/apache/lucene/pull/14804#discussion_r2155458113


##
gradle/validation/ast-grep/rules/no-wildcard-imports.sg.yaml:
##
@@ -0,0 +1,16 @@
+# yaml-language-server: 
$schema=https://raw.githubusercontent.com/ast-grep/ast-grep/refs/heads/main/schemas/java_rule.json
+id: wildcard-import-not-allowed
+language: java
+rule:
+  kind: asterisk
+  inside:
+kind: import_declaration
+severity: error
+message: don't use wildcard imports
+note: please use full imports instead
+url: 
https://google.github.io/styleguide/javaguide.html?cl=head#s3.3.1-wildcard-imports

Review Comment:
   Oh, if you'd like to rename the rules - feel free to just commit whatever it 
is your gut tells you. I think I'm done here - once you think it's ready, just 
squash to main, please. Thanks!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] detect and ban wildcard imports in Java [lucene]

2025-06-18 Thread via GitHub


rmuir commented on code in PR #14804:
URL: https://github.com/apache/lucene/pull/14804#discussion_r2155458055


##
gradle/validation/ast-grep/rules/no-wildcard-imports.sg.yaml:
##
@@ -0,0 +1,16 @@
+# yaml-language-server: 
$schema=https://raw.githubusercontent.com/ast-grep/ast-grep/refs/heads/main/schemas/java_rule.json
+id: wildcard-import-not-allowed
+language: java
+rule:
+  kind: asterisk
+  inside:
+kind: import_declaration
+severity: error
+message: don't use wildcard imports
+note: please use full imports instead
+url: 
https://google.github.io/styleguide/javaguide.html?cl=head#s3.3.1-wildcard-imports

Review Comment:
   we can always change it. that particular description is really cool too, 
because of how meta this PR is. By banning these wildcards, we can do more with 
these simple rules :)
   
   Just as an example, you could write unused import rule which would work 
correctly (if you wanted to), whereas with wildcards its not reasonable.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



Re: [PR] Utilize `docIdRunEnd` on `ReqExclBulkScorer` [lucene]

2025-06-18 Thread via GitHub


gf2121 commented on code in PR #14806:
URL: https://github.com/apache/lucene/pull/14806#discussion_r215686


##
lucene/core/src/java/org/apache/lucene/search/ReqExclBulkScorer.java:
##
@@ -57,7 +57,10 @@ public int score(LeafCollector collector, Bits acceptDocs, 
int min, int max) thr
 exclDoc = exclApproximation.advance(upTo);
   }
   if (exclDoc == upTo) {
-if (exclTwoPhase == null || exclTwoPhase.matches()) {
+if (exclTwoPhase == null) {
+  // from upTo to docIdRunEnd() are excluded, so we scored up to 
docIdRunEnd()
+  upTo = exclApproximation.docIDRunEnd();

Review Comment:
   Oh yes, You are right. Thanks for explanation!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org