dweiss opened a new issue, #14611:
URL: https://github.com/apache/lucene/issues/14611

   ### Description
   
   I believe this is caused by G1 GC not being able to keep up with the garbage generated in this test. Here is an example analysis of a heap dump from a test run on Policeman Jenkins:
   
   
![Image](https://github.com/user-attachments/assets/f0875a42-a597-49b2-a33d-e2a81dd03070)
   
   On a faster machine, I can sort of reproduce it by patching the test to increase the iteration count:
   ```
   diff --git a/lucene/core/src/test/org/apache/lucene/search/TestTermInSetQuery.java b/lucene/core/src/test/org/apache/lucene/search/TestTermInSetQuery.java
   index b6503021617..b0711a39a87 100644
   --- a/lucene/core/src/test/org/apache/lucene/search/TestTermInSetQuery.java
   +++ b/lucene/core/src/test/org/apache/lucene/search/TestTermInSetQuery.java
   @@ -145,7 +145,10 @@ public class TestTermInSetQuery extends LuceneTestCase {
            continue;
          }
    
   -      for (int i = 0; i < 100; ++i) {
   +      for (int i = 0; i < 10000; ++i) {
   +        if (((i + 1) % 100) == 0) {
   +          // System.gc();
   +        }
            final float boost = random().nextFloat() * 10;
            final int numQueryTerms =
                TestUtil.nextInt(random(), 1, 1 << TestUtil.nextInt(random(), 1, 8));
   ```
   and then:
   ```
   ./gradlew :lucene:core:test --tests "org.apache.lucene.search.TestTermInSetQuery.testDuel" -Ptests.jvms=5 "-Ptests.jvmargs=-XX:-UseCompressedOops -XX:+UseG1GC -verbose:gc -XX:ParallelGCThreads=1 -XX:ConcGCThreads=1" -Ptests.seed=B03A5F38917C1431 -Ptests.useSecurityManager=false -Ptests.gui=true -Ptests.file.encoding=UTF-8 -Ptests.vectorsize=128 -Ptests.forceintegervectors=true
   ```
   which shows GC activity in which the heap slowly saturates:
   ```
   [1.990s][info][gc] GC(18) Pause Young (Normal) (G1 Evacuation Pause) 202M->18M(320M) 4.792ms
   [2.073s][info][gc] GC(19) Pause Young (Normal) (G1 Evacuation Pause) 202M->18M(320M) 5.012ms
   [2.120s][info][gc] GC(20) Pause Young (Normal) (G1 Evacuation Pause) 202M->15M(320M) 3.837ms
   [2.174s][info][gc] GC(21) Pause Young (Normal) (G1 Evacuation Pause) 202M->14M(320M) 4.191ms
   [2.220s][info][gc] GC(22) Pause Young (Normal) (G1 Evacuation Pause) 202M->12M(320M) 3.480ms
   [2.262s][info][gc] GC(23) Pause Young (Normal) (G1 Evacuation Pause) 202M->16M(320M) 5.352ms
   [2.308s][info][gc] GC(24) Pause Young (Normal) (G1 Evacuation Pause) 202M->18M(336M) 5.671ms
   [2.378s][info][gc] GC(25) Pause Young (Normal) (G1 Evacuation Pause) 211M->17M(336M) 6.082ms
   [2.423s][info][gc] GC(26) Pause Young (Normal) (G1 Evacuation Pause) 211M->14M(336M) 3.249ms
   [2.463s][info][gc] GC(27) Pause Young (Normal) (G1 Evacuation Pause) 211M->17M(336M) 4.955ms
   [2.521s][info][gc] GC(28) Pause Young (Normal) (G1 Evacuation Pause) 211M->15M(336M) 5.403ms
   [2.574s][info][gc] GC(29) Pause Young (Normal) (G1 Evacuation Pause) 211M->19M(350M) 6.991ms
   [2.637s][info][gc] GC(30) Pause Young (Normal) (G1 Evacuation Pause) 220M->15M(350M) 5.926ms
   [2.685s][info][gc] GC(31) Pause Young (Normal) (G1 Evacuation Pause) 220M->18M(350M) 6.019ms
   [2.731s][info][gc] GC(32) Pause Young (Normal) (G1 Evacuation Pause) 220M->18M(350M) 6.857ms
   [2.772s][info][gc] GC(33) Pause Young (Normal) (G1 Evacuation Pause) 220M->16M(370M) 5.656ms
   ...
   [156.833s][info][gc] GC(1113) Pause Young (Normal) (G1 Evacuation Pause) 481M->192M(512M) 4.962ms
   [157.443s][info][gc] GC(1114) Pause Young (Normal) (G1 Evacuation Pause) 480M->192M(512M) 2.759ms
   [158.061s][info][gc] GC(1115) Pause Young (Normal) (G1 Evacuation Pause) 482M->192M(512M) 2.356ms
   [158.690s][info][gc] GC(1116) Pause Young (Normal) (G1 Evacuation Pause) 482M->193M(512M) 5.056ms
   [159.300s][info][gc] GC(1117) Pause Young (Normal) (G1 Evacuation Pause) 481M->193M(512M) 5.686ms
   [159.919s][info][gc] GC(1118) Pause Young (Normal) (G1 Evacuation Pause) 481M->191M(512M) 1.260ms
   [160.551s][info][gc] GC(1119) Pause Young (Normal) (G1 Evacuation Pause) 481M->193M(512M) 5.426ms
   [161.176s][info][gc] GC(1120) Pause Young (Normal) (G1 Evacuation Pause) 481M->192M(512M) 2.014ms
   ```
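To make the trend in a log like the one above easier to see, here is a minimal sketch that pulls the post-pause heap occupancy out of each GC line. It assumes each log entry sits on a single line and that sizes are reported in megabytes, as in the excerpt; the class and method names (`GcLogTrend`, `afterMb`, `afterMbSeries`) are made up for illustration and are not part of Lucene or the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: extracts heap occupancy from -verbose:gc lines.
final class GcLogTrend {
  // Matches the "before->after(total)" heap summary, e.g. "202M->18M(320M)".
  // Assumption: all three sizes are printed in megabytes.
  private static final Pattern HEAP = Pattern.compile("(\\d+)M->(\\d+)M\\((\\d+)M\\)");

  /** Returns the heap occupancy after the pause in MB, or -1 if the line has no heap summary. */
  public static int afterMb(String line) {
    Matcher m = HEAP.matcher(line);
    return m.find() ? Integer.parseInt(m.group(2)) : -1;
  }

  /** Collects the post-pause occupancy for every line that carries a heap summary. */
  public static List<Integer> afterMbSeries(List<String> lines) {
    List<Integer> series = new ArrayList<>();
    for (String line : lines) {
      int after = afterMb(line);
      if (after >= 0) {
        series.add(after);
      }
    }
    return series;
  }

  public static void main(String[] args) {
    // First and last entries from the log excerpt.
    List<String> sample = List.of(
        "[1.990s][info][gc] GC(18) Pause Young (Normal) (G1 Evacuation Pause) 202M->18M(320M) 4.792ms",
        "[161.176s][info][gc] GC(1120) Pause Young (Normal) (G1 Evacuation Pause) 481M->192M(512M) 2.014ms");
    System.out.println("post-GC occupancy (MB): " + afterMbSeries(sample));
  }
}
```

On the two sample lines this reports 18 MB and 192 MB, which is the growth in retained heap between the start and the end of the run.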
   I think on a slower machine it will eventually lead to an OOM.
   
   If an explicit GC is triggered during the iteration, the memory never saturates. Serial GC shows similar behavior (young-generation GCs only), but it eventually kicks off a full GC:
   ```
   ./gradlew :lucene:core:test --tests "org.apache.lucene.search.TestTermInSetQuery.testDuel" -Ptests.jvms=5 "-Ptests.jvmargs=-XX:-UseCompressedOops -XX:+UseSerialGC -verbose:gc" -Ptests.seed=B03A5F38917C1431 -Ptests.useSecurityManager=false -Ptests.gui=true -Ptests.file.encoding=UTF-8 -Ptests.vectorsize=128 -Ptests.forceintegervectors=true -Ptests.minheapsize=128m -Ptests.heapsize=256m
   ```
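One way to check from inside the JVM whether only young collections are running is to read the per-collector counters exposed by the standard `java.lang.management` API. The sketch below is a standalone diagnostic, not something taken from the test; the class name `GcStats` is hypothetical, and the collector bean names it prints differ between collectors and JDK versions.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical diagnostic helper, not part of Lucene.
final class GcStats {
  /** Builds a one-shot summary of per-collector counts and current heap usage. */
  public static String summary() {
    StringBuilder sb = new StringBuilder();
    // One bean per collector; the young and old generations normally report separately.
    for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
      sb.append(gc.getName())
          .append(": count=").append(gc.getCollectionCount())
          .append(", timeMs=").append(gc.getCollectionTime())
          .append('\n');
    }
    MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    // Shift by 20 bits to convert bytes to megabytes.
    sb.append("heap used=").append(heap.getUsed() >> 20)
        .append("M, committed=").append(heap.getCommitted() >> 20).append("M");
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(summary());
  }
}
```

Printing this summary every few hundred iterations would show whether the old-generation counter ever moves while heap usage climbs, without forcing a collection the way `System.gc()` does.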
   
   I don't know if there is much we can do here. Calling an explicit full GC once in a while in that loop seems like a dumb thing to do.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

