dweiss opened a new issue, #14611: URL: https://github.com/apache/lucene/issues/14611
### Description

I believe this is caused by G1 GC not being able to keep up with the garbage generated in this test. Here is an example analysis from a heap dump from a test run on Policeman Jenkins:

[image: heap dump analysis]

On a faster machine, I can sort of reproduce it by patching the test to increase the iteration count:

```
diff --git a/lucene/core/src/test/org/apache/lucene/search/TestTermInSetQuery.java b/lucene/core/src/test/org/apache/lucene/search/TestTermInSetQuery.java
index b6503021617..b0711a39a87 100644
--- a/lucene/core/src/test/org/apache/lucene/search/TestTermInSetQuery.java
+++ b/lucene/core/src/test/org/apache/lucene/search/TestTermInSetQuery.java
@@ -145,7 +145,10 @@ public class TestTermInSetQuery extends LuceneTestCase {
         continue;
       }
 
-      for (int i = 0; i < 100; ++i) {
+      for (int i = 0; i < 10000; ++i) {
+        if (((i + 1) % 100) == 0) {
+          // System.gc();
+        }
         final float boost = random().nextFloat() * 10;
         final int numQueryTerms =
             TestUtil.nextInt(random(), 1, 1 << TestUtil.nextInt(random(), 1, 8));
```

and then:

```
./gradlew :lucene:core:test --tests "org.apache.lucene.search.TestTermInSetQuery.testDuel" -Ptests.jvms=5 "-Ptests.jvmargs=-XX:-UseCompressedOops -XX:+UseG1GC -verbose:gc -XX:ParallelGCThreads=1 -XX:ConcGCThreads=1" -Ptests.seed=B03A5F38917C1431 -Ptests.useSecurityManager=false -Ptests.gui=true -Ptests.file.encoding=UTF-8 -Ptests.vectorsize=128 -Ptests.forceintegervectors=true
```

which shows GC activity that slowly saturates (the heap left after each young collection grows from under 20M to about 192M, while the heap size reaches 512M):

```
[1.990s][info][gc] GC(18) Pause Young (Normal) (G1 Evacuation Pause) 202M->18M(320M) 4.792ms
[2.073s][info][gc] GC(19) Pause Young (Normal) (G1 Evacuation Pause) 202M->18M(320M) 5.012ms
[2.120s][info][gc] GC(20) Pause Young (Normal) (G1 Evacuation Pause) 202M->15M(320M) 3.837ms
[2.174s][info][gc] GC(21) Pause Young (Normal) (G1 Evacuation Pause) 202M->14M(320M) 4.191ms
[2.220s][info][gc] GC(22) Pause Young (Normal) (G1 Evacuation Pause) 202M->12M(320M) 3.480ms
[2.262s][info][gc] GC(23) Pause Young (Normal) (G1 Evacuation Pause) 202M->16M(320M) 5.352ms
[2.308s][info][gc] GC(24) Pause Young (Normal) (G1 Evacuation Pause) 202M->18M(336M) 5.671ms
[2.378s][info][gc] GC(25) Pause Young (Normal) (G1 Evacuation Pause) 211M->17M(336M) 6.082ms
[2.423s][info][gc] GC(26) Pause Young (Normal) (G1 Evacuation Pause) 211M->14M(336M) 3.249ms
[2.463s][info][gc] GC(27) Pause Young (Normal) (G1 Evacuation Pause) 211M->17M(336M) 4.955ms
[2.521s][info][gc] GC(28) Pause Young (Normal) (G1 Evacuation Pause) 211M->15M(336M) 5.403ms
[2.574s][info][gc] GC(29) Pause Young (Normal) (G1 Evacuation Pause) 211M->19M(350M) 6.991ms
[2.637s][info][gc] GC(30) Pause Young (Normal) (G1 Evacuation Pause) 220M->15M(350M) 5.926ms
[2.685s][info][gc] GC(31) Pause Young (Normal) (G1 Evacuation Pause) 220M->18M(350M) 6.019ms
[2.731s][info][gc] GC(32) Pause Young (Normal) (G1 Evacuation Pause) 220M->18M(350M) 6.857ms
[2.772s][info][gc] GC(33) Pause Young (Normal) (G1 Evacuation Pause) 220M->16M(370M) 5.656ms
...
[156.833s][info][gc] GC(1113) Pause Young (Normal) (G1 Evacuation Pause) 481M->192M(512M) 4.962ms
[157.443s][info][gc] GC(1114) Pause Young (Normal) (G1 Evacuation Pause) 480M->192M(512M) 2.759ms
[158.061s][info][gc] GC(1115) Pause Young (Normal) (G1 Evacuation Pause) 482M->192M(512M) 2.356ms
[158.690s][info][gc] GC(1116) Pause Young (Normal) (G1 Evacuation Pause) 482M->193M(512M) 5.056ms
[159.300s][info][gc] GC(1117) Pause Young (Normal) (G1 Evacuation Pause) 481M->193M(512M) 5.686ms
[159.919s][info][gc] GC(1118) Pause Young (Normal) (G1 Evacuation Pause) 481M->191M(512M) 1.260ms
[160.551s][info][gc] GC(1119) Pause Young (Normal) (G1 Evacuation Pause) 481M->193M(512M) 5.426ms
[161.176s][info][gc] GC(1120) Pause Young (Normal) (G1 Evacuation Pause) 481M->192M(512M) 2.014ms
```

I think on a slower machine it'll eventually lead to an OOM. If an explicit GC is called during iteration, the memory never saturates.

Also, serial GC shows similar behavior (young-gen GCs only), but it eventually kicks off a full GC:
```
./gradlew :lucene:core:test --tests "org.apache.lucene.search.TestTermInSetQuery.testDuel" -Ptests.jvms=5 "-Ptests.jvmargs=-XX:-UseCompressedOops -XX:+UseSerialGC -verbose:gc" -Ptests.seed=B03A5F38917C1431 -Ptests.useSecurityManager=false -Ptests.gui=true -Ptests.file.encoding=UTF-8 -Ptests.vectorsize=128 -Ptests.forceintegervectors=true -Ptests.minheapsize=128m -Ptests.heapsize=256m
```

I don't know if there is much we can do here. Calling an explicit full gc once in a while in that loop seems like a dumb thing to do.
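
For anyone who wants to compare runs, the saturation in the G1 log above could be quantified by pulling the post-collection heap size out of each `-verbose:gc` line. The following is only a rough sketch: the class `GcLogFloor` and its helpers are hypothetical names (nothing like this exists in Lucene), and it assumes log lines shaped like the ones quoted in this issue, with sizes reported in `M`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: extracts heap sizes from GC log lines.
public class GcLogFloor {

  /** One parsed GC line; sizes are in megabytes, as printed in the log. */
  static final class GcEvent {
    final int id;
    final int beforeMb;
    final int afterMb;
    final int capacityMb;

    GcEvent(int id, int beforeMb, int afterMb, int capacityMb) {
      this.id = id;
      this.beforeMb = beforeMb;
      this.afterMb = afterMb;
      this.capacityMb = capacityMb;
    }
  }

  // Matches e.g. "GC(18) ... 202M->18M(320M)"; assumes all sizes use the M unit.
  private static final Pattern GC_LINE =
      Pattern.compile("GC\\((\\d+)\\).*?(\\d+)M->(\\d+)M\\((\\d+)M\\)");

  /** Parses every line that looks like a GC pause; other lines are skipped. */
  static List<GcEvent> parse(List<String> lines) {
    List<GcEvent> events = new ArrayList<>();
    for (String line : lines) {
      Matcher m = GC_LINE.matcher(line);
      if (m.find()) {
        events.add(
            new GcEvent(
                Integer.parseInt(m.group(1)),
                Integer.parseInt(m.group(2)),
                Integer.parseInt(m.group(3)),
                Integer.parseInt(m.group(4))));
      }
    }
    return events;
  }

  /** Lowest post-collection heap size (the "floor") among the given events. */
  static int minAfterMb(List<GcEvent> events) {
    if (events.isEmpty()) {
      throw new IllegalArgumentException("no GC events");
    }
    int min = Integer.MAX_VALUE;
    for (GcEvent e : events) {
      min = Math.min(min, e.afterMb);
    }
    return min;
  }

  public static void main(String[] args) {
    // Sample lines copied from the G1 log in this issue.
    List<String> early =
        List.of(
            "[2.120s][info][gc] GC(20) Pause Young (Normal) (G1 Evacuation Pause) 202M->15M(320M) 3.837ms",
            "[2.220s][info][gc] GC(22) Pause Young (Normal) (G1 Evacuation Pause) 202M->12M(320M) 3.480ms");
    List<String> late =
        List.of(
            "[159.919s][info][gc] GC(1118) Pause Young (Normal) (G1 Evacuation Pause) 481M->191M(512M) 1.260ms",
            "[161.176s][info][gc] GC(1120) Pause Young (Normal) (G1 Evacuation Pause) 481M->192M(512M) 2.014ms");
    System.out.println("early floor: " + minAfterMb(parse(early)) + "M");
    System.out.println("late floor: " + minAfterMb(parse(late)) + "M");
  }
}
```

A floor that keeps climbing across windows while only young collections run would be consistent with the behavior described above; it does not by itself show what is being retained.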