[ https://issues.apache.org/jira/browse/LUCENE-9511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192398#comment-17192398 ]
Michael McCandless commented on LUCENE-9511:
--------------------------------------------

{quote}The reason are very frequent flushes due to small maxBufferedDocs which causes 300+ DWPTs to be blocked for flush causing ultimately an OOM exception.
{quote}
Wild :) Go randomized testing!

I guess this would mean applications using more threads to index would have seen RAM consumed but not tracked properly by {{IndexWriter}} until this fix.

> Include StoredFieldsWriter in DWPT accounting
> ---------------------------------------------
>
>                 Key: LUCENE-9511
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9511
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Simon Willnauer
>            Priority: Major
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> StoredFieldsWriter might consume some heap space memory that can have a
> significant impact on decisions made in the IW if writers should be stalled
> or DWPTs should be flushed if memory settings are small in IWC and flushes
> are frequent. We should add some accounting to the StoredFieldsWriter since
> it's part of the DWPT lifecycle and not just present during flush.
> Our nightly builds ran into some OOMs due to the large chunk size used in the
> CompressedStoredFieldsFormat. The reason are very frequent flushes due to
> small maxBufferedDocs which causes 300+ DWPTs to be blocked for flush causing
> ultimately an OOM exception.
> {noformat}
>
> NOTE: reproduce with: ant test -Dtestcase=TestIndexingSequenceNumbers -Dtests.method=testStressConcurrentCommit -Dtests.seed=A04943A98C8E2954 -Dtests.nightly=true -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=vo-001 -Dtests.timezone=Africa/Ouagadougou -Dtests.asserts=true -Dtests.file.encoding=UTF8
> 06:06:15 [junit4] ERROR    107s J3 | TestIndexingSequenceNumbers.testStressConcurrentCommit <<<
> 06:06:15 [junit4]    > Throwable #1: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
> 06:06:15 [junit4]    >     at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:876)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:890)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3727)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.TestIndexingSequenceNumbers.testStressConcurrentCommit(TestIndexingSequenceNumbers.java:228)
> 06:06:15 [junit4]    >     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 06:06:15 [junit4]    >     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 06:06:15 [junit4]    >     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 06:06:15 [junit4]    >     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> 06:06:15 [junit4]    >     at java.base/java.lang.Thread.run(Thread.java:834)
> 06:06:15 [junit4]    > Caused by: java.lang.OutOfMemoryError: Java heap space
> 06:06:15 [junit4]    >     at __randomizedtesting.SeedInfo.seed([A04943A98C8E2954]:0)
> 06:06:15 [junit4]    >     at org.apache.lucene.store.GrowableByteArrayDataOutput.<init>(GrowableByteArrayDataOutput.java:46)
> 06:06:15 [junit4]    >     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.<init>(CompressingStoredFieldsWriter.java:111)
> 06:06:15 [junit4]    >     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat.fieldsWriter(CompressingStoredFieldsFormat.java:130)
> 06:06:15 [junit4]    >     at org.apache.lucene.codecs.lucene87.Lucene87StoredFieldsFormat.fieldsWriter(Lucene87StoredFieldsFormat.java:141)
> 06:06:15 [junit4]    >     at org.apache.lucene.codecs.asserting.AssertingStoredFieldsFormat.fieldsWriter(AssertingStoredFieldsFormat.java:48)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.StoredFieldsConsumer.initStoredFieldsWriter(StoredFieldsConsumer.java:39)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.StoredFieldsConsumer.startDocument(StoredFieldsConsumer.java:46)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.DefaultIndexingChain.startStoredFields(DefaultIndexingChain.java:426)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:462)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.DocumentsWriterPerThread.updateDocuments(DocumentsWriterPerThread.java:233)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:419)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1470)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1463)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.TestIndexingSequenceNumbers$2.run(TestIndexingSequenceNumbers.java:206)
> Throwable #2: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=1140, name=Thread-1095, state=RUNNABLE, group=TGRP-TestIndexingSequenceNumbers]
> 06:06:15 [junit4]    > Caused by: java.lang.OutOfMemoryError: Java heap space
> Throwable #3: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=1139, name=Thread-1094, state=RUNNABLE, group=TGRP-TestIndexingSequenceNumbers]
> 06:06:15 [junit4]    > Caused by: java.lang.OutOfMemoryError: Java heap space
> 06:06:15 [junit4]    >     at __randomizedtesting.SeedInfo.seed([A04943A98C8E2954]:0)
> 06:06:15 [junit4]    >     at org.apache.lucene.store.GrowableByteArrayDataOutput.<init>(GrowableByteArrayDataOutput.java:46)
> 06:06:15 [junit4]    >     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.<init>(CompressingStoredFieldsWriter.java:111)
> 06:06:15 [junit4]    >     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat.fieldsWriter(CompressingStoredFieldsFormat.java:130)
> 06:06:15 [junit4]    >     at org.apache.lucene.codecs.lucene87.Lucene87StoredFieldsFormat.fieldsWriter(Lucene87StoredFieldsFormat.java:141)
> 06:06:15 [junit4]    >     at org.apache.lucene.codecs.asserting.AssertingStoredFieldsFormat.fieldsWriter(AssertingStoredFieldsFormat.java:48)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.StoredFieldsConsumer.initStoredFieldsWriter(StoredFieldsConsumer.java:39)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.StoredFieldsConsumer.startDocument(StoredFieldsConsumer.java:46)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.DefaultIndexingChain.startStoredFields(DefaultIndexingChain.java:426)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:462)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.DocumentsWriterPerThread.updateDocuments(DocumentsWriterPerThread.java:233)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:419)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1470)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1463)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.TestIndexingSequenceNumbers$2.run(TestIndexingSequenceNumbers.java:206)
> Throwable #4: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=1137, name=Thread-1092, state=RUNNABLE, group=TGRP-TestIndexingSequenceNumbers]
> 06:06:15 [junit4]    > Caused by: java.lang.OutOfMemoryError: Java heap space
> 06:06:15 [junit4]    >     at __randomizedtesting.SeedInfo.seed([A04943A98C8E2954]:0)
> 06:06:15 [junit4]    >     at org.apache.lucene.store.GrowableByteArrayDataOutput.<init>(GrowableByteArrayDataOutput.java:46)
> 06:06:15 [junit4]    >     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.<init>(CompressingStoredFieldsWriter.java:111)
> 06:06:15 [junit4]    >     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat.fieldsWriter(CompressingStoredFieldsFormat.java:130)
> 06:06:15 [junit4]    >     at org.apache.lucene.codecs.lucene87.Lucene87StoredFieldsFormat.fieldsWriter(Lucene87StoredFieldsFormat.java:141)
> 06:06:15 [junit4]    >     at org.apache.lucene.codecs.asserting.AssertingStoredFieldsFormat.fieldsWriter(AssertingStoredFieldsFormat.java:48)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.StoredFieldsConsumer.initStoredFieldsWriter(StoredFieldsConsumer.java:39)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.StoredFieldsConsumer.startDocument(StoredFieldsConsumer.java:46)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.DefaultIndexingChain.startStoredFields(DefaultIndexingChain.java:426)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:462)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.DocumentsWriterPerThread.updateDocuments(DocumentsWriterPerThread.java:233)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:419)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1470)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1463)
> 06:06:15 [junit4]    >     at org.apache.lucene.index.TestIndexingSequenceNumbers$2.run(TestIndexingSequenceNumbers.java:206)
> Throwable #5: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=1141, name=Thread-1096, state=RUNNABLE, group=TGRP-TestIndexingSequenceNumbers]
> 06:06:15 [junit4]    > Caused by: java.lang.OutOfMemoryError: Java heap space
> Throwable #6: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=1138, name=Thread-1093, state=RUNNABLE, group=TGRP-TestIndexingSequenceNumbers]
> 06:06:15 [junit4]    > Caused by: java.lang.OutOfMemoryError: Java heap space
> 06:06:15 [junit4]   2> NOTE: leaving temporary files on disk at: /var/lib/jenkins/workspace/apache+lucene-solr+nightly+branch_8x/lucene/build/core/test/J3/temp/lucene.index.TestIndexingSequenceNumbers_A04943A98C8E2954-001
> 06:06:15 [junit4]   2> NOTE: test params are: codec=Asserting(Lucene87): \{id=PostingsFormat(name=LuceneVarGapDocFreqInterval)}, docValues:\{thread=DocValuesFormat(name=Lucene80), ___soft_deletes=DocValuesFormat(name=Asserting)}, maxPointsInLeafNode=837, maxMBSortInHeap=5.048657222730272, sim=Asserting(RandomSimilarity(queryNorm=true): \{id=IB SPL-D2}), locale=vo-001, timezone=Africa/Ouagadougou
> 06:06:15 [junit4]   2> NOTE: Linux 4.12.14-122.32-default amd64/Oracle Corporation 11.0.2 (64-bit)/cpus=32,threads=1,free=444391792,total=536870912
> 06:06:15 [junit4]   2> NOTE: All tests run in this JVM: [TestApproximationSearchEquivalence, TestPositionIncrement, TestCachingTokenFilter, TestSimpleExplanationsOfNonMatches, TestPostingsOffsets, TestTimSorter, TestSpanNearQuery, TestConstantScoreScorer, TestSpanExplanations, TestIndexingSequenceNumbers]
>
> {noformat}
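
To make the accounting gap described in the issue concrete, here is a minimal sketch in plain Java. It is not Lucene code: the class and method names are hypothetical, and the per-writer buffer size of 1 MiB and the 4 KiB of tracked bytes are assumptions chosen only for illustration (the issue does not state the actual chunk size). The writer count of 300 and the heap size come from the issue text and the `total=536870912` value in the test log.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model, not Lucene's API: shows how untracked per-writer buffers skew RAM totals. */
public class DwptAccountingSketch {

  /** Stand-in for one DWPT: bytes the accounting already sees, plus a stored-fields buffer. */
  static final class Writer {
    final long trackedBytes;       // memory already reported to the flush policy
    final long storedFieldsBytes;  // memory held by the stored fields writer

    Writer(long trackedBytes, long storedFieldsBytes) {
      this.trackedBytes = trackedBytes;
      this.storedFieldsBytes = storedFieldsBytes;
    }

    /** Reports memory with or without the stored-fields buffer. */
    long ramBytesUsed(boolean includeStoredFields) {
      return includeStoredFields ? trackedBytes + storedFieldsBytes : trackedBytes;
    }
  }

  /** Sums what the accounting would see across all writers. */
  static long totalBytes(List<Writer> writers, boolean includeStoredFields) {
    long sum = 0;
    for (Writer w : writers) {
      sum += w.ramBytesUsed(includeStoredFields);
    }
    return sum;
  }

  public static void main(String[] args) {
    long heapBytes = 536870912L;         // "total=" value from the test log
    long assumedBufferBytes = 1L << 20;  // assumption: 1 MiB per writer, illustration only
    long assumedTrackedBytes = 4096L;    // assumption: small tracked footprint per writer

    List<Writer> writers = new ArrayList<>();
    for (int i = 0; i < 300; i++) {      // "300+ DWPTs" from the issue description
      writers.add(new Writer(assumedTrackedBytes, assumedBufferBytes));
    }

    System.out.println("heap bytes:                 " + heapBytes);
    System.out.println("accounted without buffers:  " + totalBytes(writers, false));
    System.out.println("accounted with buffers:     " + totalBytes(writers, true));
  }
}
```

Under these assumed numbers the total that excludes the stored-fields buffers stays tiny while the real footprint is a large fraction of the heap, which is the kind of mismatch that would keep stalling and flush decisions from triggering in time.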