[ https://issues.apache.org/jira/browse/LUCENE-9189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025992#comment-17025992 ]
ASF subversion and git services commented on LUCENE-9189:
---------------------------------------------------------

Commit 4773574578f089802fe3f36bff6951c4a29a3628 in lucene-solr's branch refs/heads/gradle-master from Robert Muir
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=4773574 ]

LUCENE-9189: TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes

The issue is that MockDirectoryWrapper's disk full check is horribly
inefficient. On every writeByte/etc, it totally recomputes disk space
across all files. This means it calls listAll() on the underlying
Directory (which sorts all the underlying files), then sums up
fileLength() for each of those files.

This leads to many pathological cases in the disk full tests... but the
number of tests impacted by this is minimal, and the logic is scary.


> TestIndexWriterDelete.testDeletesOnDiskFull can run for minutes
> ---------------------------------------------------------------
>
>                 Key: LUCENE-9189
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9189
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Robert Muir
>            Priority: Major
>             Fix For: master (9.0)
>
>
> I thought it was just the testUpdatesOnDiskFull, but it looks like this one
> needs to be nightly too.
> Should look more into the test, but I know something causes it to make such
> an insane amount of files that sorting them becomes a bottleneck.
> I guess also related is that it would be great if MockDirectoryWrapper's disk
> full check didn't trigger a sort of the files (via listAll): it does this
> check on like every i/o, would be nice for it to be less absurd. Maybe
> instead the test could check for disk full on not every i/o but some random
> sample of them?
> Temporarily let's make it nightly...
> {noformat}
> PROFILE SUMMARY from 182501 samples
>   tests.profile.count=10
>   tests.profile.stacksize=1
>   tests.profile.linenumbers=false
> PERCENT  SAMPLES  STACK
> 15.89%   28995    java.lang.StringLatin1#compareTo()
> 6.61%    12069    java.util.TimSort#mergeHi()
> 5.96%    10878    java.util.TimSort#binarySort()
> 3.41%    6231     java.util.concurrent.ConcurrentHashMap#tabAt()
> 2.98%    5433     java.util.Comparators$NaturalOrderComparator#compare()
> 2.12%    3876     org.apache.lucene.store.DataOutput#copyBytes()
> 2.03%    3712     java.lang.String#compareTo()
> 1.84%    3350     java.util.concurrent.ConcurrentHashMap#get()
> 1.83%    3337     java.util.TimSort#mergeLo()
> 1.67%    3047     java.util.ArrayList#add()
> {noformat}
> All the file sorting is called from stacks like this, so it's literally
> happening on every writeByte() and so on
> {noformat}
> 0.73%    1329     java.util.TimSort#binarySort()
>                     at java.util.TimSort#sort()
>                     at java.util.Arrays#sort()
>                     at java.util.ArrayList#sort()
>                     at java.util.stream.SortedOps$RefSortingSink#end()
>                     at java.util.stream.AbstractPipeline#copyInto()
>                     at java.util.stream.AbstractPipeline#wrapAndCopyInto()
>                     at java.util.stream.AbstractPipeline#evaluate()
>                     at java.util.stream.AbstractPipeline#evaluateToArrayNode()
>                     at java.util.stream.ReferencePipeline#toArray()
>                     at org.apache.lucene.store.ByteBuffersDirectory#listAll()
>                     at org.apache.lucene.store.MockDirectoryWrapper#sizeInBytes()
>                     at org.apache.lucene.store.MockIndexOutputWrapper#checkDiskFull()
>                     at org.apache.lucene.store.MockIndexOutputWrapper#writeBytes()
>                     at org.apache.lucene.store.MockIndexOutputWrapper#writeByte()
>                     at org.apache.lucene.store.DataOutput#writeInt()
>                     at org.apache.lucene.codecs.CodecUtil#writeFooter()
>                     at org.apache.lucene.codecs.lucene50.Lucene50LiveDocsFormat#writeLiveDocs()
>                     at org.apache.lucene.codecs.asserting.AssertingLiveDocsFormat#writeLiveDocs()
>                     at org.apache.lucene.index.PendingDeletes#writeLiveDocs()
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
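For illustration only, the cost pattern described in the commit message (a sorted listAll() plus a fileLength() sum on every write) can be contrasted with keeping a running byte total. The sketch below is a toy model and not the code from the commit: ToyDirectory, sizeByRescan, writeWithRescan and TrackingWriter are hypothetical names, no Lucene classes are used, and deletes and renames are ignored. An unchecked exception stands in for the IOException a real directory wrapper would presumably throw.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Illustration only: a toy model, not Lucene's MockDirectoryWrapper. */
public class DiskFullCheckSketch {

  /** Hypothetical in-memory "directory": file name -> length in bytes. */
  static final class ToyDirectory {
    private final Map<String, Long> lengths = new HashMap<>();
    long listAllCalls = 0; // counts how often the sorted listing is built

    /** Returns all file names sorted, like the listAll() seen in the profile. */
    String[] listAll() {
      listAllCalls++;
      String[] names = lengths.keySet().toArray(new String[0]);
      Arrays.sort(names);
      return names;
    }

    long fileLength(String name) {
      return lengths.getOrDefault(name, 0L);
    }

    void append(String name, long numBytes) {
      lengths.merge(name, numBytes, Long::sum);
    }
  }

  /** Full recomputation: one sort plus one length lookup per file. */
  static long sizeByRescan(ToyDirectory dir) {
    long total = 0;
    for (String name : dir.listAll()) {
      total += dir.fileLength(name);
    }
    return total;
  }

  /** Pattern from the issue: rescan the whole directory before every write. */
  static void writeWithRescan(ToyDirectory dir, long maxBytes, String name, long numBytes) {
    if (sizeByRescan(dir) + numBytes > maxBytes) {
      throw new IllegalStateException("fake disk full"); // stand-in for IOException
    }
    dir.append(name, numBytes);
  }

  /** Assumed alternative: scan once, then update a running total per write. */
  static final class TrackingWriter {
    private final ToyDirectory dir;
    private final long maxBytes;
    private long usedBytes;

    TrackingWriter(ToyDirectory dir, long maxBytes) {
      this.dir = dir;
      this.maxBytes = maxBytes;
      this.usedBytes = sizeByRescan(dir); // single up-front scan
    }

    void write(String name, long numBytes) {
      if (usedBytes + numBytes > maxBytes) {
        throw new IllegalStateException("fake disk full"); // stand-in for IOException
      }
      dir.append(name, numBytes);
      usedBytes += numBytes;
    }

    long usedBytes() {
      return usedBytes;
    }
  }

  public static void main(String[] args) {
    ToyDirectory rescanned = new ToyDirectory();
    for (int i = 0; i < 1000; i++) {
      writeWithRescan(rescanned, 1_000_000L, "file" + i, 1);
    }
    ToyDirectory tracked = new ToyDirectory();
    TrackingWriter writer = new TrackingWriter(tracked, 1_000_000L);
    for (int i = 0; i < 1000; i++) {
      writer.write("file" + i, 1);
    }
    System.out.println("rescan listAll calls:   " + rescanned.listAllCalls);
    System.out.println("tracking listAll calls: " + tracked.listAllCalls);
  }
}
```

In this toy model the rescanning variant builds a sorted listing once per write, while the tracking variant builds it once in the constructor. A real wrapper would also have to adjust the total on deletes, renames and truncation, which is part of why the existing logic may be hard to change safely.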