[ https://issues.apache.org/jira/browse/SOLR-14923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17248237#comment-17248237 ]
Thomas Wöckinger commented on SOLR-14923:
-----------------------------------------

[~dsmiley] I have run some performance tests, and the results are very promising: when indexing (only adding new documents) with 16 threads, 14 to 15 threads are fully utilized. The results are the same as without nested documents.

I have also done some profiling using JMC. As expected, there is no contention from DistributedUpdateProcessor. There is still heavy contention on the UpdateLog.add() method, but optimizing that will be hard work. Maybe it would be better to remove this part if RTG (real-time get) is not used that much, but that's another story.

I hope you have time to review soon. Thanks in advance.

> Indexing performance is unacceptable when child documents are involved
> ----------------------------------------------------------------------
>
>                 Key: SOLR-14923
>                 URL: https://issues.apache.org/jira/browse/SOLR-14923
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: update, UpdateRequestProcessors
>    Affects Versions: 8.3, 8.4, 8.5, 8.6, master (9.0)
>            Reporter: Thomas Wöckinger
>            Priority: Critical
>              Labels: performance, pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Parallel indexing does not make sense at the moment when child documents are
> used.
> The org.apache.solr.update.processor.DistributedUpdateProcessor checks at the
> end of the method doVersionAdd whether the ulog caches should be refreshed.
> This check returns true if any child document is included in the
> AddUpdateCommand.
> If so, ulog.openRealtimeSearcher(); is called. This call is very expensive
> and is executed in a synchronized block on the UpdateLog instance, so all
> other operations on the UpdateLog are blocked as well.
> Because every important UpdateLog method (add, delete, ...) runs in a
> synchronized block, almost every operation is blocked.
> This reduces multi-threaded index updates to single-threaded behavior.
> The described behavior does not depend on any option of the UpdateRequest,
> so it makes no difference whether 'waitFlush', 'waitSearcher' or
> 'softCommit' is true or false.
> The described behavior makes ChildDocuments effectively unusable, because
> the performance is unacceptable.
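
For readers who want to see the locking effect in isolation, here is a minimal sketch of the mechanism the issue describes. It does not use any Solr code: ToyUpdateLog, expensiveRefresh and ContentionDemo are hypothetical names, and the sleep is only an assumed stand-in for the cost of ulog.openRealtimeSearcher(). The point is that a cheap add() and an expensive call sharing one monitor force all indexing threads to run one at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Toy stand-in for an update log. NOT Solr's UpdateLog; all names are hypothetical.
class ToyUpdateLog {
    private final List<String> entries = new ArrayList<>();
    private final AtomicInteger inside = new AtomicInteger();
    private final AtomicInteger maxInside = new AtomicInteger();

    // Cheap operation, but it needs the same monitor as the expensive one below.
    synchronized void add(String doc) {
        enter();
        try {
            entries.add(doc);
        } finally {
            exit();
        }
    }

    // Assumption: sleeping stands in for the cost of reopening a realtime searcher.
    synchronized void expensiveRefresh(long millis) {
        enter();
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            exit();
        }
    }

    synchronized int size() {
        return entries.size();
    }

    // Highest number of threads ever observed inside add() or expensiveRefresh() at once.
    int maxConcurrent() {
        return maxInside.get();
    }

    private void enter() {
        maxInside.accumulateAndGet(inside.incrementAndGet(), Math::max);
    }

    private void exit() {
        inside.decrementAndGet();
    }
}

class ContentionDemo {
    // Adds docsPerThread documents on each of `threads` threads; returns elapsed milliseconds.
    static long run(ToyUpdateLog log, int threads, int docsPerThread,
                    boolean withChildDocs, long refreshMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.submit(() -> {
                for (int i = 0; i < docsPerThread; i++) {
                    log.add("doc-" + id + "-" + i);
                    if (withChildDocs) {
                        // Mirrors the described check: refresh whenever child docs are present.
                        log.expensiveRefresh(refreshMillis);
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        long plain = run(new ToyUpdateLog(), 8, 20, false, 5);
        long nested = run(new ToyUpdateLog(), 8, 20, true, 5);
        System.out.println("without refresh: " + plain + " ms");
        System.out.println("with refresh under the lock: " + nested + " ms");
    }
}
```

In this sketch the second run performs 8 x 20 refreshes of about 5 ms each, and because they all hold the same monitor they cannot overlap, so adding threads does not shorten that run. Moving the expensive call out of the shared lock, or skipping it when it is not needed, is what lets the threads run in parallel again.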