Hi,

We are running QA performance tests on a couple of collections that hold 2 billion and 3.5 billion docs respectively. Indexing happens from a separate client using SolrJ, with 10 threads and a batch size of 1000. For the last 2-3 weeks we have been seeing either slow indexing or timeout errors while indexing. While troubleshooting, we noticed that indexing slows down when peak disk IO utilization runs high, and that timeouts occur when disk IO stays near 100%.
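For readers who want a concrete picture of the client shape described above (10 threads, batches of 1000), here is a minimal sketch. It is not the actual client code: `BatchSender` and `BatchIndexerSketch` are hypothetical names, documents are plain strings, and the sender stands in for whatever SolrJ call really ships a batch to Solr.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the real SolrJ call that sends one batch. */
interface BatchSender {
    void send(List<String> batch);
}

class BatchIndexerSketch {

    /**
     * Splits docs into batches of batchSize, sends them on a fixed-size
     * thread pool, waits for all of them, and returns the number of batches.
     */
    static int indexAll(List<String> docs, int threads, int batchSize, BatchSender sender) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<?>> pending = new ArrayList<>();
        try {
            for (int start = 0; start < docs.size(); start += batchSize) {
                int end = Math.min(start + batchSize, docs.size());
                // Copy the slice so each task owns its batch.
                List<String> batch = new ArrayList<>(docs.subList(start, end));
                pending.add(pool.submit(() -> sender.send(batch)));
            }
            for (Future<?> f : pending) {
                f.get(); // rethrows any failure (for example a timeout) from the sender
            }
            return pending.size();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while indexing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("batch failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            docs.add("doc-" + i);
        }
        AtomicInteger sent = new AtomicInteger();
        // Thread count and batch size mirror the numbers quoted in this email.
        int batches = indexAll(docs, 10, 1000, batch -> sent.addAndGet(batch.size()));
        System.out.println(batches + " batches, " + sent.get() + " docs");
    }
}
```

With this shape, up to 10 batches of 1000 docs can be in flight at once, so a slow response on the Solr side holds a client thread until the sender's timeout fires.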
A few questions:

1. Our performance team noticed that read operations far outnumber write operations, at roughly a 100:1 ratio. Is this expected during indexing, or are the Solr nodes doing other work such as syncing?
2. ZooKeeper reports a latency of (min/avg/max: 0/0/2205). Can this latency cause instability in the ZK or Solr clusters, or affect indexing or search operations?
3. Our client timeout is set to 2 minutes. Can it be increased further? Would that help, or would it create other problems?
4. When we created an empty collection and loaded the same data file, it loaded without any issues. Does having more documents in a collection cause such problems?

Any suggestions or feedback would be really appreciated.

Solr version: 7.7.1

Timeout error snippet:

ERROR (updateExecutor-3-thread-30055-processing-x:TestCollection_shard5_replica_n18 https:////localhost:1122//solr//TestCollection_shard6_replica_n22 r:core_node21 n:localhost:1122_solr c:TestCollection s:shard5) [c:TestCollection s:shard5 r:core_node21 x:TestCollection_shard5_replica_n18] o.a.s.u.ErrorReportingConcurrentUpdateSolrClient error
java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method) ~[?:1.8.0_212]
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) ~[?:1.8.0_212]
    at java.net.SocketInputStream.read(SocketInputStream.java:171) ~[?:1.8.0_212]
    at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[?:1.8.0_212]
    at sun.security.ssl.InputRecord.readFully(InputRecord.java:465) ~[?:1.8.0_212]
    at sun.security.ssl.InputRecord.read(InputRecord.java:503) ~[?:1.8.0_212]
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975) ~[?:1.8.0_212]
    at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:933) ~[?:1.8.0_212]
    at sun.security.ssl.AppInputStream.read(AppInputStream.java:105) ~[?:1.8.0_212]
    at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125) ~[httpcore-4.4.10.jar:4.4.10]
    at org.apache.solr.util.stats.InstrumentedHttpRequestExecutor.execute(InstrumentedHttpRequestExecutor.java:120) ~[solr-core-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 - ishan - 2019-02-23 02:39:07]
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56) ~[httpclient-4.5.6.jar:4.5.6]
    at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:349) ~[solr-solrj-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 - ishan - 2019-02-23 02:39:09]
    at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:183) ~[solr-solrj-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 - ishan - 2019-02-23 02:39:09]
    at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176) ~[metrics-core-3.2.6.jar:3.2.6]
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209) ~[solr-solrj-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 - ishan - 2019-02-23 02:39:09]
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.dt_access$303(ExecutorUtil.java) ~[solr-solrj-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 - ishan - 2019-02-23 02:39:09]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_212]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_212]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_212]

2020-07-01 21:02:58.301 ERROR (updateExecutor-3-thread-30033-processing-x:TestCollection_shard5_replica_n18 https:////localhost:1122//solr//TestCollection_shard5_replica_n6 r:core_node21 n:localhost:1122_solr c:TestCollection s:shard5) [c:TestCollection s:shard5 r:core_node21 x:TestCollection_shard5_replica_n18] o.a.s.u.ErrorReportingConcurrentUpdateSolrClient error
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at https://localhost:1122/solr/TestCollection_shard5_replica_n6: Server Error

request: https://localhost:1122/solr/TestCollection_shard5_replica_n6/update?update.chain=uuid&update.distrib=FROMLEADER&distrib.from=https%3A%2F%2Flocalhost%3A1122%2Fsolr%2FTestCollection_shard5_replica_n18%2F&wt=javabin&version=2
Remote error message: java.util.concurrent.TimeoutException: Idle timeout expired: 600000/600000 ms
    at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:385) ~[solr-solrj-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 - ishan - 2019-02-23 02:39:09]
    at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:183) ~[solr-solrj-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 - ishan - 2019-02-23 02:39:09]
    at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176) ~[metrics-core-3.2.6.jar:3.2.6]
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209) ~[solr-solrj-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 - ishan - 2019-02-23 02:39:09]
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.dt_access$303(ExecutorUtil.java) ~[solr-solrj-7.7.1.jar:7.7.1 5bf96d32f88eb8a2f5e775339885cd6ba84a3b58 - ishan - 2019-02-23 02:39:09]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_212]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_212]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_212]

2020-07-01 21:02:58.322 ERROR (updateExecutor-3-thread-30027-processing-x:TestCollection_shard5_replica_n18 https:////localhost:1122//solr//TestCollection_shard3_replica_n14 r:core_node21 n:localhost:1122_solr c:TestCollection s:shard5) [c:TestCollection s:shard5 r:core_node21 x:TestCollection_shard5_replica_n18] o.a.s.u.ErrorReportingConcurrentUpdateSolrClient error
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at https://localhost:1122/solr/TestCollection_shard3_replica_n14: Server Error

Thanks & Regards,
Vinodh