Hi all,

I know there are many tips on making Solr indexing faster. On the client side, probably the most important are batch indexing and multi-threaded indexing. There are also important server-side factors, which I do not want to go into here.

My question is: is there a best practice for the number of client threads and the batch size when indexing over a WAN? Since the client and the servers are connected over a WAN, performance conditions such as network latency and bandwidth differ from those on a LAN.

Another thing that matters to me is that document sizes can vary between scenarios. For example, when indexing web pages, a document might be anywhere from 1 KB to 200 KB. In that case, choosing the batch size by number of documents is probably not the best way to optimize indexing performance. From the network point of view, choosing the batch size in KB/MB would probably be better. From the Solr side, however, the number of documents matters.

To summarize my questions:

1- Is there a best practice for Solr client-side performance tuning over a WAN for indexing/reindexing/updating? Does it differ from tuning over a LAN?
2- Which matters more: the number of documents or the total size of the documents in a batch?
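
To make the second question concrete, here is a rough sketch of the kind of client-side batching I have in mind: a batch is closed when it reaches either a document-count cap or a byte-size cap, whichever comes first. This is only an illustration under my own assumptions. The function name `batch_documents`, the parameters `max_docs` and `max_bytes`, and their default values are hypothetical, not recommended settings. The JSON-serialized length is used as an approximation of the payload size on the wire, and the actual sending to Solr is left out.

```python
import json
from typing import Dict, Iterable, Iterator, List


def batch_documents(
    docs: Iterable[Dict],
    max_docs: int = 500,          # hypothetical cap on documents per batch
    max_bytes: int = 1_000_000,   # hypothetical cap on serialized bytes per batch
) -> Iterator[List[Dict]]:
    """Yield batches limited by both document count and serialized size."""
    batch: List[Dict] = []
    batch_bytes = 0
    for doc in docs:
        # Approximate the payload size by the UTF-8 length of the JSON form.
        doc_bytes = len(json.dumps(doc).encode("utf-8"))
        # Close the current batch if adding this document would break a cap.
        if batch and (len(batch) >= max_docs or batch_bytes + doc_bytes > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        # A single document larger than max_bytes still goes out, alone.
        batch.append(doc)
        batch_bytes += doc_bytes
    if batch:
        yield batch
```

With something like this, many 1 KB pages would be limited by `max_docs`, while a run of 200 KB pages would be limited by `max_bytes`. Each yielded batch could then be handed to one of the client threads for sending.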

Best regards,
-- 
A.Nazemian