The fact that all the shards have the same leader is somewhat of a red herring. Until you get hundreds of shards (perhaps across a _lot_ of collections), the additional load on the leaders is hard to measure. If you really see this as a problem, consider the BALANCESHARDUNIQUE and REBALANCELEADERS Collection API commands, see: https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-BalanceSliceUnique
That said, your OOM errors indicate you simply have too many Solr collections doing too many things with too little memory. bq: All other non-leader server are relying on Leader to finish the new document index. This is not the case. The indexing is done independently on all replicas. What's _probably_ happening here is that the leaders are spinning off threads to pass the data on to the replicas and you're running so close to the heap limit that spinning up those threads is pushing you to OOM errors. And, if my hypothesis is true you'll soon run into problems on the non-leaders as you index more and more documents to your collections. Consider some serious effort in terms of determining your hardware/JVM needs, see: https://lucidworks.com/blog/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/ Best, Erick