Hi,

We are facing a high incoming rate of mostly small documents (logs). The rate is initially assumed to be 2K docs/sec but could reach as high as 20K/sec, so at 2K/sec a year's worth of data comes to roughly 60G (about 63 billion) searchable documents.
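For reference, a quick sanity check of that arithmetic (assuming a sustained 2K docs/sec and Lucene's hard cap of 2^31 - 1 documents per index, which is what bounds a single shard):

```python
import math

# Assumed sustained ingest rate (docs/sec) and Lucene's hard limit of
# 2^31 - 1 documents per index (one shard == one Lucene index).
RATE_PER_SEC = 2_000
MAX_DOCS_PER_SHARD = 2**31 - 1

docs_per_year = RATE_PER_SEC * 86_400 * 365           # seconds in a year
min_shards = math.ceil(docs_per_year / MAX_DOCS_PER_SHARD)

print(docs_per_year)  # 63072000000 (~63 billion)
print(min_shards)     # 30
```

So 30 shards is the absolute floor for one year at the low end of the rate, before leaving any headroom.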
Since a single shard can hold no more than 2G (2^31 - 1) documents, we would need at least 30 shards per year. And since we don't want to fill shards to their maximum capacity, the number we actually need will be considerably higher.

My question is whether there is a hard limit (not possible) or a soft limit (bad performance) on the number of shards per SolrCloud cluster. ZooKeeper's default znode size limit is 1MB (jute.maxbuffer), so I guess that imposes some limit on cluster state. If I raise that value, will SolrCloud really scale OK with thousands of shards? Or would I be better off using multiple SolrCloud clusters to handle the data (result aggregation is done outside of SolrCloud)?

Thanks,
Zhifeng