Hi,

We are facing a high incoming rate of mostly small documents (logs). The
incoming rate is initially assumed to be 2K/sec but could reach as high as
20K/sec. So a year's worth of data could reach roughly 60G searchable
documents (assuming the 2K/sec rate).

Since a single shard can contain no more than 2G documents, we will need at
least 30 shards per year. And since we don't want to fill shards to their
maximum capacity, the number of shards we actually need will be
considerably higher.
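For reference, here is the back-of-envelope math behind those numbers as a
small Python sketch (the 2^31 - 1 per-shard ceiling is the Lucene per-index
document limit; the `headroom` parameter is just my own knob for not filling
shards completely):

```python
import math

# Lucene hard limit: a single index (i.e. one shard) can hold at most
# 2^31 - 1 documents, roughly 2.14 billion ("2G").
MAX_DOCS_PER_SHARD = 2**31 - 1


def shards_per_year(docs_per_sec, headroom=1.0):
    """Minimum shard count for one year of ingest.

    headroom > 1.0 reserves spare capacity per shard,
    e.g. headroom=2.0 fills each shard only halfway.
    """
    docs_per_year = docs_per_sec * 365 * 24 * 3600
    return math.ceil(docs_per_year * headroom / MAX_DOCS_PER_SHARD)


print(shards_per_year(2_000))    # 30 shards at the baseline rate
print(shards_per_year(20_000))   # 294 shards at the peak rate
```

At 2K/sec that is about 63 billion documents per year, hence the "at least
30 shards" figure; at the 20K/sec peak it is closer to 300.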

My question is whether there is a hard (not possible) or soft (bad
performance) limit on the number of shards per SolrCloud cluster.
ZooKeeper's default maximum file size is 1M, so I guess that imposes some
limit. If I raise that value, will SolrCloud really scale OK with
thousands of shards? Or would I be better off using multiple SolrCloud
clusters to handle the data (result aggregation is done outside of
SolrCloud)?

Thanks,
Zhifeng
