Hi,
    I am curious about the impact of having more than 2G docs in a core, as
we plan to have 5G docs per core.
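
    (For reference: the 2G figure is a hard ceiling, since Lucene addresses
each document in an index with a Java int, so a single core cannot hold
more than Integer.MAX_VALUE, about 2,147,483,647 docs; 5G docs would have
to be split across at least three cores/shards.)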

   Could you please give me some suggestions on how to plan the number of
docs per core?

    Thanks.

Sent from my iPhone

On 2014-4-22, at 12:30, Erick Erickson <erickerick...@gmail.com> wrote:

> You're going to run into resource issues long before you hit 2G
> docs/node, I suspect. I've seen anywhere from 50M to 300M docs on a
> single node.
> Fortunately you may well be near the upper end of that since you're
> dealing with log files.
> 
> Bottom line here is that you're off into largely uncharted territory
> when you start talking about hundreds of nodes. There's certainly work
> going on to make that work, but you'd be on the bleeding edge.
> 
> Best,
> Erick
> 
> On Mon, Apr 21, 2014 at 8:55 PM, Zhifeng Wang <zhifeng.wang...@gmail.com> 
> wrote:
>> Hi,
>> 
>> We are facing a high incoming rate of usually small documents (logs). The
>> incoming rate is initially assumed to be 2K/sec but could reach as high
>> as 20K/sec, so a year's worth of data could reach 60G searchable
>> documents (assuming the 2K/sec rate).
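>> 
>> (As a back-of-the-envelope check: 2,000 docs/sec * 86,400 sec/day * 365
>> days is roughly 63 billion, i.e. ~63G docs/year, so 60G is the right
>> order of magnitude; at 20K/sec it would be closer to 630G/year.)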
>> 
>> Since a single shard can contain no more than 2G documents, we will need at
>> least 30 shards per year. Considering that we don't want to fill shards
>> to their maximum capacity, the number of shards we need will be
>> considerably higher.
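>> 
>> (Roughly: 63G / 2.1G gives about 30 shards at the absolute per-shard
>> ceiling; at a more comfortable 200M-500M docs per shard, that becomes
>> on the order of 130-320 shards per year.)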
>> 
>> My question is whether there is a hard (not possible) or soft (bad
>> performance) limit on the number of shards per SolrCloud. ZooKeeper
>> limits its file (znode) size to 1M by default, so I guess that imposes
>> some limit. If I set that value to a larger number, will SolrCloud
>> really scale OK with thousands of shards? Or would I be better off
>> using multiple SolrCloud clusters to handle the data (result
>> aggregation is done outside of SolrCloud)?
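>> 
>> (A sketch of the knob in question, assuming the 1M figure refers to
>> ZooKeeper's jute.maxbuffer default, which caps znode size and hence how
>> large the cluster state SolrCloud keeps in ZooKeeper can grow: raising
>> it means passing the same JVM system property, e.g.
>> 
>>     -Djute.maxbuffer=4194304
>> 
>> to every ZooKeeper server and every client JVM, including Solr itself;
>> the 4M value here is only illustrative.)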
>> 
>> Thanks,
>> Zhifeng
