On 4/22/2014 10:02 AM, yypvsxf19870706 wrote:
> I am curious about the effects of having more than 2 billion docs in a core, and we plan to have 5 billion docs per core.
> Please give me some suggestions about how to plan the number of docs in a core.
One Solr core contains one Lucene index, and it can't be divided any
further than that without a significant redesign. Quick note: although
SolrCloud can handle five billion documents with no problem, you can't
put all five billion in a single shard/core.
The only hard limitation in the entire system is that you can't have
more than about 2.1 billion documents (Integer.MAX_VALUE, which is
2,147,483,647) in a single Lucene index. This is because Lucene uses a
Java int (a signed 32-bit number) for its internal document
identifiers. Deleted documents that haven't yet been merged away also
count against that limit. It is theoretically possible to overcome this
limitation, but it would be a MAJOR change to Lucene, requiring major
changes in Solr as well.
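To put numbers on it, here's a minimal Java sketch of the arithmetic;
the only input is the 5 billion figure from your question, everything
else is just Integer.MAX_VALUE:

    public class LuceneDocLimit {
        public static void main(String[] args) {
            // Lucene addresses documents with a Java int, so one index tops out at:
            long maxDocsPerIndex = Integer.MAX_VALUE; // 2,147,483,647
            // Deleted docs hold on to their IDs until segments are merged,
            // so the ceiling for live documents can be even lower in practice.
            long planned = 5_000_000_000L; // the 5 billion docs you mentioned
            System.out.println("Max docs per index: " + maxDocsPerIndex);
            System.out.println("Fits in one core?   " + (planned <= maxDocsPerIndex));
        }
    }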
The other limitations you can run into with a large SolrCloud are mostly
a matter of configuration, system resources, and scaling to multiple
servers. They are not hard limitations in the software.
I would never put more than about 1 billion documents in a single core.
For performance reasons, it would be a good idea never to exceed a few
hundred million. When a high query rate is required, you may need to
run only one Solr core per server.
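To make that concrete, here's a rough sizing sketch in Java. The 200
million per-shard target and the replication factor of 2 are
assumptions for illustration, not specific recommendations for your
hardware:

    public class ShardPlan {
        public static void main(String[] args) {
            long totalDocs = 5_000_000_000L;  // planned corpus from your question
            long perShard  = 200_000_000L;    // assumed cap: "a few hundred million"
            long shards = (totalDocs + perShard - 1) / perShard; // ceiling division
            int  replicas = 2;                // assumed replication factor
            System.out.println("Shards needed: " + shards);            // 25
            System.out.println("Total cores:   " + shards * replicas); // 50
        }
    }

You would then create the collection with something like the
Collections API (host and collection name are placeholders):

    http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=25&replicationFactor=2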
Thanks,
Shawn