I think 100 million documents is a realistic number for a single shard.
Maybe 250 million depending on your data. But I would say that anything beyond
that is unrealistic. In some cases, even 50 million might be too much for a
single shard, depending on the data and query usage. Sure, maybe depending
on your data 2 billion documents might work, but I wouldn't bet on it. And
even if you manage to index 500 million or more documents on a single shard,
memory and performance for production query loads would be questionable.
Query capacity also depends on things like the number of faceted fields (i.e.,
the field cache), string field size, the number of unique terms in each field,
the Solr query cache, and highlighting of large fields. You also want to leave
enough capacity so that the number of documents can grow over time.
As an experiment, index 250 million documents in one shard and see how
typical queries perform, and how much JVM memory you use and still have
available. Make sure to try quite a few queries (using a script), especially
if any fields are faceted or highlighted. Then you can decide whether you
feel comfortable trying a larger shard size or if a smaller size is needed.
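
A minimal sketch of such a query script, in Python, is below. It assumes a Solr instance reachable at http://localhost:8983/solr/select and JSON responses (wt=json); the URL, the helper names, and the example queries are illustrative assumptions, not part of the original advice. It records wall-clock time next to the QTime that Solr reports, and summarizes the latencies. JVM memory has to be watched separately while it runs.

```python
import json
import math
import statistics
import time
import urllib.parse
import urllib.request

# Assumed endpoint; adjust host, port, and core name for your installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_query_url(base_url, q, facet_field=None, rows=10):
    """Build a select URL, optionally faceting on one field."""
    params = [("q", q), ("rows", rows), ("wt", "json")]
    if facet_field:
        params += [("facet", "true"), ("facet.field", facet_field)]
    return base_url + "?" + urllib.parse.urlencode(params)


def run_query(url, timeout=60):
    """Run one query; return wall-clock time, Solr's QTime, and numFound."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = json.load(resp)
    wall_ms = (time.perf_counter() - start) * 1000.0
    return {
        "wall_ms": wall_ms,
        "qtime_ms": body["responseHeader"]["QTime"],
        "num_found": body["response"]["numFound"],
    }


def summarize(latencies_ms):
    """Median, 95th percentile (nearest rank), and max of the latencies."""
    ordered = sorted(latencies_ms)
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "count": len(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[p95_index],
        "max": ordered[-1],
    }


def benchmark(queries, facet_field=None, base_url=SOLR_SELECT_URL):
    """Run every query once and summarize the wall-clock latencies."""
    results = [run_query(build_query_url(base_url, q, facet_field))
               for q in queries]
    return summarize([r["wall_ms"] for r in results])
```

Calling benchmark() with a list of typical production queries, once without and once with facet_field set to a field that is actually faceted, gives a rough comparison at each shard size you try.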
-- Jack Krupansky
-----Original Message-----
From: tosenthu
Sent: Monday, May 28, 2012 1:25 PM
To: solr-user@lucene.apache.org
Subject: Re: Negative value in numFound
About 14.5 GB of RAM is allocated to Tomcat.
I have 2 shards now. I was under the impression that I could handle this with a
couple of shards, but in this case each shard can only grow up to 2^31-1
records, so I need many such shards to support 12 billion records.
I will try to have more cores and distribute updates between them. That brings
me to my next question: is there any configuration that makes a core reject
updates once it holds a given number of records? And is it possible to split an
index into 2 or more indexes based on a query?
In any case, my network will have 2 Solr servers participating in indexing and
search. I probably need at least 6 cores distributed across these
machines to support 12 billion records. What do you say?
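
The figure of 6 cores follows from the 2^31-1 per-index limit. A quick sketch of that arithmetic in Python is below, next to the more conservative shard sizes suggested in the reply (100 and 250 million documents). The function name and the even split across the 2 servers are illustrative assumptions.

```python
# Per-index document ceiling discussed in this thread.
MAX_DOCS_PER_INDEX = 2**31 - 1

TOTAL_DOCS = 12_000_000_000  # target corpus size from the question
SERVERS = 2                  # Solr servers available


def shards_needed(total_docs, docs_per_shard):
    """Smallest number of shards that can hold total_docs (integer ceiling)."""
    return -(-total_docs // docs_per_shard)


for label, per_shard in [
    ("hard limit (2^31-1)", MAX_DOCS_PER_INDEX),
    ("250 million", 250_000_000),
    ("100 million", 100_000_000),
]:
    shards = shards_needed(TOTAL_DOCS, per_shard)
    per_server = shards_needed(shards, SERVERS)
    print(f"{label}: {shards} shards, about {per_server} per server")
```

At the hard limit this gives 6 shards with no headroom at all; at 250 million or 100 million documents per shard it gives 48 or 120 shards, which is far more than 2 servers with 14.5 GB each are likely to hold comfortably.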
--
View this message in context:
http://lucene.472066.n3.nabble.com/Negative-value-in-numFound-tp3986398p3986453.html
Sent from the Solr - User mailing list archive at Nabble.com.