How many documents were in that 20GB index?
I'm skeptical that a 1 billion document shard "won't be a problem." I mean,
technically it is possible, but as you are already experiencing, it may take
a long time and a very powerful machine to do so. 100 million (or 250
million max) would be a more realistic goal. Even then, it depends on your
doc size and machine size.
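
To put those per-shard targets in terms of the 10 billion documents from the
original question, here is a rough sketch of the shard-count arithmetic. The
function name is made up for illustration, and the targets are only the
ballpark figures above, not measured limits.

```python
def shards_needed(total_docs: int, docs_per_shard: int) -> int:
    """Smallest shard count such that no shard exceeds docs_per_shard."""
    # Integer ceiling division, avoids floating-point rounding.
    return -(-total_docs // docs_per_shard)


TOTAL_DOCS = 10_000_000_000  # 10 billion docs, from the original question

# Per-shard targets discussed in this thread (ballpark, not measured limits).
for target in (100_000_000, 250_000_000, 1_000_000_000):
    print(target, shards_needed(TOTAL_DOCS, target))
# prints:
# 100000000 100
# 250000000 40
# 1000000000 10
```

So the more realistic targets imply something like 40 to 100 shards rather
than 10, before replicas are counted.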
The main point from the previous discussion is that although the technical
hard limit for a Solr shard is 2G docs, from a practical perspective it is
very difficult to get to that limit, not that indexing 1 billion docs on a
single shard is "just fine"!
As a general rule, if you want fast queries for high volume, strive to
ensure that your per-shard index fits entirely into the system memory
available for OS caching of file system pages.
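
A back-of-the-envelope check of that rule might look like the sketch below.
Everything numeric here is an assumption for illustration: the 50 million doc
count for the 20GB sample, the 64GB machine, the 8GB JVM heap, and the 2GB
reserved for other processes are all hypothetical, and index size is assumed
to grow linearly with doc count, which real data may not follow.

```python
def estimated_index_gb(sample_index_gb: float, sample_docs: int,
                       docs_per_shard: int) -> float:
    """Linearly extrapolate per-shard index size from a measured sample."""
    return sample_index_gb * docs_per_shard / sample_docs


def fits_in_os_cache(index_gb: float, ram_gb: float, jvm_heap_gb: float,
                     other_gb: float = 2.0) -> bool:
    """True if the index fits in the RAM left over for OS file caching."""
    available_gb = ram_gb - jvm_heap_gb - other_gb
    return index_gb <= available_gb


# Hypothetical inputs: a 20GB sample index holding 50 million docs,
# on a 64GB machine running Solr with an 8GB heap.
for docs_per_shard in (100_000_000, 250_000_000):
    size_gb = estimated_index_gb(20.0, 50_000_000, docs_per_shard)
    print(docs_per_shard, size_gb, fits_in_os_cache(size_gb, 64.0, 8.0))
```

Plug in the real doc count and index size from a proof of concept before
trusting any of these numbers.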
In any case, a proof of concept implementation will tell you everything you
need to know.
-- Jack Krupansky
-----Original Message-----
From: Vineet Mishra
Sent: Wednesday, June 4, 2014 2:45 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr maximum Optimal Index Size per Shard
Thanks, all, for your responses.
I presume this conversation concludes that indexing around 1 billion
documents per shard won't be a problem. I have 10 billion docs to index,
so approximately 10 shards with 1 billion each should be fine. And what
about memory: how much RAM would be enough for this amount of data?
Moreover, what should the indexing technique be for this huge data set?
Currently I am indexing with EmbeddedSolrServer, but it is going pathetically
slow after some 20GB of indexing. By comparison, SolrHttpPost was slow due
to network delays and response time, but after running the indexing with
EmbeddedSolrServer for this long, I am getting a different notion.
Any good indexing technique for this huge dataset would be highly
appreciated.
Thanks again!
On Wed, Jun 4, 2014 at 6:40 AM, rulinma <ruli...@gmail.com> wrote:
mark.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-maximum-Optimal-Index-Size-per-Shard-tp4139565p4139698.html
Sent from the Solr - User mailing list archive at Nabble.com.