On 1/7/2015 7:14 PM, Nishanth S wrote:
> Thanks Shawn and Walter. Yes, those are 12,000 writes/second. Reads for
> the moment would be in the 1000 reads/second range. Guess finding out
> the right number of shards would be my starting point.

I don't think indexing 12,000 docs per second would be too much for Solr
to handle, as long as you architect the indexing application properly.
You would likely need to have several indexing threads or processes that
index in parallel.  Solr is fully thread-safe and can handle several
indexing requests at the same time.  If the indexing application is
single-threaded, indexing speed will not reach its full potential.
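As a rough illustration of that advice, here is a minimal sketch of
batched, multi-threaded indexing.  The send_batch function is a
hypothetical stand-in for whatever client call you actually use (for
example an HTTP POST to a collection's /update handler); the point is
only the structure: batch the documents, then send batches from several
threads so network I/O overlaps.

```python
from concurrent.futures import ThreadPoolExecutor

def send_batch(batch):
    # Hypothetical stand-in for a real Solr update request, e.g. a POST
    # to http://host:8983/solr/<collection>/update.  In production this
    # blocks on network I/O, which is why several threads raise overall
    # throughput.  Returns the number of docs "sent".
    return len(batch)

def index_parallel(docs, batch_size=1000, threads=8):
    """Split docs into batches and send them from a thread pool."""
    batches = [docs[i:i + batch_size]
               for i in range(0, len(docs), batch_size)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        sent = sum(pool.map(send_batch, batches))
    return sent

# Example: one second's worth of input at the rate you described.
print(index_parallel([{"id": n} for n in range(12000)]))  # prints 12000
```

Batch size and thread count are tuning knobs; you would benchmark
against your own cluster rather than trust these defaults.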

Be aware that indexing at the same time as querying will reduce the
number of queries per second that you can handle.  In an environment
where both reads and writes are heavy like you have described, more
shards and/or more replicas might be required.

For the query side ... even 1000 queries per second is a fairly heavy
query rate.  You're likely to need at least a few replicas, possibly
several, to handle that.  The type and complexity of the queries you do
will make a big difference as well.  To handle that query level, I would
still recommend only running one shard replica on each server.  If you
have three shards and three replicas, that means 9 Solr servers.
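The server arithmetic above is simply shard count times replication
factor, assuming one replica per server:

```python
def servers_needed(shards, replication_factor):
    """With one shard replica per server, total replicas == servers."""
    return shards * replication_factor

print(servers_needed(3, 3))  # prints 9
```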

How many documents will you have in total?  You said they are about 6KB
each ... but depending on the fieldType definitions (and the analysis
chain for TextField types), 6KB might be very large or fairly small.

Do you have any idea how large the Solr index will be with all your
documents?  Estimating that will require indexing a significant
percentage of your documents with the actual schema and config that you
will use in production.
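Once you have indexed a sample with the production schema, the full
index size can be estimated by linear extrapolation.  The numbers below
are made up for illustration; real index growth is not perfectly
linear, so treat any such figure as a rough estimate.

```python
def estimate_index_bytes(sample_docs, sample_index_bytes, total_docs):
    """Extrapolate full index size from an indexed sample (rough:
    assumes size grows linearly with document count)."""
    return sample_index_bytes * total_docs // sample_docs

# Hypothetical: a 100,000-doc sample of 1,000,000 total docs
# produced a 2 GiB index on disk.
est = estimate_index_bytes(100_000, 2 * 1024**3, 1_000_000)
print(f"{est / 1024**3:.0f} GiB")  # prints "20 GiB"
```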

If I know how many documents you have, how large the full index will be,
and can see an example of the more complex queries you will do, I can
make *preliminary* guesses about the number of shards you might need.  I
do have to warn you that it will only be a guess.  You'll have to
experiment to see what works best.

Thanks,
Shawn
