On Wed, 2014-03-19 at 11:55 +0100, Colin R wrote:
> We run a central database of 14M (and growing) photos with dates, captions,
> keywords, etc. 
> 
> We are currently upgrading from old Lucene servers to the latest Solr, running
> on a couple of dedicated servers (6 cores, 36GB RAM, 500GB SSD). Planning on
> using SolrCloud.
What hardware are your past experiences based on? If those servers had fewer
cores, less memory and spinning drives, I suspect your question reduces to
which architecture you prefer from a logistical point of view, rather than to
performance.

> We take in thousands of changes each day (big and small), so indexing may be
> a bigger problem than searching.

Thousands of updates a day is a very low number. Do you have hard requirements
on update latency, perform heavy faceting, or do anything else special that
would make this a cause for concern?
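
For scale, here is a minimal SolrJ sketch of a single update. The core URL and
field names are made up for illustration, not taken from your setup. With
commitWithin, Solr batches the commits itself, so thousands of adds like this
per day are cheap:

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class PhotoUpdate {
    public static void main(String[] args) throws Exception {
        // Hypothetical "photos" core and field names, for illustration only.
        SolrServer solr = new HttpSolrServer("http://localhost:8983/solr/photos");
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "photo-12345");
        doc.addField("caption", "Sunset over the harbour");
        doc.addField("keywords", "sunset, harbour");
        // commitWithin = 10000 ms: Solr groups pending updates into one
        // commit instead of paying for a hard commit per document.
        solr.add(doc, 10000);
        solr.shutdown();
    }
}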

> Is it quicker/better to just have one big 14M index and filter the
> complexities for each website, or is it better to still maintain hundreds of
> indexes so we are searching smaller ones?

All else being equal, a search in a specific small index will be faster
than filtering on the large one. But as we know, all else is never
equal. A 14M document index in itself is not really a challenge for
Lucene/Solr, but this depends a lot on your specific setup. How large is
the 14M index in terms of bytes?
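
To make the filtering approach concrete, here is a sketch against a single big
index, assuming a site_id field (again, core URL and field names are made up).
Each fq is evaluated independently of the main query and its result set is
cached in the filterCache, so repeated queries for the same site reuse the
cached set, which narrows the gap to dedicated small indexes:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class FilteredSearch {
    public static void main(String[] args) throws Exception {
        SolrServer solr = new HttpSolrServer("http://localhost:8983/solr/photos");
        SolrQuery q = new SolrQuery("caption:sunset");
        // Restrict the big index to one website's photos; the filter's
        // result set is cached and shared across queries.
        q.addFilterQuery("site_id:123");
        q.setRows(20);
        QueryResponse rsp = solr.query(q);
        System.out.println("Hits: " + rsp.getResults().getNumFound());
        solr.shutdown();
    }
}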

> Bear in mind, we get thousands of changes a day PLUS very busy search servers.

How many queries/second are we talking about here? What is a typical
query (faceting, grouping, special processing...)?
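
For reference, this is the kind of query shape I am asking about; a SolrJ
sketch with faceting and grouping, with assumed field names as before:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class FacetedSearch {
    public static void main(String[] args) throws Exception {
        SolrServer solr = new HttpSolrServer("http://localhost:8983/solr/photos");
        SolrQuery q = new SolrQuery("dog");
        q.addFilterQuery("site_id:123");      // per-site filter as above
        q.addFacetField("keywords");          // facet counts per keyword
        q.set("group", true);                 // collapse hits per photographer
        q.set("group.field", "photographer");
        QueryResponse rsp = solr.query(q);
        System.out.println("Keyword facets: "
                + rsp.getFacetField("keywords").getValues());
        System.out.println("Photographer groups: "
                + rsp.getGroupResponse().getValues().get(0).getValues().size());
        solr.shutdown();
    }
}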

Regards,
Toke Eskildsen, State and University Library, Denmark

