Re: Improving Solr performance

Jonathan Rochkind Mon, 10 Jan 2011 13:08:31 -0800

I see a lot of people using shards to hold "different types of documents", and it almost always seems to be a bad solution. Shards are intended for distributing a large index over multiple hosts -- that's it. Not for some kind of federated search over multiple schemas, not for access control.

Why not put everything in the same index, without shards, and just use an 'fq' limit in order to limit to the specific documents you'd like to search over in a given search? I think that would achieve your goal a lot more simply than shards -- then you use sharding only if and when your index grows to be so large you'd like to distribute it over multiple hosts, and when you do so you choose a shard key that will have more or less equal distribution across shards.
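
To make the 'fq' suggestion concrete, here is a minimal sketch of a client building such a request. The URL, the field name "kind", and the kind values are all made up for illustration; the only Solr-specific parts are the standard 'q' and 'fq' request parameters. It also assumes the kind values are simple tokens that need no query-syntax escaping.

```python
from urllib.parse import urlencode

# Hypothetical values: neither the URL nor the field name comes from this thread.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"
KIND_FIELD = "kind"


def build_search_url(user_query, kinds=None):
    """Build a select URL against one unsharded index.

    When the user picks one or more kinds, they are applied as an 'fq'
    filter query instead of routing the search to separate shards.
    """
    params = [("q", user_query)]
    if kinds:
        # Assumes kind values are plain tokens (no spaces or special characters).
        clause = " OR ".join(sorted(kinds))
        params.append(("fq", f"{KIND_FIELD}:({clause})"))
    return SOLR_SELECT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Search only two of the kinds; omit 'kinds' to search everything.
    print(build_search_url("solr performance", ["report", "article"]))
```

Leaving out the kinds searches the whole index, so the same code path serves both the "all kinds" and "some kinds" cases.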

Using shards for access control or schema management just leads to headaches.

[Apparently Solr could use some highlighted documentation on what shards are really for, as it seems to be a very common issue on this list, someone trying to use them for something else and then inevitably finding problems with that approach.]


Jonathan

On 1/7/2011 6:48 AM, supersoft wrote:

The reason for this distribution is the kind of the documents. Despite
having the same schema structure (and Solr conf), each document belongs to 1 of
5 different kinds.

Each kind corresponds to a specific shard and, because of this, the
client tool avoids searching all the shards when the user selects just
one or a few kinds. The tool runs a multi-shard query against the relevant
shards. I guess this is the right approach, but correct me if I am wrong.
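
As a rough sketch of the setup described above: the client maps each selected kind to its shard and passes the list in Solr's 'shards' request parameter. The host names, kind names, and the choice of the first shard as the one receiving the request are invented for illustration, not taken from the actual tool.

```python
from urllib.parse import urlencode

# Hypothetical mapping of the 5 kinds to shard addresses (illustrative only).
SHARD_FOR_KIND = {
    "kind1": "host1:8983/solr",
    "kind2": "host2:8983/solr",
    "kind3": "host3:8983/solr",
    "kind4": "host4:8983/solr",
    "kind5": "host5:8983/solr",
}


def build_sharded_url(user_query, kinds):
    """Build a distributed query that only touches the shards of the chosen kinds."""
    shards = [SHARD_FOR_KIND[kind] for kind in kinds]
    params = {"q": user_query, "shards": ",".join(shards)}
    # Assumption: the request is sent to the first selected shard, which
    # then fans the query out to every address listed in 'shards'.
    return "http://" + shards[0] + "/select?" + urlencode(params)


if __name__ == "__main__":
    print(build_sharded_url("solr performance", ["kind2", "kind4"]))
```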

The real problem with this architecture is the correlation between concurrent
users and response time:
1 query: n seconds
2 queries: 2*n seconds each query
3 queries: 3*n seconds each query
and so on...

This is a real headache because a single query has an acceptable
response time, but when many users are accessing the server the
performance drops sharply.
